Figure 7
From: N-gram analysis of 970 microbial organisms reveals presence of biological language models

4-gram distribution in the proteomes of different genera. Frequency of 4-grams for six different genera, with the x-axis limited to the forty most frequently occurring 4-grams of B. suis. The six genera shown are (A) Brucella, (B) Burkholderia, (C) Bacillus, (D) Xanthomonas, (E) Pseudomonas and (F) Escherichia.
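
The comparison described in the caption can be illustrated with a minimal sketch: count overlapping 4-grams in a reference proteome, keep its forty most frequent ones, and look up how often those same 4-grams occur in another proteome. The function names, the use of relative frequencies (counts divided by the total number of 4-grams), and the toy sequences are assumptions made for illustration; the caption does not state how the authors normalized or computed the values.

```python
from collections import Counter


def ngram_counts(sequences, n=4):
    """Count overlapping n-grams across a collection of protein sequences."""
    counts = Counter()
    for seq in sequences:
        # Slide a window of width n over the sequence, one residue at a time.
        for i in range(len(seq) - n + 1):
            counts[seq[i:i + n]] += 1
    return counts


def relative_frequencies(counts):
    """Convert raw counts to fractions of the total (assumed normalization)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: c / total for gram, c in counts.items()}


def profile_against_reference(reference, other, top_k=40, n=4):
    """For the top_k n-grams of the reference proteome, return tuples of
    (n-gram, frequency in reference, frequency in other proteome)."""
    ref_counts = ngram_counts(reference, n)
    top = [gram for gram, _ in ref_counts.most_common(top_k)]
    ref_freq = relative_frequencies(ref_counts)
    other_freq = relative_frequencies(ngram_counts(other, n))
    return [(gram, ref_freq[gram], other_freq.get(gram, 0.0)) for gram in top]


if __name__ == "__main__":
    # Hypothetical toy sequences, not real proteome data.
    reference_proteome = ["MKLLAAALLAAALL", "AAALLKKAAALL"]
    other_proteome = ["MKAAALLGGSAAAL"]
    for gram, ref_f, other_f in profile_against_reference(
        reference_proteome, other_proteome, top_k=5
    ):
        print(f"{gram}\t{ref_f:.3f}\t{other_f:.3f}")
```

Fixing the x-axis to the reference organism's top 4-grams, as the caption describes, makes the panels directly comparable: a genus with a similar 4-gram profile would show a similar shape over the same forty 4-grams, while a dissimilar one would not.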