Figure 1
From: N-gram analysis of 970 microbial organisms reveals presence of biological language models

Dataset. The pie chart shows the distribution of microbial organisms in the dataset. For proteobacteria, which is a very large phylum, the classes within the phylum are labelled. For all other phyla, only the phylum name is shown.