Fig. 3
From: Organizing the bacterial annotation space with amino acid sequence embeddings

A Sequence embeddings of Bacillus carbohydrate metabolism sequences, embedded using the Bacillus carbohydrate metabolism Protvec model, k-mer frequency, and the Swiss-Prot Protvec model. Sequences are colored by subclass and visualized using PCA.

B Calinski-Harabasz (CH) index of Bacillus carbohydrate metabolism sequences (n = 5000), embedded using the Bacillus carbohydrate metabolism Protvec model, k-mer frequency, and the Swiss-Prot Protvec model, for K = 2:150 clusters. For each value of K, 500 bootstrap iterations were used.
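The computation behind panel B can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the caption does not say which clustering algorithm was used, so the plain k-means here is an assumption, and the names `ch_index`, `kmeans`, and `bootstrap_ch` are hypothetical. The sketch resamples the embedded sequences with replacement, clusters each resample into K groups, and averages the CH index over the bootstrap iterations.

```python
import random


def ch_index(points, labels):
    """Calinski-Harabasz index: between-cluster dispersion divided by
    within-cluster dispersion, each scaled by its degrees of freedom."""
    n, dim = len(points), len(points[0])
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("need 2 <= number of clusters < number of points")
    overall = [sum(p[d] for p in points) / n for d in range(dim)]
    between = within = 0.0
    for members in clusters.values():
        centroid = [sum(col) / len(members) for col in zip(*members)]
        between += len(members) * sum((c - o) ** 2 for c, o in zip(centroid, overall))
        within += sum(sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in members)
    if within == 0.0:
        return float("inf")  # every cluster collapsed to a single location
    return (between / (k - 1)) / (within / (n - k))


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means (an assumed stand-in for the unspecified
    clustering step). Returns one cluster label per point."""
    # Seed centroids from distinct points so bootstrap duplicates cannot
    # produce two identical starting centroids.
    centroids = [list(p) for p in rng.sample(sorted(set(points)), k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [
            min(range(k), key=lambda j: sum((x - c) ** 2 for x, c in zip(p, centroids[j])))
            for p in points
        ]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def bootstrap_ch(points, k, n_boot=500, seed=0):
    """Mean CH index over n_boot resamples (with replacement) of the points."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_boot):
        sample = [rng.choice(points) for _ in points]
        scores.append(ch_index(sample, kmeans(sample, k, rng)))
    return sum(scores) / len(scores)
```

Points are expected as tuples of floats, one per embedded sequence. Calling `bootstrap_ch(embedding, k)` for each k in `range(2, 151)` and each embedding method would give one curve per method, analogous in shape to the comparison the caption describes.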