Fig. 3
From: KEGG orthology prediction of bacterial proteins using natural language processing
Identity distribution. The width of the violin plot along the X-axis corresponds to the frequency of data points. a The identity distribution of the predicted sequences of all match cases and the clustered sequence. b The identity distribution of the sequences not predicted by other methods in our match cases.