Fig. 10 | BMC Bioinformatics

Fig. 10

From: Protein language models can capture protein quaternary state

Accuracy of quaternary state (qs) prediction by different approaches on the holdout set. Predictions are made by assigning a qs annotation to each sequence in the holdout set based on: A. the closest sequence in the training set (as determined by BLAST); B. the highest similarity in embedding space in the training set (i.e., cosine similarity between embedding vectors); and C. QUEEN trained on the embeddings (see text). Each confusion matrix shows the frequency of predicted vs. actual labels (on the x- and y-axes, respectively); a matrix with nonzero values only on the diagonal represents full success, while off-diagonal values represent wrong predictions. The balanced accuracy increases from left to right, as indicated by the darker diagonal, highlighting improved prediction when moving from sequence, to language model representation, to the deep learning model.
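The embedding-based transfer in panel B and the balanced accuracy read off each confusion matrix can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the integer label codes, and the toy two-dimensional "embeddings" are all hypothetical, chosen only to show the mechanics with NumPy.

```python
import numpy as np

def transfer_labels_by_cosine(train_emb, train_labels, query_emb):
    """Give each query the label of its most cosine-similar training vector."""
    # Normalise rows so that a dot product equals cosine similarity.
    train_unit = train_emb / np.linalg.norm(train_emb, axis=1, keepdims=True)
    query_unit = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    similarity = query_unit @ train_unit.T      # shape: (n_query, n_train)
    nearest = similarity.argmax(axis=1)         # index of the closest training vector
    return train_labels[nearest]

def confusion_matrix(actual, predicted, n_classes):
    """Count matrix with actual labels on rows (y) and predicted on columns (x)."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(matrix, (actual, predicted), 1)
    return matrix

def balanced_accuracy(matrix):
    """Mean per-class recall, ignoring classes absent from the actual labels."""
    support = matrix.sum(axis=1)
    present = support > 0
    recall = np.diag(matrix)[present] / support[present]
    return float(recall.mean())

# Toy data (hypothetical): three training vectors, one per class code 0, 1, 2.
train_emb = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
train_labels = np.array([0, 1, 2])

# Four holdout vectors with their true class codes.
holdout_emb = np.array([[0.9, 0.1], [0.1, 0.9], [1.0, 0.9], [0.8, 0.2]])
holdout_labels = np.array([0, 1, 2, 2])

predicted = transfer_labels_by_cosine(train_emb, train_labels, holdout_emb)
matrix = confusion_matrix(holdout_labels, predicted, n_classes=3)
score = balanced_accuracy(matrix)
```

In this toy case the last holdout vector is assigned class 0 although its true class is 2, so one count falls off the diagonal and the balanced accuracy is the mean of the per-class recalls 1, 1 and 0.5. Averaging recall per class, rather than over all sequences, keeps a frequent class from dominating the score.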
