Figure 2
From: Automatic discovery of cross-family sequence features associated with protein function

Examples of sequence-to-function relationships found by self-supervised learning. Three examples of sequence classifiers and their associated, co-evolved annotation-based classifiers are shown in panels A, C and E. Panels B, D and F show the correlation between the sequence-based classification and the annotation-based classification, for both training and testing data, during the 8 h runs that produced the final individuals shown in panels A, C and E. Although these are hand-picked examples, note how the test set correlation generally follows the training set correlation in an upward trend. Because the test set proteins are minimally related to the training set proteins (less than 10% sequence identity), this shows that general sequence features related to function have been discovered.