Table 2 Correlations between function predictors.

From: Automatic discovery of cross-family sequence features associated with protein function

Function A      Function B        r_A,B
nuclear         nuclear-copy       0.979
secreted        secreted-copy      0.964
cytoplasmic     cytoplasmic-copy   0.899
transcription   nuclear            0.860
membrane        integral           0.798
inhibits        secreted           0.780
biosynthesis    cytoplasmic        0.765
DNA             nuclear            0.737
cytoplasmic     nuclear            0.721
DNA             transcription      0.696
cytoplasmic     transcription      0.680
catalyzes       biosynthesis       0.665
antibacterial   secreted           0.643
antibacterial   inhibits           0.630
cytoplasmic     DNA                0.583
catalyzes       inhibits          -0.525
catalyzes       secreted          -0.568
inhibits        cytoplasmic       -0.598
biosynthesis    inhibits          -0.617
secreted        cytoplasmic       -0.623
biosynthesis    secreted          -0.650
  1. Pearson's correlation coefficient, r_A,B, is calculated for all pairs of fixed-target predictors using the "consensus prediction scores" from test set sequences. Only predictor pairs where |r_A,B| > 0.5 are shown. The strongest correlations, shown at the top of the table, are for "self comparisons" using duplicate predictors (trained independently with a different random seed). These indicate what "perfect" correlations would be, taking into account experimental noise. The highest non-self correlation, 0.86, is found between the "nuclear" and "transcription" predictors (the raw data are shown by the blue data points in Figure 5(A)).
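
The pairwise calculation described in the footnote can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `pearson_r` and `strong_pairs`, the dictionary layout (predictor name mapped to a list of per-sequence scores), and the toy scores are all assumptions made for the example. Only the Pearson formula and the |r| > 0.5 filter come from the text.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant scores")
    return sxy / math.sqrt(sxx * syy)


def strong_pairs(scores, threshold=0.5):
    """Return (A, B, r) for every predictor pair with |r| > threshold.

    `scores` maps a predictor name to its scores on the same ordered set of
    test sequences (hypothetical layout). Rows are sorted from the most
    positive to the most negative correlation, as in the table.
    """
    rows = []
    for name_a, name_b in combinations(scores, 2):
        r = pearson_r(scores[name_a], scores[name_b])
        if abs(r) > threshold:
            rows.append((name_a, name_b, r))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows


# Toy scores, invented for illustration only (not data from the paper).
toy_scores = {
    "predictor_a": [1.0, 2.0, 3.0, 4.0],
    "predictor_b": [2.0, 4.0, 6.0, 8.0],   # rises with predictor_a
    "predictor_c": [4.0, 3.0, 2.0, 1.0],   # falls as predictor_a rises
    "predictor_d": [1.0, -1.0, -1.0, 1.0],  # uncorrelated with the others
}

toy_rows = strong_pairs(toy_scores)
for name_a, name_b, r in toy_rows:
    print(f"{name_a:<14}{name_b:<14}{r:6.3f}")
```

Pairs involving `predictor_d` fall below the threshold and are omitted, in the same way that weakly correlated predictor pairs are left out of the table.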