Table 2 Correlations between function predictors.

From: Automatic discovery of cross-family sequence features associated with protein function

Function A       Function B          r_A,B
nuclear          nuclear-copy         0.979
secreted         secreted-copy        0.964
cytoplasmic      cytoplasmic-copy     0.899
transcription    nuclear              0.860
membrane         integral             0.798
inhibits         secreted             0.780
biosynthesis     cytoplasmic          0.765
DNA              nuclear              0.737
cytoplasmic      nuclear              0.721
DNA              transcription        0.696
cytoplasmic      transcription        0.680
catalyzes        biosynthesis         0.665
antibacterial    secreted             0.643
antibacterial    inhibits             0.630
cytoplasmic      DNA                  0.583
catalyzes        inhibits            -0.525
catalyzes        secreted            -0.568
inhibits         cytoplasmic         -0.598
biosynthesis     inhibits            -0.617
secreted         cytoplasmic         -0.623
biosynthesis     secreted            -0.650

  1. Pearson's correlation coefficient, r_A,B, is calculated for all pairs of fixed-target predictors using the "consensus prediction scores" from test set sequences. Only predictor pairs where |r_A,B| > 0.5 are shown. The strongest correlations, shown at the top of the table, are "self comparisons" between duplicate predictors (trained independently with a different random seed). These indicate what a "perfect" correlation would look like once experimental noise is taken into account. The highest non-self correlation, 0.86, is found between the "nuclear" and "transcription" predictors (the raw data are shown as the blue data points in Figure 5(A)).
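
The calculation described in the footnote can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes each predictor's consensus prediction scores are available as an equal-length list over the same test sequences, and the names `pearson`, `correlated_pairs`, and `toy_scores` (along with the toy numbers themselves) are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant scores")
    return cov / sqrt(var_x * var_y)


def correlated_pairs(scores, threshold=0.5):
    """Return (A, B, r) for every predictor pair with |r| > threshold,
    sorted from the most positive to the most negative correlation."""
    rows = []
    for a, b in combinations(sorted(scores), 2):
        r = pearson(scores[a], scores[b])
        if abs(r) > threshold:
            rows.append((a, b, r))
    return sorted(rows, key=lambda row: row[2], reverse=True)


# Toy scores invented for illustration only; they are NOT data from the paper.
# Each list holds one predictor's scores for the same five test sequences.
toy_scores = {
    "nuclear": [0.9, 0.1, 0.8, 0.2, 0.5],
    "transcription": [0.8, 0.2, 0.9, 0.1, 0.4],
    "secreted": [0.1, 0.9, 0.2, 0.7, 0.5],
}

for a, b, r in correlated_pairs(toy_scores):
    print(f"{a:15s}{b:15s}{r:7.3f}")
```

Sorting by signed r rather than |r| reproduces the table's layout, with positive correlations at the top and anti-correlated pairs at the bottom.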