


Figure 7 | BMC Bioinformatics

Figure 7

From: Discovering functional linkages and uncharacterized cellular pathways using phylogenetic profile comparisons: a comprehensive assessment


Relationship between pathway similarity score, measured as the Jaccard coefficient between the proteins' KEGG pathway memberships, and profile similarity score (using reference set BAE3a), measured as the mutual information score of the proteins' profiles. Each data point in the plot represents a pair of proteins. (a) 708,645 pairs of E. coli proteins, of which 664,677 had a pathway similarity score of zero. A weak positive correlation (R = 0.14) exists between the pathway similarity score and the mutual information score. Rather than computing the correlation over all data points, Date and Marcotte computed the correlation of "representative" data points, each of which represents the average values for 1000 data points. This results in an artificial increase in the correlation (R = 0.89, inset). (b) 635,628 pairs of yeast proteins, of which 599,954 had a pathway similarity score of zero. A weak positive correlation (R = 0.16) is observed between the pathway and profile similarity measures. An artificial increase in the correlation (R = 0.65, inset) is observed when Date and Marcotte's correlation computation strategy is employed.
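
The effect described in the caption can be illustrated with a minimal sketch on synthetic data (not the data of the paper). It assumes that "representative" points are formed by sorting the pairs by one score and averaging consecutive blocks of 1000 points; the caption does not state the ordering, so that step is an assumption. The function names `jaccard`, `pearson`, and `binned_means` are hypothetical helpers introduced here for illustration.

```python
import random
from math import sqrt


def jaccard(a, b):
    """Jaccard coefficient between two collections of pathway IDs."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pearson(xs, ys):
    """Pearson correlation coefficient R of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def binned_means(xs, ys, bin_size):
    """Sort pairs by x (assumed ordering), then average each full block
    of bin_size consecutive points into one representative point."""
    pairs = sorted(zip(xs, ys))
    bx, by = [], []
    for i in range(0, len(pairs) - bin_size + 1, bin_size):
        block = pairs[i:i + bin_size]
        bx.append(sum(p[0] for p in block) / bin_size)
        by.append(sum(p[1] for p in block) / bin_size)
    return bx, by


# Synthetic stand-in for the two similarity scores:
# a weak linear signal buried in much larger noise.
rng = random.Random(0)
xs = [rng.random() for _ in range(20000)]
ys = [0.15 * x + rng.gauss(0, 0.3) for x in xs]

r_raw = pearson(xs, ys)               # correlation over all points
bx, by = binned_means(xs, ys, 1000)   # 20 representative points
r_binned = pearson(bx, by)            # correlation of the bin averages

print(f"R over all points:      {r_raw:.2f}")
print(f"R over binned averages: {r_binned:.2f}")
```

Averaging within each block cancels most of the per-point noise while leaving the trend across blocks intact, so the correlation of the averages comes out far higher than the correlation of the underlying points, even though the relationship between individual pairs is unchanged.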
