Figure 1 | BMC Bioinformatics

Figure 1

From: Prediction of gene-phenotype associations in humans, mice, and plants using phenologs

Prediction of disease genes from orthologous phenotypes. (a) Two phenotypes are said to be orthologous (“phenologs”) if the sets of genes underlying those phenotypes have a statistically significant intersection, as determined using gene orthology. Statistical significance is calculated as the probability of seeing an intersection of v or more genes, given m genes with phenotype A and n genes with phenotype B, out of N total genes with orthologs in both species. Genes associated with A but not B are predicted to be involved in B, and vice versa. McGary et al. observed that approximately v/m of the predictions tended to be true positives for B, and v/n to be true positives for A. (b) A validated example from McGary et al., predicting genes involved in a human neural crest defect, Waardenburg syndrome, using the Arabidopsis negative gravitropism defect phenotype. In this example, the overlap between the gene sets associated with Waardenburg syndrome and gravitropism is highly statistically significant (p ≤ 10⁻⁶). For simplicity, the right-hand circle and the intersection show the human orthologs of the gravitropism genes (VAM3 corresponding to STX7 and STX12; SGR2 to DDHD2 and SEC23IP; and GRV2 to DNAJC13). (c) In this paper, we extend the phenolog formalism to consider additional gene-phenotype associations from multiple model organisms and develop a quantitative ranking scheme for phenolog-based predictions. Genes predicted by a single phenolog, as in (a), are weakly predicted for A, whereas those predicted by two phenologs are strongly predicted for A. In general, a third phenolog contributing to a predicted association will cause that gene to be ranked higher than if only two phenologs predicted it. However, not all phenologs are equal: phenologs derived from less similar gene sets exert less influence over predictions than phenologs with highly overlapping sets of associated genes.
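The significance calculation described in (a) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that "the probability of seeing an intersection of v or greater" is read as the upper tail of a hypergeometric distribution, and the function name `overlap_p_value` and the numbers in the usage example are hypothetical.

```python
from math import comb


def overlap_p_value(N: int, m: int, n: int, v: int) -> float:
    """Probability of an overlap of v or more genes by chance.

    N: total genes with orthologs in both species
    m: genes associated with phenotype A
    n: genes associated with phenotype B
    v: observed number of genes shared by A and B

    Assumption: the overlap follows a hypergeometric distribution,
    i.e. the n genes of B are drawn at random from the N genes.
    """
    if not (0 <= m <= N and 0 <= n <= N):
        raise ValueError("m and n must lie between 0 and N")
    largest_overlap = min(m, n)
    # Number of ways to pick n genes such that exactly k of them fall in A,
    # summed over every overlap size k from v up to the largest possible.
    favourable = sum(
        comb(m, k) * comb(N - m, n - k)
        for k in range(max(v, 0), largest_overlap + 1)
    )
    return favourable / comb(N, n)


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    p = overlap_p_value(N=2000, m=20, n=15, v=3)
    print(f"P(overlap >= 3) = {p:.3g}")
```

Summing exact binomial coefficients avoids floating-point underflow for very small probabilities, at the cost of speed on large gene sets; a statistics library's hypergeometric survival function would be a reasonable substitute.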
