PhenUMA: a tool for integrating the biomedical relationships among genes and diseases
© Rodríguez-López et al; licensee BioMed Central Ltd. 2014
Received: 21 April 2014
Accepted: 4 November 2014
Published: 25 November 2014
Several types of genetic interactions in humans can be directly or indirectly associated with the causal effects of mutations. These interactions are usually based on co-association with biological processes, coexistence in cellular locations, coexpression in cell lines, physical interactions and so on. In addition, pathological processes can present similar phenotypes that have mutations either in the same genomic location or in different genomic regions. Therefore, integrative resources for all of these complex interactions can help us prioritize the relationships between genes and diseases that most deserve study by researchers and physicians.
PhenUMA is a web application that displays biological networks using information from biomedical and biomolecular data repositories. One of its most innovative features is to combine the benefits of semantic similarity methods with the information taken from databases of genetic diseases and biological interactions. More specifically, this tool is useful in studying novel pathological relationships between functionally related genes, merging diseases into clusters that share specific phenotypes or finding diseases related to reported phenotypes.
This framework builds, analyzes and visualizes networks based on both functional and phenotypic relationships. The integration of this information helps in the discovery of alternative pathological roles of genes, biological functions and diseases. PhenUMA represents an advancement toward the use of new technologies for genomics and personalized medicine.
Integration of clinical and biomolecular data is a key step in the advancement of current biomedical research and development. One of the greatest limitations of this process is the absence of standard platforms to merge clinical and research studies. Some recent initiatives have focused on data sharing to provide precise phenotypic descriptions of patients in combination with genetic variation. An effective integration of clinical features with their molecular context, including genetic, physical and metabolic interactions, is expected to produce new insights for biomedical research. In fact, the phenome and the interactome were recently listed among the five most up-and-coming 'omes' that may offer new insights in science. Therefore, new integrative data tools are required to establish these functional and phenotypic links for genome-scale analyses.
Although inherited disorder databases such as OMIM and Orphanet provide extremely valuable details about the molecular nature of pathological conditions, these databases lack direct procedures for integrating biomolecular information. Biomedical ontologies are promising standard resources for systematically integrating phenotypes into the molecular background of mutated genomic regions. For instance, the Human Phenotype Ontology (HPO) currently contains over 10,000 terms, each of which represents an individual phenotype. An intuitive approach for determining similarities between sets of ontological terms (HPO terms), which could represent the phenotypic spaces of disorders or even genes, is to estimate their proximity in the ontology.
On the other hand, the Gene Ontology (GO) is an organized vocabulary of terms that can be subdivided into three sub-ontologies: biological processes, cellular components and molecular functions. Genes are associated with consistent annotations that form sets of GO terms, which are useful for describing the cellular and molecular events involving genes. Furthermore, biomolecular interactomes, such as protein-protein interactions and metabolic and gene regulatory networks, should also be used to obtain a systemic view of the molecular and biochemical reactions related to disease-causing genes.
In particular, because ontologies have been beneficial in understanding diseases as a set of phenotypes rather than conceptual entities, studying correlations among distinct biological conditions affected by genetic variations would be very useful.
The main purpose of this application is to provide a friendly platform that facilitates the analysis of phenotypic and functional information and the discovery of emergent or unnoticed relationships between pairs of genes or genetic diseases. PhenUMA also compiles useful biological information from different interactomes, including protein-protein interactions from STRING and metabolic flux correlations. Altogether, PhenUMA may be useful for discovering new insights on, or features shared by, human diseases, increasing the potential for diagnosis and pharmacological intervention.
Knowledge base: data processing and storage
The Gene Map file provided by OMIM was used to extract 4,261 relationships between 2,794 OMIM genes and 3,486 OMIM phenotypes; OMIM genes were mapped to their GeneID. The PhenUMA knowledge base also contains the associations between Orphanet diseases and genes. This information was extracted from the file “Diseases with their associated genes”, available at Orphadata, and was used to derive 4,472 connections between 2,614 GeneIDs and 2,555 orphan diseases. We also included human interactomes: protein-protein interactions from STRING (96,856 relationships) and 9,812 gene pairs that had positive flux correlations in the metabolic network.
Inferred relationships are binary relationships between pairs of genes, OMIM diseases or orphan diseases that are derived from these associations, resulting in four different types of networks. For instance, a relationship between two genes is inferred if at least one OMIM/Orphan disease is associated with both genes. The interaction is considered stronger when the genes share more than one disease. Overall, the score that indicates the intensity of the relationship is the number of disorders involved in it. The same criterion was applied to establish the inferred relationships between OMIM disorders and between Orphan disorders. In this case, the number of genes shared by the disorders is the score.
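The inference rule described above can be illustrated with a minimal sketch. It assumes gene-disease associations are available as (gene, disease) pairs; the function name and the identifiers in the toy data are hypothetical and are not taken from the PhenUMA code base.

```python
from collections import defaultdict
from itertools import combinations


def inferred_gene_network(gene_disease_pairs):
    """Map each gene pair sharing at least one disease to the number of shared diseases."""
    genes_by_disease = defaultdict(set)
    for gene, disease in gene_disease_pairs:
        genes_by_disease[disease].add(gene)

    scores = defaultdict(int)
    for genes in genes_by_disease.values():
        # Every pair of genes linked to the same disease gains one point.
        for gene_a, gene_b in combinations(sorted(genes), 2):
            scores[(gene_a, gene_b)] += 1
    return dict(scores)


# Toy associations with hypothetical identifiers.
pairs = [
    ("GENE_A", "DISEASE_1"),
    ("GENE_B", "DISEASE_1"),
    ("GENE_A", "DISEASE_2"),
    ("GENE_B", "DISEASE_2"),
    ("GENE_C", "DISEASE_2"),
]
network = inferred_gene_network(pairs)
```

Swapping the two elements of each input pair gives the disease-disease case, where the score is the number of shared genes.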
Semantic similarity relationships
HPO and GO were used to calculate the phenotypic similarities between genes or diseases and the functional similarities between genes, respectively. We used Ontologizer 2.0, an open-source tool, to determine the functional similarities, and it was also adapted to compute phenotypic similarities. Each gene or disease is represented by a set of terms that defines its functional or phenotypic profile. Only the most specific terms are included in the annotation files because the “true path rule” holds. This rule implies that each object related to a term is also related to all of the ancestors of this term up to the root. For instance, the OMIM (MIM# 200500) disorder “Acheiropody” is associated with both “Humeral hypoplasia” (HP:0005792) and all of its ancestors, such as “Aplasia/Hypoplasia of the humerus” (HP:0006507).
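As a rough illustration of how the true path rule feeds into a semantic similarity score, the sketch below propagates annotations to ancestors, derives an information content (IC) for each term, and compares two profiles with a symmetric best-match average of Resnik term similarities. The toy ontology, the term names and all function names are hypothetical. Treating the profile-level measure as a symmetric best-match average is an assumption about one common formulation, not a statement of how Ontologizer or PhenUMA implements it.

```python
import math

# Toy ontology (hypothetical terms): each term maps to its direct parents.
PARENTS = {
    "root": set(),
    "limb": {"root"},
    "arm": {"limb"},
    "humerus": {"arm"},
    "leg": {"limb"},
    "eye": {"root"},
}


def ancestors(term):
    """Return the term together with all of its ancestors up to the root."""
    seen = {term}
    stack = [term]
    while stack:
        for parent in PARENTS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def information_content(annotations):
    """IC(t) = -log(fraction of objects annotated to t directly or via a descendant)."""
    total = len(annotations)
    counts = {}
    for terms in annotations.values():
        implied = set().union(*(ancestors(t) for t in terms))
        for term in implied:
            counts[term] = counts.get(term, 0) + 1
    return {term: -math.log(count / total) for term, count in counts.items()}


def resnik(term_a, term_b, ic):
    """IC of the most informative common ancestor of two terms."""
    common = ancestors(term_a) & ancestors(term_b)
    return max(ic.get(term, 0.0) for term in common)


def best_match_average(profile_a, profile_b, ic):
    """Symmetric average of the best Resnik match for every term in each profile."""
    def one_way(source, target):
        return sum(max(resnik(s, t, ic) for t in target) for s in source) / len(source)
    return 0.5 * (one_way(profile_a, profile_b) + one_way(profile_b, profile_a))


# Only the most specific terms are stored; ancestors are implied.
annotations = {
    "disease_1": {"humerus"},
    "disease_2": {"arm", "eye"},
    "disease_3": {"leg"},
    "disease_4": {"eye"},
}
ic = information_content(annotations)
score = best_match_average(annotations["disease_1"], annotations["disease_2"], ic)
```

In this toy example the two diseases meet at "arm", so the score is driven by the IC of that shared ancestor rather than by the uninformative root.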
Table 1. Summary of main relationships in the knowledge base
Type of network | Type of interaction (source)
OMIM-OMIM | Inferred by Genes (OMIM)
OMIM-OMIM | Phenotypic Similarity (HPO)
Orphan Disease-Orphan Disease | Inferred by Genes (Orphanet)
Orphan Disease-Orphan Disease | Phenotypic Similarity (HPO)
Gene-Gene | Inferred by OMIM (OMIM)
Gene-Gene | Inferred by Orphan Disease (Orphanet)
Gene-Gene | Phenotypic Similarity (HPO)
Gene-Gene | Functional Similarity (GO Biological Process)
Gene-Gene | Functional Similarity (GO Cellular Component)
Gene-Gene | Functional Similarity (GO Molecular Function)
Gene-Gene | Protein-protein interactions (STRING)
Gene-Gene | Metabolic interactions [Veeramani and Bader]
Optimal threshold selection of semantic similarities
Each type of semantic similarity calculation requires the establishment of an optimal statistical threshold to differentiate between significant and non-significant similarity scores. Therefore, a minimal meaningful threshold was estimated for each class of phenotypic and functional similarity listed in Table 1. Four different reference datasets were generated from the information in the PhenUMA knowledge base: one for each phenotypic similarity (OMIM-OMIM, Orphan Disease-Orphan Disease and Gene-Gene) and another for all different types of functional similarity. In particular, we compared each dataset of disease pairs, which was inferred from the gene-disease association studies found in OMIM and Orphanet, to the phenotypic similarities between the diseases. The dataset for phenotypic similarities between genes was generated from the union of all inferred pairs obtained from OMIM and Orphanet. The fourth reference dataset resulted from the combination of interactomes from both metabolic and protein-protein interactions; the same dataset was used for all of the functional similarities.
As shown in Figure 2A, the number of genes and diseases began to decrease at the 98th percentile of all phenotypic similarities. Robinson's measurement clearly conserved more genes and diseases than Resnik's measurement at the same cutoff points (solid lines above dashed lines, Figure 2A). The phenotypic similarity networks obtained at different cutoffs become more similar to the reference dataset networks as the similarity score cutoff increases (solid lines above dashed lines, Figure 2B). This trend is especially notable for the evolution of Jaccard's similarity coefficient for the phenotypic similarity gene networks at the 98th percentile, where Resnik's measurement has a maximum similarity of approximately 3% and Robinson's increases up to 10%. Indeed, this coefficient even decreased for Resnik's measurement at the 99th percentile (blue squares and dashed line, Figure 2B). The phenotypic similarity disease networks also had slightly higher Jaccard's similarity coefficients for Robinson's measurement from the 95th percentile to the top similarity score (red circles and a solid line for OMIM diseases and a green line, Figure 2B).
As expected, the semantic similarity measurement applied by Robinson performed better for phenotypic similarities than Resnik's method (see Additional file 1). This analysis revealed the 98th percentile as a suitable threshold that provided a balanced tradeoff between a gain in specificity for phenotypic similarities and a loss of information for disease and gene pairs (Figure 2). For this reason, we selected the 98th percentile of Robinson's measurement as the lowest similarity value and the minimal appropriate cutoff to build phenotypic similarity-based networks.
On the other hand, functional similarities are strongly dependent on large ontological domains that cluster genes with similar scores. Consequently, we set the lower cutoff at the 99.5th percentile, which considerably increases the significance of the similarities and reduces noise from non-informative similarities. Therefore, phenotypic and functional similarity-based networks were stored in the knowledge base using the 98th and 99.5th percentiles as the minimal levels of confidence, respectively (Table 1). All of the scores were normalized following a min-max normalization method, and therefore the scores take values between 0 and 1, where 0 corresponds to the minimal score greater than the cutoff, and 1 represents the highest score for semantic similarity. This method results in confident semantic similarity relationships and a manageable size of networks to be processed by PhenUMA.
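The comparison between a thresholded similarity network and a reference network can be sketched as follows. This is a toy illustration under stated assumptions: the nearest-rank percentile definition, the function names and the example data are all hypothetical, and the original analysis may have computed percentiles differently.

```python
import math


def percentile_cutoff(scores, percentile):
    """Score at the given percentile using the nearest-rank definition."""
    ordered = sorted(scores)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


def network_at(similarities, percentile):
    """Keep only the pairs whose similarity reaches the percentile cutoff."""
    cutoff = percentile_cutoff(similarities.values(), percentile)
    return {pair for pair, score in similarities.items() if score >= cutoff}


def jaccard(edges_a, edges_b):
    """Jaccard similarity coefficient between two sets of edges."""
    union = edges_a | edges_b
    return len(edges_a & edges_b) / len(union) if union else 0.0


# Ten hypothetical gene pairs with similarity scores 0.1, 0.2, ..., 1.0.
similarities = {(f"g{i}", f"g{i + 1}"): i / 10 for i in range(1, 11)}
# Hypothetical reference network (for example, pairs inferred from shared diseases).
reference = {("g9", "g10"), ("g10", "g11"), ("gX", "gY")}

jaccard_50 = jaccard(network_at(similarities, 50), reference)
jaccard_80 = jaccard(network_at(similarities, 80), reference)
```

In this toy data the coefficient rises from 2/7 at the 50th percentile to 0.5 at the 80th, mirroring the trend described for Figure 2B, while the number of retained pairs falls from six to three.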
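A minimal sketch of the min-max normalization step, assuming the scores are held in a dictionary and that only scores above the chosen cutoff are kept. The function name, the handling of the degenerate case and the example values are illustrative assumptions.

```python
def normalize_above_cutoff(scores, cutoff):
    """Drop scores at or below the cutoff and rescale the rest to the range [0, 1]."""
    kept = {pair: s for pair, s in scores.items() if s > cutoff}
    if not kept:
        return {}
    lowest, highest = min(kept.values()), max(kept.values())
    if highest == lowest:
        # Degenerate case: a single distinct score; mapping it to 1.0 is an arbitrary choice.
        return {pair: 1.0 for pair in kept}
    return {pair: (s - lowest) / (highest - lowest) for pair, s in kept.items()}


# Hypothetical raw similarity scores.
raw = {"pair_a": 1.2, "pair_b": 2.0, "pair_c": 3.6, "pair_d": 0.4}
normalized = normalize_above_cutoff(raw, cutoff=1.0)
```

Here `pair_d` is discarded, `pair_a` (the lowest score above the cutoff) maps to 0 and `pair_c` (the highest score) maps to 1.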
Network building process
The process of network building is quite different if a set of phenotypes is used as input. In this case, the set of phenotypes is considered a new phenotypic profile. The similarity between this set and the phenotypic space of other genes or diseases is calculated using Robinson's semantic similarity measure. In the outcome network, the set of phenotypes is represented as a node, and only the significant relationships (P-value < 0.05) with genes or diseases are included. The P-value is the probability of obtaining a greater score between a random set of phenotypes of the same size as the input set and each gene or disease annotated to the ontology. P-values were calculated using a Monte Carlo method based on the generation of random samples of phenotypes (1,000,000 samples for each query size from 1 to 10) to estimate the probability of a greater score, similar to the approach used in Phenomizer. For example, if the P-value associated with the score of the relationship between a query of five phenotypes and a disease is 5 × 10^-6, only 5 of 1,000,000 random combinations of five phenotypes provide a greater score than the input set in the comparison with that specific disease.
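The Monte Carlo estimate can be sketched as below. This is a simplified illustration, not the PhenUMA implementation: the scoring function is a plain term overlap standing in for the semantic similarity measure, the term identifiers are hypothetical, and far fewer than 1,000,000 samples are drawn to keep the example fast.

```python
import random


def empirical_p_value(observed, score_fn, term_pool, query_size, n_samples=10_000, seed=0):
    """Fraction of random queries of the same size that score strictly higher than observed."""
    rng = random.Random(seed)
    greater = 0
    for _ in range(n_samples):
        random_query = rng.sample(term_pool, query_size)
        if score_fn(random_query) > observed:
            greater += 1
    return greater / n_samples


# Hypothetical ontology terms and a hypothetical disease profile.
term_pool = [f"T{i}" for i in range(20)]
disease_profile = {"T0", "T1", "T2", "T3", "T4"}


def overlap_score(query):
    """Stand-in score: number of query terms found in the disease profile."""
    return len(set(query) & disease_profile)


query = ["T0", "T1", "T10"]
observed = overlap_score(query)
p_value = empirical_p_value(observed, overlap_score, term_pool, len(query))
```

Because the null distribution depends only on the query size and the target gene or disease, it can be computed once per size and reused, which is consistent with the per-size sampling described above.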
Novel pathological relationships between genes
PhenUMA can detect whether genes are directly or indirectly involved in similar pathological events via the semantic similarity of their phenotypic profiles. For instance, some mutations in carbonic anhydrase II (CA2; MIM# 611492) are uniquely related to a monogenic disease named osteopetrosis with renal tubular acidosis (MIM# 259730 or ORPHA 2785). When the gene-gene network of semantic similarities from HPO is requested as output with low confidence in PhenUMA, CA2 shows phenotypic similarities to TNFSF11 (MIM# 602642), TBCE (MIM# 604934) and SLC4A1 (MIM# 109270). CA2 also has a physical interaction with SLC4A1 and a functional similarity for a biological process with TNFSF11. According to the whole set of HPO annotations for CA2, the most specific clinical features for this gene include distal renal tubular acidosis (HP:0008341), extramedullary hematopoiesis (HP:0001978), periodic hypokalemic paresis (HP:0008153), optic nerve compression (HP:0007807), elevated serum acid phosphatase (HP:0003148) and diaphyseal sclerosis (HP:0003034). TNFSF11 presents phenotypic similarities with CA2 for extramedullary hematopoiesis (HP:0001978), cranial nerve compression (HP:0001293), diaphyseal sclerosis (HP:0003034), hepatosplenomegaly (HP:0001433) and cranial hyperostosis (HP:0004437). Indeed, TNFSF11 and CA2 are positive regulators of bone remodeling (GO:0046852) and bone resorption (GO:0045780). SLC4A1 shares phenotypes with CA2, including periodic paralysis (HP:0003768), renal tubular acidosis (HP:0001947) and hypokalemia (HP:0002900), and is also biochemically related to CA2 by physical interactions. TBCE and CA2 are not functionally associated, but both genes are associated phenotypically with renal tubular dysfunction (HP:0000124) and increased bone mineral density (HP:0011001). This example illustrates the novel phenotypic similarities for CA2 that are integrated with other functional relationships and additional information processed by PhenUMA.
All of these results can be retrieved from PhenUMA by combining network visualization, informative panels and other features such as phenotypic and functional enrichment analysis of selected nodes in the resulting networks.
Clustering diseases by phenotypic similarity
Phenotypic enrichment of SSADHD and high confidence similar disorders
(607628, 607681, 611364, 600669, 608096, 607631, 607208, 300423, 608217, 600131, 271980, 604827, 300088)
Generalized myoclonic seizures
(611364, 600669, 607631, 607208, 271980, 604827)
(608096, 607208, 271980, 300088)
EEG with polyspike wave complexes
(607681, 600669, 600131)
(606053, 238350, 209800, 271980)
(143465, 606053, 238350, 167870, 209800, 271980, 300088, 190100)
(607681, 600669, 600131, 271980)
(167870, 271980, 190100)
Comparison with other resources
Comparison of PhenUMA with other tools
Phenotypic similarity method
PhenUMA aims to integrate information using network-based methods, and GeneMANIA is a useful example of the integration of biomolecular data. This web interface generates gene networks based on many different types of relationships such as protein and genetic interactions, pathways, coexpression, colocalization and protein domain similarities. However, in addition to functional interactions, PhenUMA also includes the pathological and phenotypic relationships between genes, as shown in Table 3. Other tools, such as MalaCards, integrate the pathological and functional information related to human diseases by supplying an extensive repository of different information, where mouse phenotypes are used instead of human phenotypes. Two notable tools that integrate phenotypic information are Phenomizer and PhenomeNET, but neither tool is specifically designed to integrate this information with biomolecular data, which is required for an extensive systemic analysis. Phenomizer demonstrates the potential benefits of semantic- and ontology-based methods when they are applied to the systematic diagnosis of diseases; these features were also included in PhenUMA. PhenomeNET is another tool that allows users to retrieve the semantic similarities between a single OMIM/Orphan disease, gene or phenotype and other genes or diseases, including cross-species information, and it uses Jaccard's index to calculate phenotypic similarity. In contrast, the similarity score between diseases calculated by MalaCards, named the MalaCards Composite Related Diseases Score (MCRDS), combines an enrichment analysis of disease descriptors with different search engine ranks. The resulting ranked scores in MalaCards are also used to build disease networks based on their shared disease descriptors, but MalaCards uses murine phenotypes instead of human phenotypes.
Phenotypic enrichment of OMIM diseases similar to SSADH Deficiency (OMIM 271980)
Bonferroni corrected P-values
Generalized myoclonic seizures
Generalized tonic-clonic seizures
Delayed speech and language development
Increased body weight
Abnormality of eye movement
Abnormality of metabolism/homeostasis
PhenUMA gives a significant enrichment of status epilepticus in the top 10 and top 50 ranked diseases, while no significant enrichment was found for Phenomizer and PhenomeNET. Consequently, the diseases most phenotypically similar to SSADH deficiency are also associated with status epilepticus in PhenUMA. In addition, of the 22 phenotypes annotated for SSADH deficiency, 9 are significant in the top 50 similar diseases retrieved by our system (Table 4). In contrast, Phenomizer and PhenomeNET have only 4 and 5 phenotypes with a P-value below 0.05, respectively. Interestingly, there is a gradual enrichment of specific phenotypes in PhenUMA and Phenomizer as we constrain the conditions from the top 50 to the top 10 (Table 4). In contrast, the enrichment in PhenomeNET yields phenotypes with low information content (IC) values.
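To make the idea of a Bonferroni-corrected phenotypic enrichment concrete, the sketch below tests whether a phenotype is over-represented among the top-ranked diseases. The text does not state which statistical test was used, so the one-sided hypergeometric test here is an assumption, and the counts and P-values are invented for illustration only.

```python
from math import comb


def hypergeom_upper_tail(k, population, annotated, drawn):
    """P(X >= k) when `drawn` diseases are taken from `population`, `annotated` of which carry the phenotype."""
    total = comb(population, drawn)
    upper = min(annotated, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / total


def bonferroni(p_values):
    """Multiply each P-value by the number of tests, capped at 1."""
    n_tests = len(p_values)
    return {name: min(1.0, p * n_tests) for name, p in p_values.items()}


# Hypothetical counts: 1000 diseases overall, 20 annotated with a phenotype,
# and 4 of the 10 top-ranked diseases carry it.
p_raw = hypergeom_upper_tail(4, population=1000, annotated=20, drawn=10)
# Two invented raw P-values for other phenotypes, added to show the correction.
corrected = bonferroni({"phenotype_1": p_raw, "phenotype_2": 0.03, "phenotype_3": 0.6})
```

With three tests, a raw P-value of 0.03 becomes 0.09 after correction and is no longer below 0.05, which shows why the number of tested phenotypes matters when comparing tools.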
PhenUMA provides an integrative framework for biomedical and biomolecular relationships among genes and genetic diseases by combining network methods and semantic similarity calculations. This integration process uses pathological and functional information from different databases, inferences of already known relationships and computed semantic similarities using biomedical ontologies (HPO and GO), as shown in Table 1. To achieve this goal, PhenUMA uses several biocomputational technologies to unify apparently unconnected information in the same platform. One of the primary applications of this platform is to explore how disease-associated genes are phenotypically and functionally associated. PhenUMA was shown to be useful for discovering novel pathological relationships between genes and as a new way to study groups of diseases based on the similarity of their phenotypic profiles. These phenotypic similarity relationships are strongly dependent on the ontology structure and the threshold selection. The Human Phenotype Ontology is a standardized platform with recognized clinical value, but the selection of an optimal threshold requires reference datasets to assess the precise significance of the similarity score. In PhenUMA, we set a score for semantic similarity that is suitable to detect implicit relationships in databases. The reference datasets used here were built from the inferred relationships (the union of the sets Inferred IN and Inferred OUT of Figure 4) of disease or gene pairs from OMIM or Orphanet that share at least one disease or one gene, respectively. Each type of inference has a different biomedical meaning. For example, an inferred relationship between two disorders, where both present genetic variations associated with the same gene, might indicate a potential functional dependence between these pathologies and the molecular mechanisms involving this gene.
If these disorders are phenotypically similar, this supports the hypothesis that perturbations in this gene will produce similar clinical features. Therefore, the resulting thresholds for phenotypically similar diseases are the minimal scores that distinguish disease pairs that are potentially related to the same molecular background. On the other hand, an inferred relationship between genes suggests that both genes could be part of close functional modules. Therefore, mutations in these genes may channel the effects of perturbations to cause the same clinical features. The resulting optimal threshold is useful for determining the minimal similarity score for two genes that may be involved in the same pathological processes.
Our analysis provides evidence that Robinson's measurement, which uses the entire phenotypic profile of disorders to calculate similarities between genes and diseases, performs better than the classical Resnik's measurement (Additional file 1: Figure S1). A higher similarity score implies a higher phenotypic specificity between gene and disease pairs. Robinson's measure conserves more information (Figure 2A), and the resulting networks are more similar to the reference datasets (Figure 2B). In addition, PhenUMA provides more confident phenotypic similarities between OMIM diseases than do other similar systems, such as PhenomeNET (Figure 6A and B). To compute similarity scores, both systems use the entire phenotypic profile of OMIM diseases instead of the most specific phenotype in the relationship. This means that the entire phenotypic profile of a disease is more informative than the most specific phenotype, reinforcing the need for deep phenotyping. Our system also has a lower false positive rate than PhenomeNET (Figure 6B). A possible explanation for these differences is that PhenomeNET uses cross-species information, which may influence the similarity scores.
Furthermore, we also used a case study of SSADH deficiency to show how phenotypic similarity generates comprehensive clusters of diseases in PhenUMA (Figure 5). The resulting phenotypic enrichments of OMIM diseases ranked by their similarity to SSADH deficiency are quite different for PhenUMA and Phenomizer compared to PhenomeNET. For instance, PhenUMA and Phenomizer, which use the same similarity measures, are more significantly enriched with the clinical features associated with SSADH deficiency than PhenomeNET (Table 4). Our results suggest that clusters of phenotypically similar diseases are more coherent in PhenUMA than in other current similar systems.
Our assessment of the integration of functional and phenotypic relationships was based on a network comparison and correlation analysis of distinct subsets of pairs of genes. In general, phenotypic similarity clusters genes that interact in close molecular and cellular biological conditions. While it remains difficult to systematically distinguish between meaningful relationships and background noise, the phenotypic similarity gene network is significantly enriched with functional interactions. For instance, the resulting network of gene pairs from the “Novel subset” is coherent and abundant in functional interactions, especially for protein-protein interactions and functional similarities in biological process (see Additional file 1). In general, protein-protein interactions and pairs of genes with similar cellular localizations likely give more direct evidence for the inferred pathological relationships, as observed for the “Inferred IN” and “Inferred OUT” subsets (see Additional file 1). Notably, these results may be influenced by a biomedical research bias, especially for genes that are associated with the same genetic disease. Nevertheless, PhenUMA includes the option to filter results with the highest semantic similarity by offering a range of specificity of interactions between genes or diseases. Future improvements to this feature will be needed to extend the validity and the variety of biological interactions.
In conclusion, PhenUMA integrates clinical and biomolecular information to supply wider insights into the phenotypic and molecular characteristics of pathological processes. This tool can help clinical and basic researchers to reinterpret their results and to redesign experiments by considering elements that appear unrelated a priori. PhenUMA users can download detailed tutorials and stored networks from the knowledge base on the website. Feedback from end users, including comments and criticisms, will be considered for future improvements of this tool.
Availability and requirements
Project name: PhenUMA
Project home page: http://www.phenuma.uma.es
Operating system(s): platform independent
Programming language: Java
RRL and ARP conceived this project. RRL and ARP wrote this paper. ARP performed the data analysis and approaches evaluation. RRL and ARP designed the database and the tool. RRL implemented the database and the tool. FSJ and MAM supervised this work. All authors read and approved the manuscript.
The authors thank PN Robinson, S. Köhler and S. Bauer for clarifying and providing details on how to associate phenotypes with genes and OMIM diseases. The authors also thank AR Palomares, JR Perkins and JAG Ranea for useful comments and suggestions.
This work is one of the activities for the Platform “Bioinformática para Enfermedades Raras” of CIBERER, which is an initiative of ISCIII.
This work was funded by CIBERER, contract AMER (CDTI, MINECO, Spain), and Grants SAF2011-26528 (MEC, Spain), CVI-06585 (Junta de Andalucia and FEDER) and PS09/02216 (MEC, ISCIII and FEDER).
- Robinson PN: Deep phenotyping for precision medicine. Hum Mutat. 2012, 33: 777-780. 10.1002/humu.22080.
- Girdea M, Dumitriu S, Fiume M, Bowdin S, Boycott KM, Chénier S, Chitayat D, Faghfoury H, Meyn MS, Ray PN, So J, Stavropoulos DJ, Brudno M: PhenoTips: patient phenotyping software for clinical and research use. Hum Mutat. 2013, 34: 1057-1065. 10.1002/humu.22347.
- Hamosh A, Sobreira N, Hoover-Fong J, Sutton VR, Boehm C, Schiettecatte F, Valle D: PhenoDB: a new web-based tool for the collection, storage, and analysis of phenotypic features. Hum Mutat. 2013, 34: 566-571.
- Schofield PN, Hancock JM: Integration of global resources for human genetic variation and disease. Hum Mutat. 2012, 33: 813-816. 10.1002/humu.22079.
- Baker M: Big biology: the 'omes puzzle. Nature. 2013, 494: 416-419. 10.1038/494416a.
- Online Mendelian Inheritance in Man, OMIM®. McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University (Baltimore, MD), [http://www.omim.org]
- Orphanet: an online rare disease and orphan drug data base. © INSERM 1997, [http://www.orpha.net]
- Rath A, Olry A, Dhombres F, Brandt MM, Urbero B, Ayme S: Representation of rare diseases in health information systems: the Orphanet approach to serve a wide range of end users. Hum Mutat. 2012, 33: 803-808. 10.1002/humu.22078.
- Firth HV, Richards SM, Bevan AP, Clayton S, Corpas M, Rajan D, Van Vooren S, Moreau Y, Pettett RM, Carter NP: DECIPHER: database of chromosomal imbalance and phenotype in humans using ensembl resources. Am J Hum Genet. 2009, 84: 524-533. 10.1016/j.ajhg.2009.03.010.
- Robinson PN, Mundlos S: The human phenotype ontology. Clin Genet. 2010, 77: 525-534. 10.1111/j.1399-0004.2010.01436.x.
- Mistry M, Pavlidis P: Gene ontology term overlap as a measure of gene functional similarity. BMC Bioinformatics. 2008, 9: 327. 10.1186/1471-2105-9-327.
- Vidal M, Cusick ME, Barabási A-L: Interactome networks and human disease. Cell. 2011, 144: 986-998. 10.1016/j.cell.2011.02.016.
- Reyes-Palomares A, Rodríguez-López R, Ranea JAG, Sánchez Jiménez F, Medina MA: Global analysis of the human pathophenotypic similarity gene network merges disease module components. PLoS One. 2013, 8: e56653. 10.1371/journal.pone.0056653.
- Szklarczyk D, Franceschini A, Kuhn M, Simonovic M, Roth A, Minguez P, Doerks T, Stark M, Muller J, Bork P, Jensen LJ, von Mering C: The STRING database in 2011: functional interaction networks of proteins, globally integrated and scored. Nucleic Acids Res. 2011, 39 (Database issue): D561-D568. 10.1093/nar/gkq973.
- Veeramani B, Bader JS: Metabolic flux correlations, genetic interactions, and disease. J Comput Biol. 2009, 16: 291-302. 10.1089/cmb.2008.14TT.
- Lopes CT, Franz M, Kazi F, Donaldson SL, Morris Q, Bader GD: Cytoscape Web: an interactive web-based network browser. Bioinformatics. 2010, 26: 2347-2348. 10.1093/bioinformatics/btq430.
- Orphadata: Free access data from Orphanet. © INSERM 1997, [http://www.orphadata.org]
- Bauer S, Grossmann S, Vingron M, Robinson PN: Ontologizer 2.0--a multifunctional tool for GO term enrichment analysis and data exploration. Bioinformatics. 2008, 24: 1650-1651. 10.1093/bioinformatics/btn250.
- Resnik P: Using information content to evaluate semantic similarity in a taxonomy. IJCAI. 1995, 1: 448-453.
- Lord PW, Stevens RD, Brass A, Goble CA: Investigating semantic similarity measures across the gene ontology: the relationship between sequence and annotation. Bioinformatics. 2003, 19: 1275-1283. 10.1093/bioinformatics/btg153.
- Xu T, Du L, Zhou Y: Evaluation of GO-based functional similarity measures using S. cerevisiae protein interaction and expression profile data. BMC Bioinformatics. 2008, 9: 472. 10.1186/1471-2105-9-472.
- Sevilla JL, Segura V, Podhorski A, Guruceaga E, Mato JM, Martinez-Cruz LA, Corrales FJ, Rubio A: Correlation between gene expression and GO semantic similarity. IEEE/ACM Trans Comput Biol Bioinform. 2005, 2: 330-338. 10.1109/TCBB.2005.50.
- Robinson PN, Köhler S, Bauer S, Seelow D, Horn D, Mundlos S: The human phenotype ontology: a tool for annotating and analyzing human hereditary disease. Am J Hum Genet. 2008, 83: 610-615. 10.1016/j.ajhg.2008.09.017.
- Köhler S, Schulz MH, Krawitz P, Bauer S, Dölken S, Ott CE, Mundlos C, Horn D, Mundlos S, Robinson PN: Clinical diagnostics in human genetics with semantic similarity searches in ontologies. Am J Hum Genet. 2009, 85: 457-464. 10.1016/j.ajhg.2009.09.003.
- Warde-Farley D, Donaldson SL, Comes O, Zuberi K, Badrawi R, Chao P, Franz M, Grouios C, Kazi F, Lopes CT, Maitland A, Mostafavi S, Montojo J, Shao Q, Wright G, Bader GD, Morris Q: The GeneMANIA prediction server: biological network integration for gene prioritization and predicting gene function. Nucleic Acids Res. 2010, 38 (Web Server issue): W214-W220. 10.1093/nar/gkq537.
- Rappaport N, Nativ N, Stelzer G, Twik M, Guan-Golan Y, Iny Stein T, Bahir I, Belinky F, Morrey CP, Safran M, Lancet D: MalaCards: an integrated compendium for diseases and their annotation. Database (Oxford). 2013, 2013: bat018. 10.1093/database/bat018.
- Hoehndorf R, Schofield PN, Gkoutos GV: PhenomeNET: a whole-phenome approach to disease gene discovery. Nucleic Acids Res. 2011, 39: e119. 10.1093/nar/gkr538.
- Goh K-I, Cusick ME, Valle D, Childs B, Vidal M, Barabási A-L: The human disease network. Proc Natl Acad Sci U S A. 2007, 104: 8685-8690. 10.1073/pnas.0701361104.
- Adie EA, Adams RR, Evans KL, Porteous DJ, Pickard BS: SUSPECTS: enabling fast and effective prioritization of positional candidates. Bioinformatics. 2006, 22: 773-774. 10.1093/bioinformatics/btk031.
- Wang J, Zhou X, Zhu J, Zhou C, Guo Z: Revealing and avoiding bias in semantic similarity scores for protein pairs. BMC Bioinformatics. 2010, 11: 290. 10.1186/1471-2105-11-290.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.