HomoMINT: an inferred human network based on orthology mapping of protein interactions discovered in model organisms
© Persico et al; licensee BioMed Central Ltd 2005
Published: 1 December 2005
The application of high throughput approaches to the identification of protein interactions has offered for the first time a glimpse of the global interactome of some model organisms. Until now, however, such genome-wide approaches have not been applied to the human proteome.
In order to fill this gap we have assembled an inferred human protein interaction network where interactions discovered in model organisms are mapped onto the corresponding human orthologs. In addition to a stringent assignment to orthology classes based on the InParanoid algorithm, we have implemented a string matching algorithm to filter out orthology assignments of proteins whose global domain organization is not conserved. Finally, we have assessed the accuracy of our own, and related, inferred networks by i) benchmarking them against an assembled experimental interactome, ii) benchmarking them against a network derived by mining the scientific literature and iii) measuring the enrichment of interacting protein pairs sharing common Gene Ontology annotation.
The resulting networks are named HomoMINT and HomoMINT_filtered, the latter being based on the orthology table filtered by the domain architecture matching algorithm. They contain 9749 and 5203 interactions respectively and can be analyzed and viewed in the context of the experimentally verified interactions between human proteins stored in the MINT database. HomoMINT is constantly updated to take into account the growing information in the MINT database.
The dynamic assembly of stable or transient protein complexes regulates cell physiology by presiding over basic cell functions. In principle, if we knew the kinetic details of the interaction between any macromolecule in a cell, as well as the concentration of each player, we could start thinking about modeling a virtual cell in order to understand, or infer, its response to any given stimulus.
Regrettably we are very far from this level of understanding of the interactions within a cell proteome. In recent years, however, high throughput approaches based on the yeast two hybrid [1] and TAP TAG [2] methods have provided for the first time a genome-wide perspective of the interactome of simple model organisms such as H. pylori [3], E. coli [4], S. cerevisiae [5–8], C. elegans [9] and D. melanogaster [10, 11]. Comparative analysis of comprehensive experiments conducted by different groups, using similar or orthogonal approaches, has led to the recognition that the available interactomes are noisy and largely incomplete [12]. Nevertheless this remarkable experimental effort has put us in a position to analyze the interactomes' broad structure and to start mapping, in these complex protein meshes, the pathway representation we are used to. Unfortunately no such high-throughput data are yet available for the human proteome, while genome-wide approaches aimed at the elucidation of the human interactome are only at their inception. However, assuming that functional protein interactions are conserved in evolution, one can consider extending the experimentally determined human protein interaction network by using data from the model organism protein interaction datasets. This can be achieved by transferring the interaction information from each organism to the human proteome and requires the identification of genes that have a common ancestor and share the same function in the two organisms (orthologs). Lehner and Fraser [13] have used the InParanoid algorithm [14] to infer a network of over 70000 interactions between 6200 human proteins using data from the yeast, fly and worm interactomes. More recently Brown and Jurisica [15] have developed OPHID, a web-based database containing 23359 predicted interactions between human proteins. OPHID was assembled by mapping model organism PPIs to human orthologs using BLASTP and the reciprocal best hit approach.
Here we present HomoMINT containing 9749 inferred interactions between 4125 human proteins. We also used the InParanoid algorithm to assign proteins to orthology groups. Whenever two proteins shown to interact in model organisms could be confidently assigned to orthology groups containing a human ortholog, the corresponding main human orthologs (not paralogs) are included in the inferred HomoMINT network. HomoMINT is essentially an 'orthology table' in the MINT database. Thus the inferred network can be freely and conveniently analyzed in the context of the MINT protein interaction data with the aid of the MINT search and analysis tools. HomoMINT is updated daily to take into account the growing number of interactions that are curated each day in the MINT database.
Our strategy starts by assigning proteins to orthology groups having a human protein as the main ortholog. An interaction between human proteins is then inferred if both partners of an interaction experimentally verified in model organisms have at least one human ortholog.
Similarly to Lehner and Fraser [13], we have used the InParanoid algorithm to assemble orthology groups. This algorithm has the potential to distinguish between out-paralogs, homologous genes that arose by duplication before the speciation event (and are unlikely to share function), and in-paralogs, which arose after speciation. However, to avoid unnecessary graphical overcrowding, in the resulting inferred human network (HomoMINT) we have only included interactions between the main human orthologs of each orthology group. An extended network in which the model organism interactions are mapped to all the possible combinations of in-paralogs is also available (HomoMINT_extended). Since InParanoid attributes a score to each orthology assignment, it is relatively easy to obtain different inferred networks using orthology tables with varying levels of stringency for assignment to orthology classes.
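The inference rule described above can be illustrated with a minimal Python sketch. It is not the code behind HomoMINT: the function name, the dictionary layout of the orthology table and the protein identifiers are all hypothetical, chosen only to show how an interaction is transferred when both partners have a human ortholog, and how the extended network differs from the main one.

```python
from itertools import product


def infer_human_interactions(model_interactions, orthology, extended=False):
    """Map model-organism interactions onto human orthologs.

    model_interactions: iterable of (protein_a, protein_b) pairs.
    orthology: dict mapping a model-organism protein to a dict with
        'main' (main human ortholog) and 'inparalogs' (list of human
        in-paralogs). This layout is an assumption made for illustration.
    extended: if True, map to all combinations of in-paralogs as well.
    """
    inferred = set()
    for a, b in model_interactions:
        # Both partners must have at least one human ortholog.
        if a not in orthology or b not in orthology:
            continue
        humans_a = [orthology[a]["main"]]
        humans_b = [orthology[b]["main"]]
        if extended:
            humans_a += orthology[a]["inparalogs"]
            humans_b += orthology[b]["inparalogs"]
        for ha, hb in product(humans_a, humans_b):
            # frozenset makes the inferred edge undirected.
            inferred.add(frozenset((ha, hb)))
    return inferred


# Toy data with made-up identifiers.
toy_orthology = {
    "yA": {"main": "H1", "inparalogs": ["H1b"]},
    "yB": {"main": "H2", "inparalogs": []},
}
toy_interactions = [("yA", "yB"), ("yA", "yC")]  # yC has no human ortholog

main_network = infer_human_interactions(toy_interactions, toy_orthology)
extended_network = infer_human_interactions(
    toy_interactions, toy_orthology, extended=True
)
```

In this toy case the main network holds the single edge H1-H2, while the extended network also contains H1b-H2.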
In addition we have tuned the orthology assignments by imposing the condition that proteins in the same orthology group must have the same domain architecture. This filtering step evaluates the overall protein similarity and eliminates any incongruity caused by the local nature of the BLAST algorithm. Motivated by the observation that multidomain proteins, sharing an exact domain architecture, have significantly higher functional conservation [17, 18], we developed a workflow (see Methods) to produce a "high confidence" orthology table in which all orthology group members share the same domain architecture. This filtering procedure improves the functional coherence within the orthology groups (see Methods) while removing only 10% of the 16531 inferred groups. We call the resulting network HomoMINT_filtered.
HomoMINT as a web server
In the latter case one obtains, as a result of the query, both the experimentally verified interactions and the inferred ones. Appropriate links make it possible to retrieve information about the experiments supporting the interaction either directly (experiments carried out with human proteins) or indirectly (experiments carried out in model organisms) (Fig. 1B).
During any MINT search session it is possible to extend the analysis to HomoMINT, by clicking the HomoMINT hyperlink. The composition of the orthology groups used to infer the human interactions can also be inspected via the 'orthology table' hyperlink. A distinction is made between main orthologs (orthologs) and co-orthologs (in-paralogs).
Finally the HomoMINT network can be analyzed, expanded and edited in the context of the experimentally verified protein interactions in the MINT database by using the MINT viewer tool (Fig. 1C). For instance the MINT viewer makes it possible, by checking appropriate boxes, to visualize only interactions inferred from any combination of model organism interactomes. The network visualized and edited by the viewer tool can be downloaded in any of three formats: flat file, XML PSI [19], or a format that can be used as input for the OSPREY visualization software [20].
Intersection of HomoMINT with the Human experimental network
Table: Intersection of human interactomes in public databases (column: Nr. of edges).
Table: Inferred and experimental networks compared in this work (columns: number of interactions; description or reference). Row descriptions:
- HomoMINT filtered for domain architecture conservation.
- Inferred from interactions confirmed by at least two experiments.
- Inferred from interactions supported by experiments in at least two model organisms.
- Inferred from interactions discovered by low throughput experiments.
- Compilation of interactions between human proteins.
where P(I|D) and P(~I|D) are the frequencies of interactions, in a given dataset (D), that are or are not observed in the benchmark dataset (I), while P(I) and P(~I) represent the prior expectations (the frequency of all benchmark gene pairs that do or do not interact).
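The definitions above are consistent with a log likelihood ratio of the form ln[(P(I|D)/P(~I|D)) / (P(I)/P(~I))], as used for instance by Lee et al. [26]. That this is the exact expression intended here, and that the natural logarithm is used, are assumptions. Under those assumptions, a minimal Python sketch of the score computed from raw counts could look as follows (the function and argument names are hypothetical):

```python
import math


def log_likelihood_score(n_dataset_in_benchmark, n_dataset_not_in_benchmark,
                         n_benchmark_interacting, n_benchmark_non_interacting):
    """Log of posterior odds over prior odds (assumed form of the score).

    The first two counts give P(I|D) / P(~I|D) for a dataset D; the last
    two give the prior odds P(I) / P(~I) over all benchmark gene pairs.
    """
    counts = (n_dataset_in_benchmark, n_dataset_not_in_benchmark,
              n_benchmark_interacting, n_benchmark_non_interacting)
    if min(counts) <= 0:
        raise ValueError("all counts must be positive for the ratio to be defined")
    posterior_odds = n_dataset_in_benchmark / n_dataset_not_in_benchmark
    prior_odds = n_benchmark_interacting / n_benchmark_non_interacting
    return math.log(posterior_odds / prior_odds)


# Made-up counts: half of the dataset pairs are found in the benchmark,
# while only 10% of all benchmark pairs interact.
example_score = log_likelihood_score(50, 50, 10, 90)
```

A score above zero means that the dataset is enriched in benchmark interactions relative to the prior expectation; a score of zero means no enrichment.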
Overlap between inferred and experimental human networks
Intersection of HomoMINT with the iHOP resource
PubMed, which contains more than 15 million biomedical abstracts, is a valuable source of high quality protein interactions. As a whole, proteins co-occurring in PubMed sentences can be modeled as a literature network, which can be superimposed on experimental interaction data or on putative relationships, making it possible to compare new and existing knowledge. Here we have made use of a novel text-mining resource, called iHOP (Information Hyperlinked over Proteins) [28], as an independent assessment of the protein interactions predicted in HomoMINT. The iHOP system currently contains 6 million sentences from PubMed abstracts and about 40000 different proteins from human, mouse, and other common animal models (iHOP, http://www.pdg.cnb.uam.es/UniPub/iHOP/).
Table: Overlap of the inferred and experimental human networks with iHOP (column: Nr. of edges).
For this comparison we mapped all the proteins to LocusLink IDs. In this process proteins (and their interactions) that could not be confidently mapped were eliminated from the networks. For this reason, H_MINT in Table 4 contains 7658 interactions.
Interacting proteins sharing GO terms
HomoMINT as a graph
Several databases, using a variety of computational methods to make inferences about functional relationships between genes and proteins, are available on the web [32–35]. HomoMINT is an inferred human protein network obtained by transferring the experimental interaction annotation from the proteome of seven model organisms to the corresponding ortholog human proteins. The orthology mapping is obtained by means of the InParanoid algorithm.
Approximately one fifth of the interactions present in the MINT database could be mapped to human orthologs thus resulting in the assembly of an inferred network linking 4125 human proteins with 9749 edges. While a large proportion of these proteins are not functionally annotated one can use HomoMINT to transfer functional information from better characterized neighbors in the graph.
Because of evolutionarily frequent molecular processes leading to gene family expansion or contraction, the transfer of interaction information between organisms, especially higher eukaryotes, is complicated by the abundance of paralogs in orthology groups. The InParanoid algorithm is designed to distinguish paralogs arising before or after speciation events. We have chosen to transfer the interaction information only to the main human ortholog in each group. Thus our inferred network is essentially based on orthology mapping by the reciprocal best hit approach. However, the orthology groups assembled in our web available table contain paralogs, thus permitting any alternative choice. Furthermore, since the InParanoid algorithm provides a confidence score for each orthology assignment, the likelihood of the inferred interactions can be evaluated from the confidence score of the model organism and human gene orthology assignment, as proposed for instance by Lehner and Fraser [13].
To assess the predictive value of HomoMINT, we performed a number of tests aimed at assessing to what degree of accuracy and coverage the orthology based inferred networks could be supported by previous knowledge. We first assembled a human experimental network from the protein interaction data stored in PPI databases and determined the percentage overlap between this network and HomoMINT or related networks. Next, we estimated the enrichment in the inferred networks of interacting proteins sharing Gene Ontology annotation. Finally we estimated the overlap between the inferred networks and the iHOP literature network.
Our approach is based on the assumption that protein interactions between ortholog proteins are conserved in evolution. To what extent this is true cannot at present be estimated because of the incompleteness and inaccuracy of the available experimental datasets. Even hypothesizing that the assumption is 100% correct, the accuracy and coverage of the inferred network are still limited by the quality of the original model organism interaction datasets and by our ability to identify the true human orthologs of a model organism protein. Not surprisingly, our benchmark tests show that accuracy increases if one uses more stringent criteria for orthology assignment (for instance by only allowing orthologs with similar modular architecture) or if one bases the inference on a more reliable interaction dataset (for instance relying on multiple evidence).
In contrast with similar projects [13, 15, 37], HomoMINT is unique for its direct link to a curated PPI database. HomoMINT is a calculated section in the MINT relational database and its content is updated daily to take into account the newly curated entries in the MINT database. Furthermore the MINT viewer makes it possible to analyze and edit the HomoMINT network in the context of the experimentally verified interactions deposited in the MINT database. HomoMINT can be searched and analyzed at http://mint.bio.uniroma2.it/mint/search/search.php?dataset=homomint. The HomoMINT dataset is available either as a flat file or a PSI XML file (see Additional file 1 and Additional file 2 for details). Each file contains all the interactions inferred from model organism proteins and mapped onto the main human orthologs.
The latest release files can be obtained at http://mint.bio.uniroma2.it/mint/release/main.php after filling in the requested fields.
Since it is not clear what percentage of PPIs is conserved through evolution, HomoMINT should be considered as a hypothetical network that can be of use in predicting functions of yet uncharacterized proteins, in making experimentally testable hypotheses about new participants in well studied pathways and in prioritizing interactions to be tested in large scale PPI experiments. As such, the network should provide a rich source of functional hypotheses for researchers interested in the functions of one or many human proteins.
BLASTP searches were carried out using blastall 2.2.9 [38].
InParanoid algorithm version 1.35 was downloaded from: http://inparanoid.cgb.ki.se/index.html.
The proteome sets for the BLAST searches and ortholog table assembling were downloaded or built from the following sources:
- Arabidopsis thaliana proteome set (predicted proteins), http://www.ebi.ac.uk/integr8/FtpSearch.do?orgProteomeID=3
- Caenorhabditis elegans (predicted proteins), http://www.ebi.ac.uk/integr8/FtpSearch.do?orgProteomeID=9
- Drosophila melanogaster (predicted proteins), http://www.ebi.ac.uk/integr8/FtpSearch.do?orgProteomeID=17
- Escherichia coli K12 (predicted proteins), http://www.ebi.ac.uk/integr8/FtpSearch.do?orgProteomeID=18
- Homo sapiens (predicted proteins), http://www.ebi.ac.uk/integr8/FtpSearch.do?orgProteomeID=25
- Mus musculus (predicted proteins), http://www.ebi.ac.uk/integr8/FtpSearch.do?orgProteomeID=59
- Rattus norvegicus (predicted proteins), http://www.ebi.ac.uk/integr8/FtpSearch.do?orgProteomeID=122
- Saccharomyces cerevisiae (predicted proteins), http://www.ebi.ac.uk/integr8/FtpSearch.do?orgProteomeID=40
- Multiple species proteome set (predicted proteins), http://mint.bio.uniroma2.it/mint/ by querying the database for proteins belonging to the following species: Sus scrofa (Pig), Xenopus laevis (African clawed frog), Ovis aries (Sheep), Oryctolagus cuniculus (Rabbit), Gallus gallus (Chicken), Canis familiaris (Dog), Bos taurus (Bovine).
Assembly of the orthology table
The procedure implemented in the InParanoid algorithm [14] starts with an all-against-all BLASTP comparison between two proteomes of interest. Reciprocal best hit criteria are used to identify orthologous relationships between pairs of proteins. For each putative ortholog, probable recent paralogs or in-paralogs are identified as sequences within the same proteome that are reciprocally more similar to each other than either is to any sequence from the other proteome.
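The reciprocal best hit step can be sketched in a few lines of Python. This is only an illustration of the criterion, not InParanoid's own code, and it leaves out the in-paralog clustering that follows. The score dictionaries (keyed by query and subject, with a similarity score as value) and all identifiers are assumptions made for the example.

```python
def best_hits(scores):
    """Return the best-scoring subject for every query.

    scores: dict mapping (query, subject) -> similarity score, e.g. a
    BLASTP bit score (hypothetical input layout).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Toy scores with made-up identifiers: h* are human, y* are yeast.
human_vs_yeast = {("h1", "y1"): 200, ("h1", "y2"): 150, ("h2", "y2"): 90}
yeast_vs_human = {("y1", "h1"): 198, ("y2", "h1"): 140, ("y2", "h2"): 95}

rbh_pairs = reciprocal_best_hits(human_vs_yeast, yeast_vs_human)
```

Here h2 finds y2 as its best hit, but y2 scores higher against h1, so only the pair (h1, y1) satisfies the reciprocal criterion.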
An InParanoid confidence level cut-off of 0.6 was chosen for the assignment of in-paralogs to orthology groups. Due to the redundancy of the starting proteome sets, several groups contained identical copies of the same protein. To limit this problem we decided to eliminate paralogs with InParanoid confidence level above 0.98. InParanoid performs its comparison between each pair of proteomes. To build an orthology table with orthology groups including proteins from all organisms of interest, we used Python scripts to merge the InParanoid results, keeping a human protein as reference for each orthology group.
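The merging step can be pictured with the following Python sketch. The original scripts are not shown in the text, so everything here is hypothetical: the layout of the pairwise results, the function names, and the choice to treat both cut-offs as inclusive bounds. The sketch only illustrates keying every group on its human reference protein and applying the 0.6 and 0.98 confidence limits to in-paralogs.

```python
from collections import defaultdict


def keep_inparalog(confidence, low=0.6, high=0.98):
    """Keep in-paralogs within the confidence window (bounds assumed inclusive)."""
    return low <= confidence <= high


def merge_pairwise_tables(pairwise_tables):
    """Merge human-vs-species results into one table keyed by human protein.

    pairwise_tables: dict species -> list of groups, each group a dict with
        'human' (main human ortholog), 'ortholog' (main ortholog in the
        other species) and 'inparalogs' (list of (protein, confidence)).
    """
    merged = defaultdict(dict)
    for species, groups in pairwise_tables.items():
        for group in groups:
            kept = [protein for protein, confidence in group["inparalogs"]
                    if keep_inparalog(confidence)]
            merged[group["human"]][species] = {
                "main": group["ortholog"],
                "inparalogs": kept,
            }
    return dict(merged)


# Toy pairwise results with made-up identifiers and confidence values.
toy_tables = {
    "yeast": [{"human": "H1", "ortholog": "Y1",
               "inparalogs": [("Y1b", 0.7), ("Y1c", 0.3), ("Y1dup", 1.0)]}],
    "fly": [{"human": "H1", "ortholog": "F1", "inparalogs": []},
            {"human": "H2", "ortholog": "F2", "inparalogs": [("F2b", 0.99)]}],
}

orthology_table = merge_pairwise_tables(toy_tables)
```

In the toy data, Y1c falls below the lower cut-off and Y1dup above the upper one (a likely redundant copy), so only Y1b is retained as an in-paralog of the H1 group.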
Assembling HEN (Human Experimental Network)
The human experimental interactome has been assembled by importing the data into a PostgreSQL database from the following resources:
- IntAct (XML PSI files), 1300 unique interactions at http://www.ebi.ac.uk/intact/index.jsp
- DIP (flat file), 833 unique interactions at http://dip.doe-mbi.ucla.edu/
- BIND (XML PSI 2 file), 4073 unique interactions at http://bind.ca/
- MINT, 3679 unique interactions at http://mint.bio.uniroma2.it/mint/
- HPRD (XML PSI file), 6153 unique interactions at http://www.hprd.org/
- MIPS (XML PSI file), 322 unique interactions at http://mips.gsf.de/proj/ppi/
Only interactions that could be confidently mapped to UniProt IDs were added to HEN.
Filtering orthology groups for domain architecture homogeneity
A procedure has been developed to improve and to measure the functional coherence in orthology groups, based on dynamic programming techniques and implemented as a string matching algorithm [39].
We modeled every protein in our orthology groups as an ordered string of domains. To this end, we used the domain annotations available in SMART [40] and PFAM [41]. In particular, the human and the other eight model organism proteomes under analysis have been surveyed for their specific domain architectures. Repetitions of the same domain are treated as a single instance of that domain. Overlapping domains are considered as independent elements of the string representing the domain architecture of the protein.
Then we developed a Perl string matching algorithm to establish distances between the proteins in terms of similarities between their domain architectures. Each protein is represented as a string of concatenated ordered domains. Thus we were able to measure a distance between two proteins by counting the number of domain editing steps (deletions, insertions, substitutions) needed to match the domain architectures of the two proteins. Proteins identical in their domain architecture will have an "edit distance" equal to zero. Distances are normalized by dividing by the total number of domains in the ortholog human protein.
This procedure prevents proteins with markedly different domain architecture (and function) from being clustered mistakenly in a group when they share similarity only within distinct regions of a multidomain protein. In this way we tried to take into account not only local relationships among sequences to be merged in the orthology groups but global relationships as well.
To assess the filtering procedure we examined the consistency of the annotation of the members within each orthology group, as reported in the ENZYME database [42]. We were able to attribute at least two ENZYME annotations to 9% of groups constituting the filtered orthology table. Fewer than 6% of these groups (77/1355) were declared inconsistent with the ENZYME hierarchic classification scheme. 17 inconsistent groups present in the standard orthology table were not present in the filtered orthology table, underlining the improvement of the functional coherence in the orthology groups after filtering for similarity in domain architectures. The number of inconsistent groups in the standard orthology table was 94 out of 1396 groups which have at least two ENZYME annotations.
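The distance described above can be sketched in Python (the original implementation is in Perl and is not reproduced in the text). Two points are assumptions made for this sketch: repetitions are collapsed only when they are consecutive, and the normalization uses the number of human domains after collapsing. Domain names and function names are illustrative.

```python
def collapse_repeats(domains):
    """Treat consecutive repetitions of a domain as a single instance."""
    collapsed = []
    for domain in domains:
        if not collapsed or collapsed[-1] != domain:
            collapsed.append(domain)
    return collapsed


def edit_distance(a, b):
    """Minimum number of domain deletions, insertions and substitutions
    needed to turn architecture a into architecture b (dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, domain_a in enumerate(a, start=1):
        current = [i]
        for j, domain_b in enumerate(b, start=1):
            substitution_cost = 0 if domain_a == domain_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + substitution_cost))
        previous = current
    return previous[-1]


def normalized_architecture_distance(human_domains, other_domains):
    """Edit distance divided by the number of domains in the human protein."""
    human = collapse_repeats(human_domains)
    other = collapse_repeats(other_domains)
    if not human:
        raise ValueError("the human protein has no annotated domains")
    return edit_distance(human, other) / len(human)


# Illustrative architectures: the second protein lacks the SH3 domain.
distance = normalized_architecture_distance(
    ["SH3", "SH2", "SH2", "Kinase"], ["SH2", "Kinase"]
)
```

For this pair, the human architecture collapses to three domains and one deletion is needed, giving a normalized distance of 1/3; identical architectures give zero.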
Gene Ontology similarity analysis
Find all the terms to which Pi and Pj are annotated, including the parent terms. These sets of terms in the Gene Ontology tree represent the nodes of the GO graphs induced by Pi and Pj, respectively.
Find the set of terms which the GO graphs induced by Pi and Pj have in common. Denote this set Sij.
Define the depth of each term in Sij to be the length of the shortest path between the term and the root node of the ontology (here length refers to number of connecting edges).
Find the maximum depth of terms in the set Sij. We refer to this value as Dij.
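The four steps above can be sketched in Python on a toy ontology. This is an illustration under stated assumptions, not the analysis code used for the paper: the ontology is given as a hypothetical dictionary mapping each term to its parent terms, and the term names are made up.

```python
from collections import deque


def induced_terms(annotations, parents):
    """All annotated terms plus every ancestor reachable through parents."""
    seen = set(annotations)
    stack = list(annotations)
    while stack:
        for parent in parents.get(stack.pop(), []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def depths_from_root(root, parents):
    """Shortest path length (number of edges) from the root to each term."""
    children = {}
    for child, term_parents in parents.items():
        for parent in term_parents:
            children.setdefault(parent, []).append(child)
    depth = {root: 0}
    queue = deque([root])
    while queue:  # breadth-first search yields shortest paths
        term = queue.popleft()
        for child in children.get(term, []):
            if child not in depth:
                depth[child] = depth[term] + 1
                queue.append(child)
    return depth


def max_shared_depth(terms_i, terms_j, parents, root):
    """Dij: maximum depth among the terms shared by the two induced graphs."""
    shared = induced_terms(terms_i, parents) & induced_terms(terms_j, parents)
    depth = depths_from_root(root, parents)
    return max((depth[t] for t in shared if t in depth), default=None)


# Toy ontology: term -> list of parent terms (made-up names).
toy_parents = {"A": ["root"], "B": ["root"], "C": ["A"],
               "D": ["A", "B"], "E": ["C"]}

d_ij = max_shared_depth(["E"], ["D"], toy_parents, "root")
```

In the toy ontology, the graphs induced by E and D share only A and the root, so Dij is 1; a protein pair annotated to E and C would share C, A and the root, giving a value of 2.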
This work was supported by Telethon, AIRC (Italian Association for Cancer Research) and the EU FP6 'Interaction Proteome' project. We wish to thank Maria Vittoria Schneider and Luisa Montecchi Palazzi for stimulating discussion.
- Chien CT, Bartel PL, Sternglanz R, Fields S: The two-hybrid system: a method to identify and clone genes for proteins that interact with a protein of interest. Proc Natl Acad Sci U S A 1991, 88: 9578–82. 10.1073/pnas.88.21.9578
- Rigaut G, Shevchenko A, Rutz B, Wilm M, Mann M, Seraphin B: A generic protein purification method for protein complex characterization and proteome exploration. Nat Biotechnol 1999, 17: 1030–2. 10.1038/13732
- Rain JC, Selig L, De Reuse H, Battaglia V, Reverdy C, Simon S, Lenzen G, Petel F, Wojcik J, Schachter V, et al.: The protein-protein interaction map of Helicobacter pylori. Nature 2001, 409: 211–5. 10.1038/35051615
- Butland G, Peregrin-Alvarez JM, Li J, Yang W, Yang X, Canadien V, Starostine A, Richards D, Beattie B, Krogan N, et al.: Interaction network containing conserved and essential protein complexes in Escherichia coli. Nature 2005, 433: 531–7. 10.1038/nature03239
- Ho Y, Gruhler A, Heilbut A, Bader GD, Moore L, Adams SL, Millar A, Taylor P, Bennett K, Boutilier K, et al.: Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 2002, 415: 180–3. 10.1038/415180a
- Gavin AC, Bosche M, Krause R, Grandi P, Marzioch M, Bauer A, Schultz J, Rick JM, Michon AM, Cruciat CM, et al.: Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature 2002, 415: 141–7. 10.1038/415141a
- Ito T, Chiba T, Ozawa R, Yoshida M, Hattori M, Sakaki Y: A comprehensive two-hybrid analysis to explore the yeast protein interactome. Proc Natl Acad Sci U S A 2001, 98: 4569–74. 10.1073/pnas.061034498
- Uetz P, Giot L, Cagney G, Mansfield TA, Judson RS, Knight JR, Lockshon D, Narayan V, Srinivasan M, Pochart P, et al.: A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 2000, 403: 623–7. 10.1038/35001009
- Li S, Armstrong CM, Bertin N, Ge H, Milstein S, Boxem M, Vidalain PO, Han JD, Chesneau A, Hao T, et al.: A map of the interactome network of the metazoan C. elegans. Science 2004, 303: 540–3. 10.1126/science.1091403
- Giot L, Bader JS, Brouwer C, Chaudhuri A, Kuang B, Li Y, Hao YL, Ooi CE, Godwin B, Vitols E, et al.: A protein interaction map of Drosophila melanogaster. Science 2003, 302: 1727–36. 10.1126/science.1090289
- Formstecher E, Aresta S, Collura V, Hamburger A, Meil A, Trehin A, Reverdy C, Betin V, Maire S, Brun C, et al.: Protein interaction mapping: a Drosophila case study. Genome Res 2005, 15: 376–84. 10.1101/gr.2659105
- von Mering C, Krause R, Snel B, Cornell M, Oliver SG, Fields S, Bork P: Comparative assessment of large-scale data sets of protein-protein interactions. Nature 2002, 417: 399–403. 10.1038/nature750
- Lehner B, Fraser AG: A first-draft human protein-interaction map. Genome Biol 2004, 5: R63. 10.1186/gb-2004-5-9-r63
- Remm M, Storm CE, Sonnhammer EL: Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. J Mol Biol 2001, 314: 1041–52. 10.1006/jmbi.2000.5197
- Brown KR, Jurisica I: Online Predicted Human Interaction Database. Bioinformatics 2005, 21: 2076–2082. 10.1093/bioinformatics/bti273
- Zanzoni A, Montecchi-Palazzi L, Quondam M, Ausiello G, Helmer-Citterich M, Cesareni G: MINT: a Molecular INTeraction database. FEBS Lett 2002, 513: 135–40. 10.1016/S0014-5793(01)03293-8
- Hegyi H, Gerstein M: Annotation transfer for genomics: measuring functional divergence in multi-domain proteins. Genome Res 2001, 11: 1632–40. 10.1101/gr.183801
- Vogel C, Bashton M, Kerrison ND, Chothia C, Teichmann SA: Structure, function and evolution of multidomain proteins. Curr Opin Struct Biol 2004, 14: 208–16. 10.1016/j.sbi.2004.03.011
- Hermjakob H, Montecchi-Palazzi L, Bader G, Wojcik J, Salwinski L, Ceol A, Moore S, Orchard S, Sarkans U, von Mering C, et al.: The HUPO PSI's molecular interaction format – a community standard for the representation of protein interaction data. Nat Biotechnol 2004, 22: 177–83. 10.1038/nbt926
- Breitkreutz BJ, Stark C, Tyers M: Osprey: a network visualization system. Genome Biol 2003, 4: R22. 10.1186/gb-2003-4-3-r22
- Bader GD, Betel D, Hogue CW: BIND: the Biomolecular Interaction Network Database. Nucleic Acids Res 2003, 31: 248–50. 10.1093/nar/gkg056
- Hermjakob H, Montecchi-Palazzi L, Lewington C, Mudali S, Kerrien S, Orchard S, Vingron M, Roechert B, Roepstorff P, Valencia A, et al.: IntAct: an open source molecular interaction database. Nucleic Acids Res 2004, 32(Database): D452–5. 10.1093/nar/gkh052
- Peri S, Navarro JD, Amanchy R, Kristiansen TZ, Jonnalagadda CK, Surendranath V, Niranjan V, Muthusamy B, Gandhi TK, Gronborg M, et al.: Development of human protein reference database as an initial platform for approaching systems biology in humans. Genome Res 2003, 13: 2363–71. 10.1101/gr.1680803
- Xenarios I, Salwinski L, Duan XJ, Higney P, Kim SM, Eisenberg D: DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Res 2002, 30: 303–5. 10.1093/nar/30.1.303
- Joshi-Tope G, Gillespie M, Vastrik I, D'Eustachio P, Schmidt E, de Bono B, Jassal B, Gopinath GR, Wu GR, Matthews L, et al.: Reactome: a knowledgebase of biological pathways. Nucleic Acids Res 2005, 33: D428–32. 10.1093/nar/gki072
- Lee I, Date SV, Adai AT, Marcotte EM: A probabilistic functional network of yeast genes. Science 2004, 306: 1555–8. 10.1126/science.1099511
- Bader GD, Hogue CW: Analyzing yeast protein-protein interaction data obtained from different sources. Nat Biotechnol 2002, 20: 991–7. 10.1038/nbt1002-991
- Hoffmann R, Valencia A: A gene network for navigating the literature. Nat Genet 2004, 36: 664. 10.1038/ng0704-664
- Harris MA, Clark J, Ireland A, Lomax J, Ashburner M, Foulger R, Eilbeck K, Lewis S, Marshall B, Mungall C, et al.: The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res 2004, 32: D258–61. 10.1093/nar/gkh066
- Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, et al.: Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 2004, 5: R80. 10.1186/gb-2004-5-10-r80
- Barabasi AL, Oltvai ZN: Network biology: understanding the cell's functional organization. Nat Rev Genet 2004, 5: 101–13. 10.1038/nrg1272
- Bowers PM, Pellegrini M, Thompson MJ, Fierro J, Yeates TO, Eisenberg D: Prolinks: a database of protein functional linkages derived from coevolution. Genome Biol 2004, 5: R35. 10.1186/gb-2004-5-5-r35
- Huang TW, Tien AC, Huang WS, Lee YC, Peng CL, Tseng HH, Kao CY, Huang CY: POINT: a database for the prediction of protein-protein interactions based on the orthologous interactome. Bioinformatics 2004, 20: 3273–6. 10.1093/bioinformatics/bth366
- Mellor JC, Yanai I, Clodfelter KH, Mintseris J, DeLisi C: Predictome: a database of putative functional links between proteins. Nucleic Acids Res 2002, 30: 306–9. 10.1093/nar/30.1.306
- von Mering C, Huynen M, Jaeggi D, Schmidt S, Bork P, Snel B: STRING: a database of predicted functional associations between proteins. Nucleic Acids Res 2003, 31: 258–61. 10.1093/nar/gkg034
- Cesareni G, Ceol A, Gavrila C, Palazzi LM, Persico M, Schneider MV: Comparative interactomics. FEBS Lett 2005, 579: 1828–33. 10.1016/j.febslet.2005.01.064
- von Mering C, Jensen LJ, Snel B, Hooper SD, Krupp M, Foglierini M, Jouffre N, Huynen MA, Bork P: STRING: known and predicted protein-protein associations, integrated and transferred across organisms. Nucleic Acids Res 2005, 33: D433–7. 10.1093/nar/gki005
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ: Basic local alignment search tool. J Mol Biol 1990, 215: 403–10. 10.1006/jmbi.1990.9999
- Gusfield D: Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology. Cambridge: Cambridge University Press; 1997.
- Letunic I, Copley RR, Schmidt S, Ciccarelli FD, Doerks T, Schultz J, Ponting CP, Bork P: SMART 4.0: towards genomic data integration. Nucleic Acids Res 2004, 32: D142–4. 10.1093/nar/gkh088
- Bateman A, Coin L, Durbin R, Finn RD, Hollich V, Griffiths-Jones S, Khanna A, Marshall M, Moxon S, Sonnhammer EL, et al.: The Pfam protein families database. Nucleic Acids Res 2004, 32: D138–41. 10.1093/nar/gkh121
- Bairoch A: The ENZYME database in 2000. Nucleic Acids Res 2000, 28: 304–5. 10.1093/nar/28.1.304
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.