Triangle network motifs predict complexes by complementing high-error interactomes with structural information
© Andreopoulos et al; licensee BioMed Central Ltd. 2009
Received: 14 January 2009
Accepted: 27 June 2009
Published: 27 June 2009
Many high-throughput studies produce protein-protein interaction networks (PPINs) with many errors and missing information. Even for genome-wide approaches, there is often a low overlap between PPINs produced by different studies. Second-level neighbors separated by two protein-protein interactions (PPIs) were previously used for predicting protein function and finding complexes in high-error PPINs. We retrieve second-level neighbors in PPINs, and complement these with structural domain-domain interactions (SDDIs) representing binding evidence on proteins, forming PPI-SDDI-PPI triangles.
We find low overlap between PPINs, SDDIs and known complexes, all well below 10%. We evaluate the overlap of PPI-SDDI-PPI triangles with known complexes from the Munich Information Center for Protein Sequences (MIPS). PPI-SDDI-PPI triangles have ~20 times higher overlap with MIPS complexes than second-level neighbors in PPINs without SDDIs. The biological interpretation for triangles is that an SDDI causes two proteins to be observed with common interaction partners in high-throughput experiments. The relatively few SDDIs overlapping with PPINs are part of highly connected SDDI components, and are more likely to be detected in experimental studies. We demonstrate the utility of PPI-SDDI-PPI triangles by reconstructing myosin-actin processes in the nucleus, cytoplasm, and cytoskeleton, which were not obvious in the original PPIN. Using other complementary datatypes in place of SDDIs to form triangles, such as PubMed co-occurrences or threading information, results in a similar ability to find protein complexes.
Given high-error PPINs with missing information, triangles of mixed datatypes are a promising direction for finding protein complexes. Integrating PPINs with SDDIs improves finding complexes. Structural SDDIs partially explain the high functional similarity of second-level neighbors in PPINs. We estimate that relatively little structural information would be sufficient for finding complexes involving most of the proteins and interactions in a typical PPIN.
Protein-protein interaction networks (PPINs) derived from high-throughput studies are known to have many errors [1, 2]. Data from different studies usually exhibit low overlap; for instance, two large-scale human interactome screens [3, 4] share only six interactions, while each has several thousand interactions [5–7]. In some PPINs, more than 50% of reported interactions are estimated to be false positives (FPs) or wrong interactions [8, 9]. Moreover, current PPINs are incomplete with an estimated false negative (missing interactions) rate approaching 90% [10–12]. False positives often result when the matrix model, which fully connects the prey and bait proteins, is used for interpreting results of affinity purification followed by mass spectrometry experiments.
Not all interactions occur at the same place and time in all cellular states. This implies that representing a PPIN as a set of binary protein-protein interactions (PPIs) is often incomplete. Instead, one wants to reconstruct protein complexes in PPINs, which are modular units of physical interactions occurring at the same time and cellular component [15, 16]. For predicting complexes one wants to include complementary data, such as structural domain-domain interactions (SDDIs) representing binding evidence on proteins [17–22]. At the same time, one wants to leave out of predicted complexes the false positives [22–26].
It was proposed that triangle network motifs represent the basic building blocks of PPINs [27–32]. In this paper, we complement PPIs with SDDIs to form PPI-SDDI-PPI triangle network motifs. Triangle network motifs integrate high-throughput PPINs with complementary knowledge, such as structural data, to account for missing edges [25, 33–38]. Our proposed paradigm of PPI-SDDI-PPI triangle network motifs integrates:
The purpose of PPI-SDDI-PPI triangles is to support revealing biological insights, such as finding complexes of physical interactions occurring at the same time and location [50–55]. Besides complementing PPINs with SDDIs, we additionally form triangle network motifs with other complementary datatypes (CD), such as threading results, and PubMed protein co-occurrence data, thus expanding to other PPI-CD-PPI triangles [56–59]. The complex prediction with other CD is comparable to SDDIs; this supports that the improved complex prediction results are due to a physical relation between proteins and not just coincidence [40, 60, 61].
A rationale for triangles and themes is the observation that proteins with common interaction partners are likely to have common functions [62–65]. Second-level neighbors in PPINs are functionally similar, and are useful for functional prediction [66–70]. By this "Guilt by Association of Common Interaction Partners" approach, themes can be tied to specific biological phenomena and processes [71–73]. For instance, it was shown for the E. coli and C. elegans transcriptional networks that subgraphs matching two types of transcriptional regulatory circuit (feed-forward and bi-fan) overlap with one another and form large clusters [28, 74–76]. Another rationale for triangles and themes is that PPINs are "small-world", implying neighborhood clustering, where neighbors of a given node tend to interact with one another; this results in triangle network motifs of three-node interconnection patterns [77, 78]. This led to the "transitive module" hypothesis that is used for predicting missing interactions, as shown in Figure 1a, where proteins with many common interaction partners are likely to interact with one another, forming triangles.
Extracting triangle network motifs and themes from high-throughput interaction networks
This paper is organized as follows. Next, we present related work on finding errors in PPINs via motifs of interconnection patterns. Then, we present the results on prediction of true positive complexes using triangles. We illustrate this with an example of myosin-actin related activities. Next, we explain the biological basis for triangles: a model for SDDIs that explains the functional similarity of second-level neighbors in PPINs. Finally, we conclude the paper with an outlook of using other data sources to complement interactomes.
Several papers aim to find errors in PPINs by completing them for missing edges or finding false positives [79–83]. Our approach differs from all of these approaches, since we integrate structural information with PPINs derived from high-throughput studies to find triangle network motifs and themes, which can be used to predict complexes. Moreover, we offer the biological basis for the ability of this structural-PPI hybrid method to predict complexes.
A first category of work involves collecting ensembles of data, such as structural or literature information. Alber et al. (2007)  collect diverse high-quality data, and analyse the ensemble to produce a detailed architectural map of the nuclear pore complex. This work translates the data into spatial restraints, instead of using network motifs as in our approach. Ramirez et al. (2007)  assessed the quality and value of publicly available human protein network data, by comparing predicted datasets, high-throughput results from yeast two-hybrid screens, and literature-curated protein-protein interactions. This analysis revealed major differences between datasets. Rhodes et al. (2005)  demonstrate that a probabilistic analysis integrating model organism protein interactome data, structural domain data, genome-wide gene expression data and functional annotation data predicts nearly 40,000 interactions in humans. Bader et al. (2004)  perform an integrated analysis of proteomics data with data from genetics and gene expression. Combining temporal gene expression clustering with proteomics network topology provides an automated method for extracting biological subnetworks. Huang et al. (2004)  present POINT, the "prediction of interactome database". POINT integrates several publicly accessible databases, with emphasis placed on mouse, fruit fly, worm and yeast protein-protein interactions datasets from the Database of Interacting Proteins (DIP), followed by converting them into a predicted human interactome. POINT also incorporates correlated mRNA expression clusters obtained from cell cycle microarray databases and subcellular localization from Gene Ontology to pinpoint the likelihood of biological relevance of each predicted set of interacting proteins. Patil et al. (2005)  find that a combination of sequence, structure and annotation information is a good predictor of true interactions in large and noisy interactomes.
Another large body of work attempted to predict the missing interactions or assign confidences to large noisy interactomes. Some of these use network topology and others use information on SDDIs, while others use Bayesian networks or probabilistic measures. Yu et al. (2006)  describe predicting missing PPIs, using only the PPIN topology as observed by a high-throughput experiment. The method searches the interactome for defective cliques, nearly complete complexes of pairwise interacting proteins, and predicts the interactions that complete them. Chen et al. (2008)  propose using triplets of observed PPIs to predict and validate interactions. Yeast is the only data set large enough to warrant application of this method. Singhal et al. (2007)  present DomainGA, a computational approach that uses information about SDDIs to predict PPIs. This method achieves good prediction for the positive and negative PPIs in yeast. Pitre et al. (2006)  present PIPE, a system for predicting PPIs for any target pair of the yeast proteins from their primary structure. Chen et al. (2005)  introduce a novel measure called IRAP, "interaction reliability by alternative path", for assessing the reliability of PPIs based on the underlying PPIN topology. IRAP measure is effective for discovering reliable PPIs in large noisy PPIN datasets. Ng et al. (2003)  propose an integrative approach that applies SDDIs to predict and validate PPIs. Chen et al. (2005)  introduce a SDDI-based random forest of decision trees to infer PPIs. This method is capable of exploring all possible SDDIs and making predictions based on all the protein domains. Wu et al. (2006)  propose using the similarity between two Gene Ontology (GO) terms for reconstructing and predicting a yeast PPIN based solely on knowledge of functional associations between the GO annotations.
We have also experimented with using GO similarities in our approach. Chinnasamy et al. (2006)  present a probabilistic-based naive Bayesian network to predict PPIs using protein sequence information. This framework provides a confidence level for every predicted PPI. Jansen et al. (2003)  also developed an approach using Bayesian networks to predict PPIs in yeast. Han et al. (2004)  propose PreSPI, a domain combination based PPI prediction approach. PPIs are interpreted as the result of groups of multiple SDDIs. This approach also provides an interacting probability for PPIs. Recently, Vidal and colleagues  used reference sets to calculate the probability that a newly identified PPI is a true biophysical interaction, and assigned confidence scores to all PPIs in interactome networks. Yu et al. (2009)  assign confidence scores that reflect the reliability of each PPI, by using multiple independent sets of training positives to reduce the bias inherent in using a single training set.
Another body of work has performed large-scale analysis of networks, statistical network motif analysis or error estimation, which is of interest for our work as well. Jin et al. (2007) use network motifs to address the open question about 'party hubs' and 'date hubs' which was raised by previous studies. At the level of network motifs instead of individual proteins, they found two types of hubs, motif party hubs and motif date hubs, whose network motifs display distinct characteristics on biological functions. Zhang et al. (2005) observed that different types of networks exhibit different triangle profiles, providing a means for network classification. They extended the network triangle concept to an integrated network of many interaction types. Mathivanan et al. (2006) analyzed the major publicly available databases that contain literature-curated PPI information for human proteins, finding a large difference in their content. This included the BIND, DIP, HPRD, IntAct, MINT, MIPS, PDZBase and Reactome databases. Chiang et al. (2007) assess the error statistics in all published large-scale datasets for S. cerevisiae. Vidal and colleagues [99, 100] used an empirically-based approach to assess the quality and coverage of existing human interactomes. They found that high-throughput human interactomes are more precise than literature-curated PPIs from publications.
Several papers used clustering or graph theoretic methods to predict complexes in PPINs. Bader et al. (2003) detected complexes as highly connected subgraphs. Andreopoulos et al. (2007) detected complexes as groups of proteins with similar interaction partners. Cakmak et al. (2007) go beyond complexes to discover unknown pathways in organisms, using Gene Ontology (GO)-based functionalities of enzymes involved in metabolic pathways.
Results and discussion
In our experiments, we employ three high-throughput PPINs, derived by affinity purification followed by mass spectrometry (AP/MS). Krogan06 is based on the dataset of Krogan et al. (2006). Gavin06MATRIX and Gavin06SPOKE are matrix and spoke model interpretations, respectively, of the dataset of Gavin et al. (2006). The matrix model of interpreting pull-down studies connects all prey proteins that were pulled out with a bait, while the spoke model connects only the preys with the bait. We focus on yeast PPINs, since yeast is a well-annotated organism with Gene Ontology terms. The Krogan06 and Gavin06SPOKE yeast PPINs have low overlap. To evaluate the success of our approach, we employ known complexes from the MIPS database [105, 106]. We evaluate whether known MIPS complexes could be predicted using triangles and theme motifs, consisting of PPINs combined with complementary data such as SDDIs. For illustrative purposes, we use three manually curated networks of myosin-actin involvement in different cellular processes [see Additional files 1, 2, 3, and 4].
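The difference between the two interpretations can be made concrete with a minimal sketch. It assumes a purification is recorded as a bait plus a list of preys, and that the matrix model also connects the bait to every prey (as the introduction describes); the function names and the toy protein identifiers are hypothetical and not taken from the original study.

```python
from itertools import combinations


def spoke_edges(purifications):
    """Spoke model: connect each prey only to the bait that pulled it down."""
    edges = set()
    for bait, preys in purifications:
        for prey in preys:
            if prey != bait:
                edges.add(frozenset((bait, prey)))
    return edges


def matrix_edges(purifications):
    """Matrix model: fully connect the bait and all preys of a purification."""
    edges = set()
    for bait, preys in purifications:
        members = sorted(set(preys) | {bait})
        edges.update(frozenset(pair) for pair in combinations(members, 2))
    return edges


# Toy purification (hypothetical identifiers): one bait and three preys.
purifications = [("BAIT1", ["PREY1", "PREY2", "PREY3"])]
spoke = spoke_edges(purifications)
matrix = matrix_edges(purifications)
print(len(spoke), len(matrix))  # 3 6
```

For a purification with n preys the spoke model yields n edges while the matrix model yields n(n+1)/2, which is consistent with Gavin06MATRIX being much larger than Gavin06SPOKE.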
Low overlaps of PPINs with complexes
Overlap of high-throughput PPI networks (Gavin06MATRIX and Krogan06) with the MIPS network (without triangles). Columns: edge overlap with MIPS (a); edges in network but not in MIPS (b). [Table values not recoverable.]
PPI-SDDI-PPI triangles predict complexes
Given the many false negatives (missing interactions) and false positives (wrong interactions) in protein-protein interaction networks (PPINs) derived from high-throughput experiments, we evaluated the success of triangle network motifs and themes in finding known MIPS complexes. With structural domain-domain interactions (SDDIs) representing binding evidence on proteins, PPI-SDDI-PPI triangle network motifs are likely to reflect true complexes. To evaluate this, we examined the overlap of triangles from Gavin06 and Krogan06 with MIPS complexes. For the common proteins we evaluated the interactions that are true positives (overlap) or false positives (no overlap) with MIPS.
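The overlap evaluation described above can be sketched as follows. This is a minimal illustration that assumes "common proteins" means proteins present in both the predicted network and the reference (MIPS) network, and that a predicted pair counts as a true positive when the same pair is an edge in the reference; the function name and the toy edges are hypothetical.

```python
def overlap_with_reference(predicted_edges, reference_edges):
    """Return (true_positives, evaluated) for predicted pairs whose two
    proteins both occur in the predicted and the reference network."""
    ref = {frozenset(e) for e in reference_edges}
    pred = {frozenset(e) for e in predicted_edges}
    ref_proteins = set().union(*ref) if ref else set()
    pred_proteins = set().union(*pred) if pred else set()
    common = ref_proteins & pred_proteins
    # Only pairs between common proteins can be judged against the reference.
    evaluated = {e for e in pred if e <= common}
    true_positives = evaluated & ref
    return len(true_positives), len(evaluated)


# Toy data: X-Y is ignored because neither protein is in the reference.
predicted = [("A", "B"), ("A", "C"), ("C", "D"), ("X", "Y")]
reference = [("A", "B"), ("B", "C"), ("C", "D")]
tp, total = overlap_with_reference(predicted, reference)
print(tp, total)  # 2 3
```

The ratio tp/total corresponds to the kind of "count/total = percentage" entries reported for the overlap with MIPS.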
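Success of triangle network motifs and themes in predicting known MIPS complexes. Each entry is a count ratio and the resulting percentage overlap with MIPS, with one column per PPIN. [Column headers and the labels of rows 1 and 2 not recoverable.]
Row 1: 936/166241 = 0.6%; 516/10791 = 4.8%; 914/33124 = 2.8%
Row 2: 254/2832 = 9.0%; 143/521 = 27.4%; 254/1182 = 21.5%
Row 3, literature co-occurrence (c): 710/5592 = 12.7%; 416/1340 = 31%; 502/1876 = 26.8%
Row 4, domain co-occurrence (d): 2004/21876 = 9.2%; 892/4268 = 20.9%; 1250/4776 = 26.2%
Row 5, union of all above: 2477/26468 = 9.4%; 1446/6129 = 23.6%; 1647/6489 = 25.4%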
PPIN triangle success in MIPS complex prediction. Panels: CD = structural SDDI, protein-SCOP domain assignments > confidence threshold; CD = threading, protein-SCOP domain assignments > confidence threshold.
Triangles with other complementary data
We added to PPINs other complementary datatypes, besides structural SDDIs, to form triangles: PubMed literature co-occurrences of protein mentions, and Interpro Pfam domain co-occurrences in PPIs  (see methods section). Table 2 rows 3–4 show the MIPS complex overlaps with triangle network motifs using other complementary datatypes to form triangles. The triangles with other complementary datatypes exhibit little difference in their overlap with MIPS complexes. In the last row 5 where all datatypes are combined, the overlap with MIPS increases. Triangles that include SDDIs or other complementary data to match second-level neighbors have higher overlap with MIPS complexes than second-level neighbors without any complementary data. These results point to the direction of complementing the PPINs with other datatypes as triangle network motifs, rather than simple edges, for improved prediction of MIPS complexes.
Individual ability of various datatypes to predict MIPS complexes. Columns: number of nodes (a); number of edges (b); node overlap with MIPS (c); MIPS nodes not in network (d); network nodes not in MIPS (e); nodes in edge overlap (f); edges in edge overlap (g); MIPS nodes not in edge overlap (h); MIPS edges not in network (i); network edges not in MIPS (j). Rows include literature co-occurrence (l), domain co-occurrence (m), and all above combined. [Table values not recoverable.]
Example: reconstructing distinct myosin-actin biopathways via themes of PPI-SDDI-PPI triangle network motifs
MYO3 is one of two type I myosins, which utilize the cytoskeleton for movement, moving along microfilaments through interaction with actin. Deletion of MYO3 causes severe defects in growth and actin cytoskeleton organization. Besides myosin, SHE4 is also important for the organization of the actin cytoskeleton. SHE4 is of special interest because it is involved in organization of the actin cytoskeleton, asymmetric mRNA localization, and endocytosis. SHE4 has Gene Ontology annotations similar to those of myosin.
Next, we explore whether triangle network motifs and themes in Gavin06MATRIX can help reconstruct distinct myosin-actin pathways for cellular localization of biomolecules.
Cytoskeletal actin organization
Figure 4b illustrates the relevant triangle network motifs. Yeast cells organize their actin cytoskeleton in a highly polarized manner during vegetative growth. Myosin type I is known to play an important role in moving membranes against actin and membrane-actin interactions. Organization of the actin cytoskeleton requires SHE4. SHE4 is a protein containing a domain that binds to myosin motor domains to regulate myosin function.
RSR1, BNI1 and GEA1 play a role in cytoskeletal actin localization [113, 114]. The correct localization of RSR1 has been shown to be critical for actin cytoskeleton organization. Localization of the Ras-like GTPase RSR1 and its regulators is required for selection of a specific growth site. Regulators direct the correct localization of RSR1 in various organisms. In Figure 4b, while RSR1 interacts with both MYO3 and GEA1, it also interacts with parts of their intersecting neighborhoods. Both GO term similarity and the literature suggest MYO3/GEA1 control of RSR1. The GEA1 RAS superfamily G proteins (small GTPase) has observed SDDIs with both ARF2 and RSR1. GEA1 is a guanine nucleotide exchange factor for ADP ribosylation factors (ARFs), involved in vesicular transport between the Golgi and ER, Golgi organization, and actin cytoskeleton organization; it is similar to but not functionally redundant with GEA2. An active Sec7 region in GEA1, which is the probable catalytic domain for GEF activity, is important for actin cytoskeleton activity. The mechanism by which GEA1 and GEA2 stimulate actin cable formation in a BNI1-dependent manner remains to be determined [116, 117].
What is of special interest in this example is the intersection of the neighborhoods of RSR1, ARF2 and BNI1, comprising EF1A-RL3, which were previously observed to have a functional significance for F-actin localization. In addition, BNI1 and GEA1 appear to be connected to the ARF2 complex via the PYR1 intermediary. Thus, RSR1, GEA1 and BNI1 appear to be linked to one another via EF1A-RL3-PYR1, which are also common partners of ARF2. This suggests a role of EF1A-RL3-PYR1 as the regulators for the RSR1-GEA1-BNI1 complex localization in yeast cytoskeletal actin localization.
Overexpression of GEA1 or GEA2 was observed to bypass the requirement for profilin in actin cable formation. Profilin is an actin-binding protein involved in cytoskeleton dynamics. Profilin enhances actin growth as follows: Profilin binds to monomeric actin on the plus end of the filament inducing a shape change of the actin subunit, allowing the G-actin to replace the ADP to which it is bound by ATP and form F-actin. The F-actin then forms a heterodimer which can bind to the plus end of an actin filament. In the process of binding to the actin monomers it also stereochemically inhibits addition to the minus end. On the other hand, in a separate study it was observed that loss of the activity to bind EF1A-RL3 displayed an abnormal phenotype represented by dissociated localizations of F-actin, which were co-localized in wild-type cells. This observation links the two studies, suggesting that the significance of EF1A-RL3 for F-actin localization may help explain why overexpression of GEA1 or GEA2 bypassed the requirement for profilin in actin cable formation.
Nuclear actin and myosin I required for RNA polymerase I, II, III transcription
Figure 4c illustrates the relevant triangle network motifs. The presence of actin and nuclear myosin type I (NMI) in the nucleus suggests a role for these motor proteins in nuclear functions. Both actin and NMI are associated with ribosomal RNA genes (rDNA) and are required for RNA polymerase I, II, III (Pol I, II, III) transcription [121–124]. Actin and NMI are present in nucleoli as a complex physically associated with RNA polymerase I. This association appears to have a functional relevance in rDNA transcription. Altogether, an actin-myosin complex is present on actively transcribing ribosomal genes, which suggests a direct involvement of actin-myosin in regulating transcription.
TBA1/RAP1 play a role in nuclear transcription from the RNA polymerase II promoter. TBA1/RAP1 is a DNA-binding protein involved in either activation or repression of transcription, depending on binding site context; it also binds telomere sequences and plays a role in telomeric position effect (silencing) and telomere structure. In Figure 4c, RAP1 is associated with MYO3/SHE4, which transport RAP1 and actin in the nucleus and the cytoplasm. While RAP1 has PPIs to RSR1, BNI1 and ARF2, literature confirms this is an indirect relationship and instead that myosin type I translocates RAP1 in both the nucleus and cytoplasm (precisely the myosin type I GO annotation) [126, 127]. The indirect interaction of RAP1 with RSR1, BNI1 and ARF2 points to the involvement of actin in transcription.
mRNA localization: The SHE protein complex is required for cytoplasmic transport of mRNAs in yeast
Figure 4d illustrates the relevant triangle network motifs. A key feature of eukaryotic cells is their organization into distinct compartments, each with a distinct set of proteins. It has been shown that the sorting of many cytoplasmic proteins involves mRNA localization. Cytoplasmic localization starts in the nucleus where a first set of RNA-binding factors recognize localized mRNAs [124, 128]. RNA-protein complexes that are exported to the cytoplasm associate with additional factors, such as molecular motor proteins. Such motors are required to transport their cargo along cytoskeletal filaments to the target site where the mRNA is unloaded and anchored. The SHE protein complex facilitates cytoplasmic localization of ASH1 and other localized mRNAs.
ARF2, EF1A and IMDH3 play a role in mRNA localization for translation. ARF2 is an ADP-ribosylation factor involved in regulation of coated vesicle formation in intracellular trafficking within the Golgi. In Figure 4d, ARF2 is likely to interact with subsets of the main cluster; in particular, we notice an association of ARF2 with both EF1A and IMDH3:
EF1A: Translation elongation factors are responsible for two main processes during protein synthesis on the ribosome. EF1A (or EF-Tu) is responsible for the selection and binding of the cognate aminoacyl-tRNA to the A-site (acceptor site) of the ribosome. EF2 (or EF-G) is responsible for the translocation of the peptidyl-tRNA from the A-site to the P-site (peptidyl-tRNA site) of the ribosome, thereby freeing the A-site for the next aminoacyl-tRNA to bind. Elongation factors are responsible for achieving accuracy of translation and both EF1A and EF2 are remarkably conserved throughout evolution (InterPro annotation).
IMDH3: Involved in the amino acid biosynthesis pathway.
Biological interpretation of PPI-SDDI-PPI triangles: A structural basis for functional similarity of second-level neighbors in PPINs
Gene Ontology (GO) similarity in triangle PPI edges
Gavin06MATRIX PPI-SDDI-PPI triangles and Krogan06 PPI-SDDI-PPI triangles: Gene Ontology (GO) similarities and correlations. Each entry lists two values, one per network.
GO Functional similarity
Average over protein pairs in SDDI edges: 0.18 and 0.18
Average over protein pairs in PPI edges: 0.35 and 0.36
PPI-PPI similarity correlation coefficient: 0.16 and 0.18
SDDI-PPI similarity correlation coefficient: 0.58 and 0.59
GO Process/Location similarity
Average over protein pairs in SDDI edges: 0.28 and 0.27
Average over protein pairs in PPI edges: 0.67 and 0.67
PPI-PPI similarity correlation coefficient: 0.08 and 0.09
SDDI-PPI similarity correlation coefficient: 0.46 and 0.45
Why are few SDDIs detected in high-throughput PPINs experiments?
Few SDDIs overlap with PPINs derived from high-throughput experiments and MIPS complexes. Categories: SDDIs total with both proteins in MIPS and PPIN; SDDIs supported by PPIs in both MIPS and PPIN; SDDIs supported by PPIs in MIPS but not PPIN; SDDIs supported by PPIs in PPIN but not MIPS; SDDIs supported by PPIs in neither PPIN nor MIPS. [Table values not recoverable.]
How many SDDIs are needed to predict all complexes for an entire PPIN?
SDDIs and the PubMed co-occurrences relate to two different aspects. SDDIs are based on experimental results that are likely to imply a structural interaction. In the case of SDDIs, we can use all information found by mapping structural domains to proteins using BLAST sequence similarity and still get good prediction accuracy. On the other hand, for literature we have to apply a strict filtering, keeping only the top 1% of protein co-occurrences appearing in PubMed as complementary data. We observed that the literature co-occurrences appear to give slightly better results than using SDDIs as complementary data. The main limitation of SDDIs at present is the sparsity of known structural interactions. Since PubMed is expected to grow faster than structural knowledge, using literature co-occurrences might give even better prediction accuracy in the future, as long as a strict cut-off is set.
With the amount of PPINs from high-throughput experiments, structural data and literature-based interactions on the rise, we studied their combined ability to predict known complexes. We found a low overlap of PPINs derived from high-throughput studies with known complexes, as well as low overlap with structural domain-domain interactions.
We proposed PPI-SDDI-PPI triangle network motifs as a model for analysing PPINs and predicting complexes. PPI-SDDI-PPI triangles have higher overlap with MIPS complexes than random second-level neighbors, indicating that structural SDDIs are useful for complementing PPINs in triangles to create a more complete picture of protein cellular involvement. We complemented PPINs with several other datatypes besides SDDIs to create triangle and theme motifs, resulting in similar overlaps with complexes. Themes of PPI-SDDI-PPI triangles helped us to reconstruct complexes in myosin-actin processes that were not detected by PPINs. Our approach is useful for finding true positives in PPINs, as structural knowledge on proteins increases in the future.
SDDIs partially explain the high functional similarity of second-level neighbors in PPINs. An SDDI may cause a structurally connected pair of proteins to be observed with common interaction partners in high-throughput affinity purification experiments followed by mass spectrometry (AP/MS) that use bait-prey technologies. We examined why some SDDIs are detected in PPINs, and we found that SDDIs detected by PPINs are part of highly connected components/complexes, and are therefore more likely to be detected by experimental studies.
In this section we give an overview of the methods used in this study. Figure 2 illustrates the overall workflow of the process.
PPI-CD-PPI triangle network motifs
PPI-CD-PPI triangles contain three proteins connected by two PPIs and an edge of a complementary datatype (CD), such as a structural SDDI; in this case, we refer to PPI-SDDI-PPI triangles, as Figure 1a shows. Our method can be viewed as finding bicliques in a PPIN, and then connecting second level neighbors via complementary datatype edges. For extracting second level neighbors in large networks we used the HIERDENC algorithm, described in [62, 135]. Figure 1b and 1c show that PPI-CD-PPI triangles imply that an experiment detected PPIs A ↔ B and B ↔ C, while a CD edge A ↔ C exists, such as a structural SDDI. In a PPIN second level neighbors (a pair of PPIs) may be involved across cellular space and time in different processes and locations. Connecting second level neighbors to each other via CD edges gives confidence that the second level neighbors interact at the same cellular space and time [136–138]. Triangles likely represent a protein complex [139, 140].
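The triangle definition above can be illustrated with a short sketch. Note that this is a brute-force enumeration over each protein's neighbors, swapped in for illustration; it is not the HIERDENC algorithm used in the study. The function names and the toy proteins A to D are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_adjacency(edges):
    """Undirected adjacency map from an iterable of (u, v) pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return adj


def ppi_cd_ppi_triangles(ppi_edges, cd_edges):
    """Return triangles (a, b, c) with PPIs a-b and b-c and a CD edge a-c."""
    adj = build_adjacency(ppi_edges)
    cd = {frozenset(e) for e in cd_edges if e[0] != e[1]}
    triangles = set()
    for b, neighbours in adj.items():
        # Every pair of neighbours of b is a pair of second-level neighbours.
        for a, c in combinations(sorted(neighbours), 2):
            if frozenset((a, c)) in cd:
                triangles.add((a, b, c))
    return triangles


# Toy data: B is the shared partner of A and C; A-C is the CD (e.g. SDDI) edge.
ppis = [("A", "B"), ("B", "C"), ("B", "D")]
sddis = [("A", "C")]
triangles = ppi_cd_ppi_triangles(ppis, sddis)
print(triangles)  # {('A', 'B', 'C')}
```

Enumerating neighbor pairs is quadratic in node degree, so for hub-rich networks such as a matrix-model PPIN it may be cheaper to iterate over the CD edges and intersect the two neighborhoods instead.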
Let σ_SDDI denote the number of PPI-SDDI-PPI triangles a structural SDDI is involved in. A structural SDDI may be involved in σ ≥ 1 triangles, which we refer to as a theme. A theme is given by the σ common interaction partners (intersecting neighborhoods) of an SDDI's protein pair, and some PPIs in a theme may be false positives.
As structural information to complement PPINs, we used the SCOPPI database, which contains SDDIs observed in known protein complex structures. To assign domains, we BLASTed the sequences of all proteins in the "Saccharomyces Genome Database" (which includes yeast PPINs) against all domain sequences of SCOPPI. We considered only BLAST hits with an E-value ≤ 0.01 and a sequence identity percentage s ≥ 30%. In addition, we required 75% of the domain to appear in the protein.
Other complementary datatype (CD) edges we used came from the Genomic Threading Database (GTD). GTD contains yeast protein assignments to SCOP domain structural annotations and interacting structures. An assigned Confidence value gives an indication of the strength of a hit, ranging from "certain" to "guess", which is based on a P-value measure of significance.
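A theme and its size σ can be computed directly from the definition as the common PPI partners of the two proteins joined by an SDDI. The sketch below is a minimal illustration with hypothetical names and toy data, not the study's implementation.

```python
from collections import defaultdict


def themes_by_sddi(ppi_edges, sddi_edges):
    """Map each SDDI pair to its theme: the common PPI partners of the pair."""
    adj = defaultdict(set)
    for u, v in ppi_edges:
        adj[u].add(v)
        adj[v].add(u)
    themes = {}
    for a, c in sddi_edges:
        common = (adj[a] & adj[c]) - {a, c}
        if common:  # keep only SDDIs that close at least one triangle
            themes[(a, c)] = common
    return themes


# Toy data: A and C share partners B and D; E and F share none.
ppis = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "C"), ("E", "F")]
sddis = [("A", "C"), ("E", "F")]
themes = themes_by_sddi(ppis, sddis)
sigma = {pair: len(partners) for pair, partners in themes.items()}
print(sigma)  # {('A', 'C'): 2}
```

Here the SDDI A-C takes part in two triangles (via B and via D), so its theme has σ = 2, while the SDDI E-F closes no triangle and is omitted.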
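The three thresholds can be expressed as a simple filter over BLAST hits. This sketch assumes that "75% of the domain" means the aligned region covers at least 75% of the domain sequence length; the `BlastHit` record, its field names and the example identifiers are hypothetical and do not correspond to a particular BLAST output format.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    protein: str             # query: yeast protein identifier
    domain: str              # subject: domain sequence identifier
    evalue: float            # BLAST E-value of the hit
    identity_pct: float      # percent sequence identity of the alignment
    aligned_domain_len: int  # number of domain residues covered by the alignment
    domain_len: int          # full length of the domain sequence


def passes_filter(hit, max_evalue=0.01, min_identity=30.0, min_coverage=0.75):
    """Apply the E-value, identity and domain-coverage thresholds from the text."""
    coverage = hit.aligned_domain_len / hit.domain_len
    return (
        hit.evalue <= max_evalue
        and hit.identity_pct >= min_identity
        and coverage >= min_coverage
    )


hits = [
    BlastHit("P1", "dom_a", 1e-20, 45.0, 90, 100),  # passes all three thresholds
    BlastHit("P2", "dom_a", 1e-20, 45.0, 50, 100),  # only 50% of the domain covered
    BlastHit("P3", "dom_b", 0.5, 80.0, 100, 100),   # E-value too high
    BlastHit("P4", "dom_b", 1e-5, 25.0, 100, 100),  # identity below 30%
]
assigned = [(h.protein, h.domain) for h in hits if passes_filter(h)]
print(assigned)  # [('P1', 'dom_a')]
```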
The next CD dataset we used was PubMed literature co-occurrences of protein mentions. To extract these, we used the GoPubMed protein mention extraction algorithm to assign proteins to all PubMed documents. Then, we used a version of the Blosum co-occurrence score to test whether two proteins p1 and p2 co-occur frequently in PubMed documents. A cutoff of 10 was strict enough to filter out the majority of protein co-occurrences in PubMed, resulting in a network of 170,638 edges. The last CD dataset we used was InterPro Pfam domain co-occurrences in PPIs. For this, we took all IntAct yeast PPIs and assigned to the proteins Pfam domains from InterPro. Then, we used the Blosum co-occurrence score to find which Pfam domains co-occur frequently in the IntAct yeast PPIs. Based on the most frequently co-occurring Pfam domains, we built a network over the yeast PPIs.
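The exact scoring formula is not given in this text. As an illustration only, a BLOSUM-style score is commonly a log-odds ratio of the observed co-occurrence frequency to the frequency expected if the two items occurred independently. The sketch below assumes that form with a base-2 logarithm and no scaling factor, so its values are not necessarily on the same scale as the cutoff of 10 quoted above; the function name and counts are hypothetical.

```python
import math


def log_odds_cooccurrence(n_both, n_p1, n_p2, n_docs):
    """log2 of observed co-occurrence frequency over the frequency expected
    under independence. ASSUMED form; the source formula is not available."""
    if min(n_both, n_p1, n_p2, n_docs) <= 0:
        return float("-inf")  # no evidence of co-occurrence
    observed = n_both / n_docs
    expected = (n_p1 / n_docs) * (n_p2 / n_docs)
    return math.log2(observed / expected)


# Toy counts: p1 in 16 documents, p2 in 32, both together in 8, out of 1024.
score = log_odds_cooccurrence(n_both=8, n_p1=16, n_p2=32, n_docs=1024)
print(score)  # 4.0
```

A positive score means the pair co-occurs more often than chance would predict, zero means exactly as often, and pairs above a chosen cutoff would be kept as CD edges.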
High-throughput PPINs and known complexes
We used two yeast PPINs, which we denote as Gavin06 and Krogan06. We interpreted Gavin06 with both the matrix and the spoke model, and refer to the results as Gavin06MATRIX and Gavin06SPOKE throughout the text. Gavin06MATRIX had 93,881 edges, while Gavin06SPOKE had 22,452 edges. Krogan06 had 14,292 edges, consisting of the binary interactions as provided by the publication. For validation, we used MIPS complexes [105, 106]. For MIPS we used the spoke model to interpret complexes, since otherwise the result would be biased towards a high overlap with the PPINs [see Additional files 5, 6]. The MIPS complexes had 2,099 edges.
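The difference between the two interpretations can be made concrete with a small sketch. It assumes the standard definitions: the spoke model links the bait to each prey, whereas the matrix model links every pair of proteins in the purification. The function names and the toy purification are hypothetical.

```python
from itertools import combinations

def spoke_edges(bait, preys):
    """Spoke model: one edge from the bait to each prey."""
    return {frozenset((bait, p)) for p in preys if p != bait}

def matrix_edges(bait, preys):
    """Matrix model: one edge between every pair of co-purified proteins."""
    members = sorted(set(preys) | {bait})
    return {frozenset(pair) for pair in combinations(members, 2)}

# Toy purification (hypothetical): bait B pulls down three preys.
spoke = spoke_edges("B", ["P1", "P2", "P3"])
matrix = matrix_edges("B", ["P1", "P2", "P3"])
```

For a purification with n proteins the spoke model yields n − 1 edges and the matrix model n(n − 1)/2, which is consistent with Gavin06MATRIX being several times larger than Gavin06SPOKE.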
Moreover, for our illustrations we manually curated three network examples from the literature, representing myosin-actin involvement in cytoskeleton organisation, nucleus transcription, and mRNA translocation. Developing these networks involved reading papers from the biomedical literature and recording any interaction(s) described in the articles.
Gene Ontology similarity
A PPI may be a false positive rather than a physical interaction, and a GO similarity of zero hints at this. For calculating the similarity based on Gene Ontology terms, we searched for GO terms in the current abstract and compared them to the set of GO terms assigned to each gene candidate. For each potential tuple taken from the two sets (text and gene annotation), we calculated a distance between the terms in the ontology tree. These distances yielded a similarity measure for two terms, even if they did not belong to the same sub-branch or were not immediate parents or children of each other. The distance took into account the shortest path via the lowest common ancestor, as well as the depth of this lowest common ancestor in the overall hierarchy (comparable to Schlicker et al., 2006). The distances for the closest terms from each set then defined a similarity between the gene and the text.
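The precise distance formula is not given here. As a stand-in, the sketch below uses the Wu-Palmer similarity, 2·depth(LCA) / (depth(a) + depth(b)), which likewise depends on the path through the lowest common ancestor and on that ancestor's depth. It assumes a single-parent tree with the root at depth 0, so two terms that share only the root score zero; the real GO is a directed acyclic graph. The toy term names are hypothetical and are not real GO identifiers.

```python
def ancestors(term, parent):
    """Return the path from term up to the root, term first."""
    path = [term]
    while term in parent:
        term = parent[term]
        path.append(term)
    return path

def term_similarity(a, b, parent):
    """Wu-Palmer similarity of two terms in a single-parent tree."""
    path_a, path_b = ancestors(a, parent), ancestors(b, parent)
    depth_a, depth_b = len(path_a) - 1, len(path_b) - 1
    if depth_a + depth_b == 0:
        return 1.0  # both terms are the root
    on_b = set(path_b)
    lca = next(t for t in path_a if t in on_b)  # lowest common ancestor
    depth_lca = len(ancestors(lca, parent)) - 1
    return 2.0 * depth_lca / (depth_a + depth_b)

def set_similarity(terms_a, terms_b, parent):
    """Similarity of two term sets, defined by their closest pair of terms."""
    return max((term_similarity(a, b, parent)
                for a in terms_a for b in terms_b), default=0.0)

# Toy ontology (hypothetical): child -> parent.
parent = {
    "binding": "root",
    "catalysis": "root",
    "actin binding": "binding",
    "myosin binding": "binding",
}
```

With this toy tree, "actin binding" and "myosin binding" score 0.5, while "actin binding" and "catalysis" score 0.0, the kind of value that would flag a PPI as a possible false positive.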
HIERDENC supplementary material
We implemented the HIERDENC online database, which contains all of the datasets we used. HIERDENC helps a user to visualize and find true positives in PPINs via triangles of high-throughput PPINs and complementary data. It is available online (http://www.hierdenc.com/ or http://projects.biotec.tu-dresden.de/HIERDENC/).
This work was funded by the EU Sealife project. Joerg Hakenberg helped with Gene Ontology similarity. Rainer Winnenburg helped with discussions.
- Chiang T, Scholtens D, Sarkar D, Gentleman R, Huber W: Coverage and error models of protein-protein interaction data by directed graph analysis. Genome Biol 2007, 8(9):R186.
- Yip KY, Gerstein M: Training set expansion: an approach to improving the reconstruction of biological networks from limited and uneven reliable interactions. Bioinformatics 2009, 25(2):243–50.
- Stelzl U, Worm U, Lalowski M, Haenig C, Brembeck F, Goehler H, Stroedicke M, Zenkner M, Schoenherr A, Koeppen S, Timm J, Mintzlaff S, Abraham C, Bock N, Kietzmann S, Goedde A, Toksöz E, Droege A, Krobitsch S, Korn B, Birchmeier W, Lehrach H, Wanker E: A human protein-protein interaction network: a resource for annotating the proteome. Cell 2005, 122(6):957–68.
- Rual JF, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, Berriz GF, Gibbons FD, Dreze M, Ayivi-Guedehoussou N, Klitgord N, Simon C, Boxem M, Milstein S, Rosenberg J, Goldberg DS, Zhang LV, Wong SL, Franklin G, Li S, Albala JS, Lim J, Fraughton C, Llamosas E, Cevik S, Bex C, Lamesch P, Sikorski RS, Vandenhaute J, Zoghbi HY, Smolyar A, Bosak S, Sequerra R, Doucette-Stamm L, Cusick ME, Hill DE, Roth FP, Vidal M: Towards a proteome-scale map of the human protein-protein interaction network. Nature 2005, 437(7062):1173–1178.
- Hart GT, Ramani AK, Marcotte EM: How complete are current yeast and human protein-interaction networks? Genome Biol 2006, 7(11):120.
- Hoffmann R, Valencia A: Protein interaction: same network, different hubs. Trends Genet 2003, 19(12):681–3.
- Deane CM, Salwinski L, Xenarios I, Eisenberg D: Protein interactions: two methods for assessment of the reliability of high throughput observations. Mol Cell Proteomics 2002, 1(5):349–56.
- Aloy P: Shaping the future of interactome networks. Genome Biol 2007, 8(10):316.
- Scott MS, Barton GJ: Probabilistic prediction and ranking of human protein-protein interactions. BMC Bioinformatics 2007, 8: 239.
- Stumpf MPH, Thorne T, de Silva E, Stewart R, An HJ, Lappe M, Wiuf C: Estimating the size of the human interactome. Proc Natl Acad Sci USA 2008, 105(19):6959–64.
- D'haeseleer P, Church GM: Estimating and improving protein interaction error rates. Proc IEEE Comput Syst Bioinform Conf 2004, 216–23.
- Sprinzak E, Sattath S, Margalit H: How reliable are experimental protein-protein interaction data? J Mol Biol 2003, 327(5):919–923.
- Bader GD, Hogue CWV: Analyzing yeast protein-protein interaction data obtained from different sources. Nat Biotechnol 2002, 20(10):991–997.
- Zhang Y, Xuan J, de los Reyes BG, Clarke R, Ressom HW: Network motif-based identification of transcription factor-target gene relationships by integrating multi-source biological data. BMC Bioinformatics 2008, 9: 203.
- Sprinzak E, Altuvia Y, Margalit H: Characterization and prediction of protein-protein interactions within and between complexes. Proc Natl Acad Sci USA 2006, 103(40):14718–23.
- Edwards A, Kus B, Jansen R, Greenbaum D, Greenblatt J, Gerstein M: Bridging structural biology and genomics: assessing protein interaction data with known complexes. Trends Genet 2002, 18(10):529–36.
- Singh R, Xu J, Berger B: Struct2net: integrating structure into protein-protein interaction prediction. Pac Symp Biocomput 2006, 403–14.
- Kim W, Park J, Suh J: Large scale statistical prediction of protein-protein interaction by potentially interacting domain (PID) pair. Genome Inform 2002, 13: 42–50.
- Bader J, Chaudhuri A, Rothberg J, Chant J: Gaining confidence in high-throughput protein interaction networks. Nat Biotechnol 2004, 22: 78–85.
- Tong A, Drees B, Nardelli G, Bader G, Brannetti B, Castagnoli L, Evangelista M, Ferracuti S, Nelson B, Paoluzi S, Quondam M, Zucconi A, Hogue C, Fields S, Boone C, Cesareni G: A combined experimental and computational strategy to define protein interaction networks for peptide recognition modules. Science 2002, 295(5553):321–4.
- Espadaler J, Romero-Isart O, Jackson R, Oliva B: Prediction of protein-protein interactions using distant conservation of sequence patterns and structure relationships. Bioinformatics 2005, 21(16):3360–8.
- Ramirez F, Schlicker A, Assenov Y, Lengauer T, Albrecht M: Computational analysis of human protein interaction networks. Proteomics 2007, 7(15):2541–2552.
- Singhal M, Resat H: A domain-based approach to predict protein-protein interactions. BMC Bioinformatics 2007, 8: 199.
- Chen X, Liu M: Prediction of protein-protein interactions using random decision forest framework. Bioinformatics 2005, 21(24):4394–400.
- Wojcik J, Schachter V: Protein-protein interaction map inference using interacting domain profile pairs. Bioinformatics 2001, 17(Suppl 1):S296–305.
- Patil A, Nakamura H: Filtering high-throughput protein-protein interaction data using a combination of genomic features. BMC Bioinformatics 2005, 6: 100.
- Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U: Network motifs: simple building blocks of complex networks. Science 2002, 298(5594):824–7.
- Zhang L, King O, Wong S, Goldberg D, Tong A, Lesage G, Andrews B, Bussey H, Boone C, Roth F: Motifs, themes and thematic maps of an integrated Saccharomyces cerevisiae interaction network. J Biol 2005, 4(2):6.
- Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U: Network motifs: simple building blocks of complex networks. Science 2002, 298(5594):824–827.
- Kashtan N, Alon U: Spontaneous evolution of modularity and network motifs. Proc Natl Acad Sci USA 2005, 102(39):13773–13778.
- Albert I, Albert R: Conserved network motifs allow protein-protein interaction prediction. Bioinformatics 2004, 20(18):3346–3352.
- Jin G, Zhang S, Zhang X, Chen L: Hubs with network motifs organize modularity dynamically in the protein-protein interaction network of yeast. PLoS ONE 2007, 2(11):e1207.
- Kim WK, Park J, Suh JK: Large scale statistical prediction of protein-protein interaction by potentially interacting domain (PID) pair. Genome Inform 2002, 13: 42–50.
- Ng SK, Zhang Z, Tan SH: Integrative approach for computationally inferring protein domain interactions. Bioinformatics 2003, 19(8):923–929.
- Aloy P, Russell RB: InterPreTS: protein interaction prediction through tertiary structure. Bioinformatics 2003, 19: 161–162.
- Riley R, Lee C, Sabatti C, Eisenberg D: Inferring protein domain interactions from databases of interacting proteins. Genome Biol 2005, 6(10):R89.
- Guimaraes KS, Jothi R, Zotenko E, Przytycka TM: Predicting domain-domain interactions using a parsimony approach. Genome Biol 2006, 7(11):R104.
- Deng M, Mehta S, Sun F, Chen T: Inferring domain-domain interactions from protein-protein interactions. Genome Res 2002, 12(10):1540–1548.
- Jothi R, Cherukuri PF, Tasneem A, Przytycka TM: Co-evolutionary analysis of domains in interacting proteins reveals insights into domain-domain interactions mediating protein-protein interactions. J Mol Biol 2006, 362(4):861–875.
- Xia K, Fu Z, Hou L, Han JDJ: Impacts of protein-protein interaction domains on organism and network complexity. Genome Res 2008, 18(9):1500–8.
- Cohen-Gihon I, Nussinov R, Sharan R: Comprehensive analysis of co-occurring domain sets in yeast proteins. BMC Genomics 2007, 8: 161.
- Nye TMW, Berzuini C, Gilks WR, Babu MM, Teichmann SA: Statistical analysis of domains in interacting protein pairs. Bioinformatics 2005, 21(7):993–1001.
- Nye TMW, Berzuini C, Gilks WR, Babu MM, Teichmann S: Predicting the strongest domain-domain contact in interacting protein pairs. Stat Appl Genet Mol Biol 2006, 5.
- Liu Y, Liu N, Zhao H: Inferring protein-protein interactions through high-throughput interaction data from diverse organisms. Bioinformatics 2005, 21(15):3279–3285.
- Itzhaki Z, Akiva E, Altuvia Y, Margalit H: Evolutionary conservation of domain-domain interactions. Genome Biol 2006, 7(12):R125.
- Jothi R, Cherukuri PF, Tasneem A, Przytycka TM: Co-evolutionary analysis of domains in interacting proteins reveals insights into domain-domain interactions mediating protein-protein interactions. J Mol Biol 2006, 362(4):861–75.
- Iqbal M, Freitas AA, Johnson CG, Vergassola M: Message-passing algorithms for the prediction of protein domain interactions from protein-protein interaction data. Bioinformatics 2008, 24(18):2064–70.
- Wang RS, Wang Y, Wu LY, Zhang XS, Chen L: Analysis on multi-domain cooperation for predicting protein-protein interactions. BMC Bioinformatics 2007, 8: 391.
- Wuchty S: Topology and weights in a protein domain interaction network-a novel way to predict protein interactions. BMC Genomics 2006, 7: 122.
- Luo F, Yang Y, Chen C, Chang R, Zhou J, Scheuermann R: Modular organization of protein interaction networks. Bioinformatics 2007, 23(2):207–14.
- Gagneur J, Krause R, Bouwmeester T, Casari G: Modular decomposition of protein-protein interaction networks. Genome Biol 2004, 5(8):R57.
- Pawson T: Organization of cell-regulatory systems through modular-protein-interaction domains. Philos Transact A Math Phys Eng Sci 2003, 361(1807):1251–62.
- Poyatos J, Hurst L: How biologically relevant are interaction-based modules in protein networks? Genome Biol 2004, 5(11):R93.
- Rives A, Galitski T: Modular organization of cellular networks. Proc Natl Acad Sci USA 2003, 100(3):1128–33.
- Lu H, Shi B, Wu G, Zhang Y, Zhu X, Zhang Z, Liu C, Zhao Y, Wu T, Wang J, Chen R: Integrated analysis of multiple data sources reveals modular structure of biological networks. Biochem Biophys Res Commun 2006, 345: 302–9.
- Qi Y, Klein-Seetharaman J, Bar-Joseph Z: A mixture of feature experts approach for protein-protein interaction prediction. BMC Bioinformatics 2007, 8(Suppl 10):S6.
- Qi Y, Bar-Joseph Z, Klein-Seetharaman J: Evaluation of different biological data and computational classification methods for use in protein interaction prediction. Proteins 2006, 63(3):490–500.
- Beyer A, Workman C, Hollunder J, Radke D, Mueller U, Wilhelm T, Ideker T: Integrated assessment and prediction of transcription factor binding. PLoS Comput Biol 2006, 2(6):e70.
- Lin N, Wu B, Jansen R, Gerstein M, Zhao H: Information assessment on predicting protein-protein interactions. BMC Bioinformatics 2004, 5: 154.
- Aloy P, Russell R: Structural systems biology: modelling protein interactions. Nat Rev Mol Cell Biol 2006, 7(3):188–197.
- Schlicker A, Huthmacher C, Ramirez F, Lengauer T, Albrecht M: Functional evaluation of domain-domain interactions and human protein interaction networks. Bioinformatics 2007, 23(7):859–865.
- Andreopoulos B, An A, Wang X, Faloutsos M, Schroeder M: Clustering by common friends finds locally significant proteins mediating modules. Bioinformatics 2007, 23(9):1124–31.
- Li H, Li J, Wong L: Discovering motif pairs at interaction sites from protein sequences on a proteome-wide scale. Bioinformatics 2006, 22(8):989–996.
- Okada K, Kanaya S, Asai K: Accurate extraction of functional associations between proteins based on common interaction partners and common domains. Bioinformatics 2005, 21(9):2043–8.
- Goh, Bogan, Joachimiak, Walther, Cohen: Co-evolution of proteins with their interaction partners. J Mol Biol 2000, 299(2):283–93.
- Chua HN, Ning K, Sung WK, Leong HW, Wong L: Using indirect protein-protein interactions for protein complex prediction. J Bioinform Comput Biol 2008, 6(3):435–66.
- Chua HN, Sung WK, Wong L: Using indirect protein interactions for the prediction of Gene Ontology functions. BMC Bioinformatics 2007, 8(Suppl 4):S8.
- Yu H, Paccanaro A, Trifonov V, Gerstein M: Predicting interactions in protein networks by completing defective cliques. Bioinformatics 2006, 22(7):823–829.
- Morrison JL, Breitling R, Higham DJ, Gilbert DR: A lock-and-key model for protein-protein interactions. Bioinformatics 2006, 22(16):2012–9.
- Zhang S, Ning X, Zhang X: Identification of functional modules in a PPI network by clique percolation clustering. Comput Biol Chem 2006, 30(6):445–51.
- Chua HN, Sung WK, Wong L: Exploiting indirect neighbours and topological weight to predict protein function from protein-protein interactions. Bioinformatics 2006, 22(13):1623–30.
- Vázquez A, Dobrin R, Sergi D, Eckmann JP, Oltvai ZN, Barabási AL: The topological relationship between the large-scale attributes and local interaction patterns of complex networks. Proc Natl Acad Sci USA 2004, 101(52):17940–17945.
- Lo SL, Cai CZ, Chen YZ, Chung MCM: Effect of training datasets on support vector machine prediction of protein-protein interactions. Proteomics 2005, 5(4):876–84.
- Resendis-Antonio O, Freyre-Gonzalez JA, Menchaca-Mandez R, Gutierrez-Rios RM, Martinez-Antonio A, Avila-Sanchez C, Collado-Vides J: Modular analysis of the transcriptional regulatory network of E. coli. Trends Genet 2005, 21: 16–20.
- Shen-Orr S, Milo R, Mangan S, Alon U: Network motifs in the transcriptional regulation network of Escherichia coli. Nat Genet 2002, 31: 64–8.
- Wuchty S, Oltvai ZN, Barabási AL: Evolutionary conservation of motif constituents in the yeast protein interaction network. Nat Genet 2003, 35(2):176–179.
- Clauset A, Moore C, Newman MEJ: Hierarchical structure and the prediction of missing links in networks. Nature 2008, 453(7191):98–101.
- Watts DJ, Strogatz SH: Collective dynamics of 'small-world' networks. Nature 1998, 393(6684):440–442.
- Yu J, Fotouhi F: Computational approaches for predicting protein-protein interactions: a survey. J Med Syst 2006, 30: 39–44.
- Valencia A, Pazos F: Computational methods for the prediction of protein interactions. Curr Opin Struct Biol 2002, 12(3):368–73.
- von Mering C, Jensen LJ, Kuhn M, Chaffron S, Doerks T, Kruger B, Snel B, Bork P: STRING 7-recent developments in the integration and prediction of protein interactions. Nucleic Acids Res 2007, 35(Database issue):D358–62.
- Ben-Hur A, Noble WS: Kernel methods for predicting protein-protein interactions. Bioinformatics 2005, 21(Suppl 1):i38–46.
- Guo Y, Yu L, Wen Z, Li M: Using support vector machine combined with auto covariance to predict protein-protein interactions from protein sequences. Nucleic Acids Res 2008, 36(9):3025–30.
- Alber F, Dokudovskaya S, Veenhoff L, Zhang W, Kipper J, Devos D, Suprapto A, Karni-Schmidt O, Williams R, Chait B, Rout M, Sali A: Determining the architectures of macromolecular assemblies. Nature 2007, 450(7170):683–94.
- Rhodes D, Tomlins S, Varambally S, Mahavisno V, Barrette T, Kalyana-Sundaram S, Ghosh D, Pandey A, Chinnaiyan A: Probabilistic model of the human protein-protein interaction network. Nat Biotechnol 2005, 23(8):951–9.
- Huang T, Tien A, Huang W, Lee Y, Peng C, Tseng H, Kao C, Huang C: POINT: a database for the prediction of protein-protein interactions based on the orthologous interactome. Bioinformatics 2004, 20(17):3273–6.
- Patil A, Nakamura H: Filtering high-throughput protein-protein interaction data using a combination of genomic features. BMC Bioinformatics 2005, 6: 100.
- Chen P, Deane C, Reinert G: Predicting and Validating Protein Interactions Using Network Structure. PLoS Comput Biol 2008, 4(7).
- Pitre S, Dehne F, Chan A, Cheetham J, Duong A, Emili A, Gebbia M, Greenblatt J, Jessulat M, Krogan N, Luo X, Golshani A: PIPE: a protein-protein interaction prediction engine based on the re-occurring short polypeptide sequences between known interacting protein pairs. BMC Bioinformatics 2006, 7: 365.
- Ng S, Zhang Z, Tan S: Integrative approach for computationally inferring protein domain interactions. Bioinformatics 2003, 19(8):923–9.
- Wu X, Zhu L, Guo J, Zhang DY, Lin K: Prediction of yeast protein-protein interaction network: insights from the Gene Ontology and annotations. Nucleic Acids Res 2006, 34(7):2137–50.
- Chinnasamy A, Mittal A, Sung WK: Probabilistic prediction of protein-protein interactions from the protein sequences. Comput Biol Med 2006, 36(10):1143–54.
- Jansen R, Yu H, Greenbaum D, Kluger Y, Krogan N, Chung S, Emili A, Snyder M, Greenblatt J, Gerstein M: A Bayesian networks approach for predicting protein-protein interactions from genomic data. Science 2003, 302(5644):449–53.
- Han DS, Kim HS, Jang WH, Lee SD, Suh JK: PreSPI: a domain combination based prediction system for protein-protein interaction. Nucleic Acids Res 2004, 32(21):6312–20.
- Braun P, Tasan M, Dreze M, Barrios-Rodiles M, Lemmens I, Yu H, Sahalie JM, Murray RR, Roncari L, de Smet AS, Venkatesan K, Rual JF, Vandenhaute J, Cusick ME, Pawson T, Hill DE, Tavernier J, Wrana JL, Roth FP, Vidal M: An experimentally derived confidence score for binary protein-protein interactions. Nat Methods 2009, 6: 91–7.
- Yu J, Finley RLJ: Combining multiple positive training sets to generate confidence scores for protein-protein interactions. Bioinformatics 2009, 25: 105–11.
- Mathivanan S, Periaswamy B, Gandhi T, Kandasamy K, Suresh S, Mohmood R, Ramachandra Y, Pandey A: An evaluation of human protein-protein interaction data in the public domain. BMC Bioinformatics 2006, 7(Suppl 5):S19.
- Galperin MY, Cochrane GR: Nucleic Acids Research annual Database Issue and the NAR online Molecular Biology Database Collection in 2009. Nucleic Acids Res 2009, 37(Database issue):D1–4.
- Venkatesan K, Rual JF, Vazquez A, Stelzl U, Lemmens I, Hirozane-Kishikawa T, Hao T, Zenkner M, Xin X, Goh KI, Yildirim MA, Simonis N, Heinzmann K, Gebreab F, Sahalie JM, Cevik S, Simon C, de Smet AS, Dann E, Smolyar A, Vinayagam A, Yu H, Szeto D, Borick H, Dricot A, Klitgord N, Murray RR, Lin C, Lalowski M, Timm J, Rau K, Boone C, Braun P, Cusick ME, Roth FP, Hill DE, Tavernier J, Wanker EE, Barabasi AL, Vidal M: An empirical framework for binary interactome mapping. Nat Methods 2009, 6: 83–90.
- Cusick ME, Yu H, Smolyar A, Venkatesan K, Carvunis AR, Simonis N, Rual JF, Borick H, Braun P, Dreze M, Vandenhaute J, Galli M, Yazaki J, Hill DE, Ecker JR, Roth FP, Vidal M: Literature-curated protein interaction datasets. Nat Methods 2009, 6: 39–46.
- Bader GD, Hogue CWV: An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics 2003, 4: 2.
- Cakmak A, Ozsoyoglu G: Mining biological networks for unknown pathways. Bioinformatics 2007, 23(20):2775–83.
- Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, Li J, Pu S, Datta N, Tikuisis AP, Punna T, Peregrín-Alvarez JM, Shales M, Zhang X, Davey M, Robinson MD, Paccanaro A, Bray JE, Sheung A, Beattie B, Richards DP, Canadien V, Lalev A, Mena F, Wong P, Starostine A, Canete MM, Vlasblom J, Wu S, Orsi C, Collins SR, Chandran S, Haw R, Rilstone JJ, Gandi K, Thompson NJ, Musso G, Onge PS, Ghanny S, Lam MHY, Butland G, Altaf-Ul AM, Kanaya S, Shilatifard A, O'Shea E, Weissman JS, Ingles CJ, Hughes TR, Parkinson J, Gerstein M, Wodak SJ, Emili A, Greenblatt JF: Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature 2006, 440(7084):637–643.
- Gavin A, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, Rau C, Jensen L, Bastuck S, Dümpelfeld B, Edelmann A, Heurtier M, Hoffman V, Hoefert C, Klein K, Hudak M, Michon A, Schelder M, Schirle M, Remor M, Rudi T, Hooper S, Bauer A, Bouwmeester T, Casari G, Drewes G, Neubauer G, Rick J, Kuster B, Bork P, Russell R, Superti-Furga G: Proteome survey reveals modularity of the yeast cell machinery. Nature 2006, 440(7084):631–6.
- Mewes H, Frishman D, Mayer K, Muensterkoetter M, Noubibou O, Pagel P, Rattei T, Oesterheld M, Ruepp A, Stuempflen V: MIPS: analysis and annotation of proteins from whole genomes in 2005. Nucleic Acids Res 2006, 34(Database issue):D169–72.
- Pagel P, Kovac S, Oesterheld M, Brauner B, Dunger-Kaltenbach I, Frishman G, Montrone C, Mark P, Stuempflen V, Mewes HW, Ruepp A, Frishman D: The MIPS mammalian protein-protein interaction database. Bioinformatics 2005, 21(6):832–4.
- Mulder NJ, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, Bork P, Buillard V, Cerutti L, Copley R, Courcelle E, Das U, Daugherty L, Dibley M, Finn R, Fleischmann W, Gough J, Haft D, Hulo N, Hunter S, Kahn D, Kanapin A, Kejariwal A, Labarga A, Langendijk-Genevaux PS, Lonsdale D, Lopez R, Letunic I, Madera M, Maslen J, McAnulla C, McDowall J, Mistry J, Mitchell A, Nikolskaya AN, Orchard S, Orengo C, Petryszak R, Selengut JD, Sigrist CJA, Thomas PD, Valentin F, Wilson D, Wu CH, Yeats C: New developments in the InterPro database. Nucleic Acids Res 2007, 35(Database issue):D224–D228.
- Galletta B, Chuang D, Cooper J: Distinct Roles for Arp2/3 Regulators in Actin Assembly and Endocytosis. PLoS Biol 2008, 6: e1.
- Kim PM, Lu LJ, Xia Y, Gerstein MB: Relating three-dimensional structures to protein networks provides evolutionary insights. Science 2006, 314(5807):1938–1941.
- de Lanerolle P, Johnson T, Hofmann WA: Actin and myosin I in the nucleus: what next? Nat Struct Mol Biol 2005, 12(9):742–6.
- Evangelista M, Klebl B, Tong A, Webb B, Leeuw T, Leberer E, Whiteway M, Thomas D, Boone C: A role for myosin-I in actin assembly through interactions with Vrp1p, Bee1p, and the Arp2/3 complex. J Cell Biol 2000, 148(2):353–62.
- Toi H, Fujimura-Kamada K, Irie K, Takai Y, Todo S, Tanaka K: She4p/Dim1p interacts with the motor domain of unconventional myosins in the budding yeast, Saccharomyces cerevisiae. Mol Biol Cell 2003, 14(6):2237–49.
- Pruyne D, Evangelista M, Yang C, Bi E, Zigmond S, Bretscher A, Boone C: Role of formins in actin assembly: nucleation and barbed-end association. Science 2002, 297(5581):612–5.
- Evangelista M, Pruyne D, Amberg D, Boone C, Bretscher A: Formins direct Arp2/3-independent actin filament assembly to polarize cell growth in yeast. Nat Cell Biol 2002, 4(3):260–9.
- Park H, Kang P, Rachfal A: Localization of the Rsr1/Bud1 GTPase involved in selection of a proper growth site in yeast. J Biol Chem 2002, 277(30):26721–4.
- Zakrzewska E, Perron M, Laroche A, Pallotta D: A role for GEA1 and GEA2 in the organization of the actin cytoskeleton in Saccharomyces cerevisiae. Genetics 2003, 165(3):985–95.
- Evangelista M, Blundell K, Longtine M, Chow C, Adames N, Pringle J, Peter M, Boone C: Bni1p, a yeast formin linking cdc42p and the actin cytoskeleton during polarized morphogenesis. Science 1997, 276(5309):118–22.
- Yanagihara C, Shinkai M, Kariya K, Yamawaki-Kataoka Y, Hu CD, Masuda T, Kataoka T: Association of elongation factor 1 alpha and ribosomal protein L3 with the proline-rich region of yeast adenylyl cyclase-associated protein CAP. Biochem Biophys Res Commun 1997, 232(2):503–7.
- Nelson WJ: Adaptation of core mechanisms to generate cell polarity. Nature 2003, 422(6933):766–74.
- Lambert AA, Perron MP, Lavoie E, Pallotta D: The Saccharomyces cerevisiae Arf3 protein is involved in actin cable and cortical patch formation. FEMS Yeast Res 2007, 7(6):782–95.
- Bettinger BT, Gilbert DM, Amberg DC: Actin up in the nucleus. Nat Rev Mol Cell Biol 2004, 5(5):410–5.
- Pederson T, Aebi U: Actin in the nucleus: what form and what for? J Struct Biol 2002, 140(1–3):3–9.
- Olave IA, Reck-Peterson SL, Crabtree GR: Nuclear actin and actin-related proteins in chromatin remodeling. Annu Rev Biochem 2002, 71: 755–81.
- Franke WW: Actin's many actions start at the genes. Nat Cell Biol 2004, 6(11):1013–4.
- Hofmann WA, Stojiljkovic L, Fuchsova B, Vargas GM, Mavrommatis E, Philimonenko V, Kysela K, Goodrich JA, Lessard JL, Hope TJ, Hozak P, de Lanerolle P: Actin is part of pre-initiation complexes and is necessary for transcription by RNA polymerase II. Nat Cell Biol 2004, 6(11):1094–101.
- Pina B, Fernandez-Larrea J, Garcia-Reyero N, Idrissi FZ: The different (sur)faces of Rap1p. Mol Genet Genomics 2003, 268(6):791–8.
- Holden JL, Nur-E-Kamal MS, Fabri L, Nice E, Hammacher A, Maruta H: Rsr1 and Rap1 GTPases are activated by the same GTPase-activating protein and require threonine 65 for their activation. J Biol Chem 1991, 266(26):16992–5.
- Lecuyer E, Yoshida H, Parthasarathy N, Alm C, Babak T, Cerovina T, Hughes T, Tomancak P, Krause H: Global analysis of mRNA localization reveals a prominent role in organizing cellular architecture and function. Cell 2007, 131: 174–87.
- Long RM, Singer RH, Meng X, Gonzalez I, Nasmyth K, Jansen RP: Mating type switching in yeast controlled by asymmetric localization of ASH1 mRNA. Science 1997, 277(5324):383–7.
- Stearns T, Kahn RA, Botstein D, Hoyt MA: ADP ribosylation factor is an essential protein in Saccharomyces cerevisiae and is encoded by two genes. Mol Cell Biol 1990, 10(12):6690–9.
- Nilsson J, Nissen P: Elongation factors on the ribosome. Curr Opin Struct Biol 2005, 15(3):349–54.
- Scholtens D, Chiang T, Huber W, Gentleman R: Estimating node degree in bait-prey graphs. Bioinformatics 2008, 24(2):218–24.
- Schlicker A, Domingues FS, Rahnenführer J, Lengauer T: A new measure for functional similarity of gene products based on Gene Ontology. BMC Bioinformatics 2006, 7: 302.
- Winter C, Henschel A, Kim W, Schroeder M: SCOPPI: a structural classification of protein-protein interfaces. Nucleic Acids Res 2006, 34(Database issue):D310–4.
- Andreopoulos B, An A, Wang X: Hierarchical density-based clustering of categorical data and a simplification. In Proceedings of the 11th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2007), Springer LNCS 4426; 11–22. Nanjing, China, May 22-25, 2007.
- Beyer A, Bandyopadhyay S, Ideker T: Integrating physical and genetic maps: from genomes to interaction networks. Nat Rev Genet 2007, 8(9):699–710.
- Cline MS, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, Christmas R, Avila-Campilo I, Creech M, Gross B, Hanspers K, Isserlin R, Kelley R, Killcoyne S, Lotia S, Maere S, Morris J, Ono K, Pavlovic V, Pico AR, Vailaya A, Wang PL, Adler A, Conklin BR, Hood L, Kuiper M, Sander C, Schmulevich I, Schwikowski B, Warner GJ, Ideker T, Bader GD: Integration of biological networks and gene expression data using Cytoscape. Nat Protoc 2007, 2(10):2366–82.
- Emig D, Cline MS, Lengauer T, Albrecht M: Integrating expression data with domain interaction networks. Bioinformatics 2008, 24(21):2546–8.
- Schuster-Bockler B, Bateman A: Reuse of structural domain-domain interactions in protein networks. BMC Bioinformatics 2007, 8: 259.
- Aragues R, Sali A, Bonet J, Marti-Renom MA, Oliva B: Characterization of protein hubs by inferring interacting motifs from protein interactions. PLoS Comput Biol 2007, 3(9):1761–71.
- McGuffin LJ, Street SA, Bryson K, Soerensen SA, Jones DT: The Genomic Threading Database: a comprehensive resource for structural annotations of the genomes from key organisms. Nucleic Acids Res 2004, 32(Database issue):D196–9.
- The GO Consortium: The Gene Ontology (GO) project in 2006. Nucleic Acids Res 2005, 34(Database issue):D322–6.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.