A high-accuracy consensus map of yeast protein complexes reveals modular nature of gene essentiality
© Hart et al; licensee BioMed Central Ltd. 2007
Received: 22 December 2006
Accepted: 02 July 2007
Published: 02 July 2007
Identifying all protein complexes in an organism is a major goal of systems biology. In the past 18 months, the results of two genome-scale tandem affinity purification-mass spectrometry (TAP-MS) assays in yeast have been published, along with corresponding complex maps. For most complexes, the published data sets were surprisingly uncorrelated. It is therefore useful to consider the raw data from each study and generate an accurate complex map from a high-confidence data set that integrates the results of these and earlier assays.
Using an unsupervised probabilistic scoring scheme, we assigned a confidence score to each interaction in the matrix-model interpretation of the large-scale yeast mass-spectrometry data sets. The scoring metric proved more accurate than the filtering schemes used in the original data sets. We then took a high-confidence subset of these interactions and derived a set of complexes using MCL. The complexes show high correlation with existing annotations. Hierarchical organization of some protein complexes is evident from inter-complex interactions.
We demonstrate that our scoring method can generate an integrated high-confidence subset of observed matrix-model interactions, which we subsequently used to derive an accurate map of yeast complexes. Our results indicate that essentiality is a product of the protein complex rather than the individual protein, and that we have achieved near saturation of the yeast high-abundance, rich-media-expressed "complex-ome."
The molecular machines that carry out basic cellular processes are typically not individual proteins but protein complexes. Even in the relatively simple model organism Saccharomyces cerevisiae, most machines that process and store biological information are in fact large protein complexes composed of many subunits.
The path from measuring protein interactions to defining complexes has been well studied. Experimental and computational methods have provided over 50,000 putative yeast protein-protein interactions to date, although a substantial fraction of these may be spurious [1, 2]. An array of analytical methods aimed at generating high-quality complexes from these data has been applied, including both unsupervised [3–5] and trained [6, 7] techniques. Other genomic and proteomic data sets, such as gene expression, knockout phenotype, subcellular localization, genetic interaction profiles, and phylogenetic profiles [5, 6, 8–10], have also been integrated with the raw interaction data in an effort to broaden and deepen our ability to accurately define protein complexes.
Two recent genome-scale tandem affinity purification/mass spectrometry (TAP-MS) experiments, performed by Gavin et al. and Krogan et al., have produced an enormous amount of new data, allowing a more complete analysis of the universe of yeast protein complexes. However, the complex maps published independently by the two groups show a surprising lack of correlation, which can only be partially explained by the different analytical methods applied after generating the raw data [1, 13].
TAP-MS data typically consist of a tagged "bait" protein and the associated "prey" proteins that co-purify with the bait. Interaction data sets are generated from these raw data using either the spoke method, which considers bait-prey interactions, or the matrix method, which includes all prey-prey interactions from a given bait pull-down. As the affinity purification process generally isolates stable complexes, there is no clear-cut way to differentiate between direct physical interactions and indirect interactions mediated by other members of the complex, or, for that matter, other proteins that appear simply as a result of experimental noise. Thus, the spoke model contains both direct physical interactions and a sampling of the indirect interactions within a complex, plus some amount of noise, while the matrix model captures a much larger number of true indirect interactions at the price of decreased accuracy from linking every spurious protein to every "real" one, as well as linking proteins from heterogeneous complexes that each contain the bait. While some efforts have been made to use a filtered subset of matrix-model interactions to improve accuracy [9, 15, 16], analysis of mass spectrometry interaction data has typically been carried out using the spoke model [3, 5].
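The difference between the two interpretations can be made concrete with a small sketch. The function names and the toy purification below are illustrative assumptions, not part of the original analysis; the matrix model follows the description given here (all prey-prey pairs from one bait pull-down, with the bait removed from the prey list).

```python
from itertools import combinations

def spoke_pairs(bait, preys):
    """Spoke model: one pair between the bait and each co-purified prey."""
    return {frozenset((bait, prey)) for prey in preys if prey != bait}

def matrix_pairs(bait, preys):
    """Matrix model: every prey-prey pair observed in the same pull-down."""
    unique_preys = sorted(set(preys) - {bait})  # drop the bait if it is listed as prey
    return {frozenset(pair) for pair in combinations(unique_preys, 2)}

# Toy purification with hypothetical protein labels.
bait, preys = "A", ["B", "C", "D", "E"]
print(len(spoke_pairs(bait, preys)))   # 4 bait-prey pairs
print(len(matrix_pairs(bait, preys)))  # 6 prey-prey pairs
```

With n preys the spoke model yields n pairs while the matrix model yields n(n-1)/2, which is why a single spurious prey adds far more false pairs under the matrix interpretation.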
Here we offer a simple yet robust statistical scoring scheme for assigning confidence to observed interactions. The scheme is based on comparing observed versus expected numbers of interactions in the matrix model of protein-protein interactions, and provides greatly increased recall and/or precision over the standard spoke model interpretation. A further advantage of the system is that it can be used to integrate data sets from different sources. We use the scoring scheme to combine the Gavin et al., Krogan et al., and Ho et al. co-complex data sets and define a high-quality subset composed of 1689 proteins in 390 complexes. We further show that essential proteins strongly cluster together, supporting a complex-centric rather than gene-centric basis for essentiality for a large fraction of essential genes.
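As a rough illustration of an observed-versus-expected score, the sketch below computes a hypergeometric tail probability for a pair of proteins. The exact parameterization is not given in this section, so the choice of counts (total observations, observations involving each protein, and co-observations) is an assumption made for illustration only.

```python
from math import comb

def hypergeom_tail(k, total, n_a, n_b):
    """P(X >= k) for X ~ Hypergeometric(total, n_a, n_b).

    Assumed meaning of the arguments (illustrative, not from the source):
      total: number of observations in the data set
      n_a:   observations that include protein A
      n_b:   observations that include protein B
      k:     observations that include both A and B
    """
    upper = min(n_a, n_b)
    denom = comb(total, n_b)
    numer = sum(comb(n_a, i) * comb(total - n_a, n_b - i)
                for i in range(k, upper + 1))
    return numer / denom

# A pair seen together far more often than chance predicts gets a small p-value.
p = hypergeom_tail(k=5, total=1000, n_a=6, n_b=7)
print(p)
```

A small tail probability means the pair co-occurs more often than expected from the individual frequencies of the two proteins; ranking pairs by such a value gives a relative confidence ordering without any training set.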
We derived a set of complexes at each threshold by using MCL, an implementation of a Markov clustering algorithm. MCL was evaluated in  and was used to derive complexes from the raw data in . To evaluate the accuracy of each set of complexes, we measured the Hubert statistic, H, of the derived complexes versus a reference set of complexes. Briefly, calculating H involves generating a matrix M of protein pairs (i, j) where M(i, j) = 1 if the proteins are in the same complex and 0 otherwise. The correlation between the experimental and reference matrices is then measured, resulting in a score from -1 to 1, with 1 implying identical complex assignments and values near zero indicating random assignment. We measured the Hubert statistic of the complexes derived at each threshold against the set of curated MIPS complexes with ribosomal subunits removed and against a filtered set of Gene Ontology (GO) Cellular Component (CC) annotations (see Methods). The correlations generally improve with increasing stringency (Figure 3B), although the rate of increase in correlation with GO component drops off sharply after the 10^-2 cutoff. This improvement in accuracy comes at the price of decreasing coverage, reflected in the decreasing number of interactions at each threshold as shown in Figure 3A. In an attempt to balance accuracy and coverage, we selected the complexes derived from the E = 10^-2 threshold, hereafter called the E-2 complexes, for further study.
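The description of H above can be sketched as a Pearson correlation between two binary co-membership indicators taken over all protein pairs. This is a minimal illustration under stated assumptions: the function names are hypothetical, the calculation is restricted to proteins present in both sets, and an undefined correlation (zero variance) is reported as 0.0 by convention.

```python
from itertools import combinations
from math import sqrt

def co_complex_pairs(complexes):
    """All unordered protein pairs that share at least one complex."""
    return {frozenset(p) for members in complexes
            for p in combinations(sorted(set(members)), 2)}

def hubert_statistic(derived, reference):
    """Correlation of co-membership indicators over the shared protein set."""
    shared = sorted(set().union(*derived) & set().union(*reference))
    d_pairs, r_pairs = co_complex_pairs(derived), co_complex_pairs(reference)
    pairs = [frozenset(p) for p in combinations(shared, 2)]
    x = [1 if p in d_pairs else 0 for p in pairs]
    y = [1 if p in r_pairs else 0 for p in pairs]
    if not pairs:
        return 0.0
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation undefined; treated as no agreement here
    return cov / sqrt(var_x * var_y)

# Toy complexes with hypothetical protein labels.
same = hubert_statistic([{"A", "B"}, {"C", "D"}], [{"A", "B"}, {"C", "D"}])
crossed = hubert_statistic([{"A", "B"}, {"C", "D"}], [{"A", "C"}, {"B", "D"}])
print(same, crossed)
```

Identical assignments give a value of 1, while the crossed toy example gives a negative value because every pair that is together in one set is apart in the other.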
Features of the E-2 complexes
The large fraction of E-2 complexes that correspond to existing annotations suggests that the data set is highly accurate. Of the 132 complexes with four or more subunits, 69% (91) are highly enriched for one or more specific GO component annotations; of the 44 complexes of size eight or larger, 84% (37) are so annotated. Furthermore, there are virtually no uncharacterized genes in these large complexes, and the few that appear have relatively weak connections to the other members of their respective clusters. This suggests that the yeast community has achieved a fairly complete description of a large fraction of the "complex-ome," at least for complexes containing many proteins. In fact, only one complex of size four or greater consists entirely of unnamed subunits and thus could be considered truly novel (complex C132, composed of proteins YAL049C, YDL025C, YGR016W, and YHR009C).
Several E-2 clusters represent amalgamations of known complexes. The MCL algorithm assigns each protein to exactly one complex, so protein complexes with shared subunits are sometimes found combined into a single cluster in the E-2 complexes. The C1 cluster, for example, includes RNA polymerase I, II, and III, largely because all three enzymes contain the Rpb5, Rpb8, Rpb10, and Rpo26 subunits. Likewise, complex C7 contains the TAFIID complex and the SAGA transcription factor/chromatin remodeling complex; these complexes share the Taf5, 6, 9, 10, and 12 proteins. It seems clear from the RNA polymerase case that the E-2 clusters occasionally contain discrete complexes that presumably do not physically interact.
Even the clusters that lack significant GO terms tend to have subunits that share similar free-text descriptions in the Saccharomyces Genome Database (SGD). For example, complex C44 contains eight proteins, all of which are essential. Of these, seven are explicitly described in SGD as being involved in 60S ribosome biogenesis or as components of 66S pre-ribosomal particles, and the eighth is involved in export of pre-ribosomal large subunits from the nucleus. No GO term enrichment is found because the CC annotation is typically "nucleolus," a weak term excluded from our analysis (see Methods). Likewise, unannotated complexes C20, C30, and C78 contain 13, 10, and 5 proteins, respectively (10, 9, and 5 essential), that are all known or suspected to be involved in ribosome biogenesis. Other unannotated complexes include C43, eight largely nonessential proteins in the well-described cyclin/cyclin-dependent kinase group; C51, seven nonessential proteins involved in catabolite inactivation of FBPase; and C72, six proteins (five essential), of which five are involved in retrograde Golgi-to-ER trafficking and the sixth, Sec39, is of unknown function but "proposed to be involved in protein secretion."
Hierarchical structure of co-complex network
Essentiality of protein complexes
Modular nature of essentiality
The concentration of essential proteins into complexes suggests that essentiality is, in many cases, a product of complex function rather than individual protein function. This phenomenon has been observed by the Barabasi group in an analysis of the Ho et al. and Gavin et al. 2002 data sets. In using the raw data from these assays, the prior study assigns each bait pull-down to a discrete complex and does not correct for sampling the same complex with multiple baits. Thus, for example, purifications derived from TAP-tagged Nsp1, Nup60, Nup82, and Nup116 are all considered to be discrete complexes with a high fraction of essential proteins, while in reality these factors are all constituents of the same nuclear pore complex.
Essential Complexes. Selected essential complexes from the E-2 complex set. Complexes listed are composed of at least 4 subunits, of which >70% are essential. For each complex, the table lists the E-2 complex identifier, the size of the complex, the fraction of essential proteins, the most significant GO cellular component annotation for the complex, and the list of proteins in the complex. Twenty-six percent of all essential genes in yeast are represented in these complexes
% Essential
Most significantly enriched GO CC term
DNA-directed RNA polymerase III complex
DST1, IWR1, RET1, RPA12, RPA135, RPA14, RPA190, RPA34, RPA43, RPA49, RPB10, RPB11, RPB2, RPB3, RPB4, RPB5, RPB7, RPB8, RPB9, RPC11, RPC17, RPC19, RPC25, RPC31, RPC34, RPC37, RPC40, RPC53, RPC82, RPO21, RPO26, RPO31, SPT4, TFG1, TFG2
small nucleolar ribonucleoprotein complex
BMS1, DIP2, ECM16, EMG1, IMP3, MPP10, NAN1, NOC4, NOP14, POL5, PWP2, SOF1, UTP10, UTP13, UTP14, UTP15, UTP18, UTP20, UTP21, UTP30, UTP4, UTP5, UTP6, UTP7, UTP8, UTP9, YGR210C
mRNA cleavage and polyadenylation specificity factor complex
BUD14, CFT1, CFT2, FIP1, GIP3, GLC7, GLC8, MPE1, PAP1, PFS2, PTA1, PTI1, REF2, SDS22, SSU72, SWD2, SYC1, YPI1, YSH1, YTH1
U4/U6 × U5 tri-snRNP complex
AAR2, BRR2, DIB1, LEA1, LSM8, PRP11, PRP21, PRP3, PRP31, PRP38, PRP4, PRP6, PRP8, PRP9, RSE1, SMX2, SNU114, SNU23, SNU66, SPP381
proteasome core complex, alpha-subunit complex (sensu Eukaryota)
FLC2, GRH1, OSM1, PRE1, PRE10, PRE2, PRE3, PRE4, PRE5, PRE6, PRE7, PRE8, PRE9, PUP1, PUP2, PUP3, RED1, SCL1
BRR1, LUC7, MUD1, NAM8, PRP39, PRP40, PRP42, SMB1, SMD1, SMD2, SMD3, SME1, SMX3, SNP1, SNU56, SNU71, STO1, YHC1
(no significant annotation)
BRX1, CIC1, DRS1, ERB1, FPR4, HAS1, MAK5, NOC2, NOC3, PUF6, PWP1, RRP5, YTM1
eukaryotic translation initiation factor 2B complex
CDC123, GCD1, GCD11, GCD2, GCD6, GCD7, GCN3, PET111, SUI2, SUI3, YVH1
(no significant annotation)
EBP2, MRT4, NOG1, NOP15, NOP2, NOP7, NUG1, RLP7, RPF2, TIF6
GLE2, NIC96, NSP1, NUP116, NUP159, NUP49, NUP57, NUP82
ASK1, DAD1, DAD2, DAD3, DAM1, DUO1, SPC19, SPC34
EXO70, EXO84, SEC10, SEC15, SEC3, SEC5, SEC6, SEC8
(no significant annotation)
DBP10, NIP7, NSA1, RIX7, RPF1, RRP1, SPB1, SPB4
Arp2/3 protein complex
ARC15, ARC18, ARC19, ARC35, ARC40, ARP2, ARP3
DNA replication factor C complex
CTF18, ELG1, RFC1, RFC2, RFC3, RFC4, RFC5
transcription factor TFIIH complex
CCL1, KIN28, RAD3, SSL1, TFB1, TFB3, TFB4
signal recognition particle (sensu Eukaryota)
LHP1, SEC65, SRP14, SRP21, SRP54, SRP68, SRP72
nucleolar ribonuclease P complex
POP1, POP3, POP4, POP5, POP7, POP8, RPP1
nuclear origin of replication recognition complex
ORC1, ORC2, ORC3, ORC4, ORC5, ORC6
transcription factor TFIIIC complex
TFC1, TFC3, TFC4, TFC6, TFC7, TFC8
(no significant annotation)
DSL1, SEC22, SEC39, TIP20, UFE1, USE1
CCT2, CCT3, CCT4, CCT5, CCT6, TCP1
(no significant annotation)
IPI1, IPI3, RIX1, RSA4, SDA1
nuclear cohesin complex
CDC5, IRR1, MCD1, SMC1, SMC3
CTF4, PSF1, PSF2, PSF3, SLD5
nuclear condensin complex
BRN1, SMC2, SMC4, YCG1, YCS4
nucleolar preribosome, small subunit precursor
ENP1, HRR25, LTV1, RIO2, TSR1
DSN1, MTW1, NNF1, NSL1
alpha DNA polymerase:primase complex
POL1, POL12, PRI1, PRI2
(no significant annotation)
CIA1, MET18, NAR1, YHR122W
(no significant annotation)
NAB3, NAB6, NRD1, SEN1
mRNA cleavage factor complex
CLP1, PCF11, RNA14, RNA15
transcription factor TFIIE complex
DBP2, PPN1, TFA1, TFA2
outer plaque of spindle pole body
SPC72, SPC97, SPC98, TUB4
NUF2, SPC24, SPC25, TID3
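The selection rule stated in the table caption (at least 4 subunits, of which more than 70% are essential) can be sketched as a simple filter. The function name, complex identifiers, and protein labels below are hypothetical placeholders, not data from the study.

```python
def select_essential_complexes(complexes, essential, min_size=4, min_fraction=0.70):
    """Return {complex_id: essential fraction} for complexes passing both cutoffs."""
    selected = {}
    for complex_id, members in complexes.items():
        members = set(members)
        if len(members) < min_size:
            continue  # too small to be listed
        fraction = len(members & essential) / len(members)
        if fraction > min_fraction:
            selected[complex_id] = fraction
    return selected

# Toy input with placeholder names.
complexes = {
    "Cx1": ["P1", "P2", "P3", "P4"],        # 4 of 4 essential
    "Cx2": ["P5", "P6", "P7", "P8", "P9"],  # 2 of 5 essential
    "Cx3": ["P1", "P2", "P3"],              # below the size cutoff
}
essential = {"P1", "P2", "P3", "P4", "P5", "P6"}
print(select_essential_complexes(complexes, essential))
```

Only the first toy complex passes: the second falls below the essential fraction and the third has too few subunits.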
Comparison to Collins et al
Further comparison shows that the hypergeometric scoring method and the Collins method yielded data sets of nearly equal accuracy. We rank-ordered the two sets of interactions by their respective scores, divided each into bins of 500 interactions, and then plotted the cumulative recall and precision of each versus MIPS co-complex interactions [see Additional File 5]. The Collins data set achieves greater coverage than the PICO network, at somewhat lower overall accuracy, when performance is calculated against the entire MIPS reference. The difference is due almost entirely to the inclusion of ribosomal protein interactions in Collins: when the ribosome is removed from the MIPS reference set, both networks provide nearly identical recall (~34%) and precision (~81%). That the networks generated by the two methods overlap so strongly, despite our inclusion of the Ho dataset and use of a much higher confidence threshold for the Krogan raw data, suggests the networks capture a highly accurate subset of yeast co-complex interactions, and that the simple probabilistic method offered in this study is an effective tool for assigning relative confidence rankings to observations in large-scale data sets.
It is worth noting that even the highest-scoring interactions in the two analyses do not reach 100% precision versus the MIPS reference. This is in part due to the incompleteness of the reference set. An interaction is defined as a false positive if and only if both its corresponding proteins are present in the reference set but the interaction is not. Thus, true interactions that are detected experimentally but absent from the reference set will be scored as false positives (provided the proteins are present in the reference set). We observe several cases of this. For example, the Tub4 gamma tubulin complex is composed of Spc97, Spc98, and Tub4, as defined by GO Cellular Component and MIPS annotation. The E-2 derived complex also includes Spc72, the spindle pole body component that interacts with the Tub4p complex. The MIPS reference does not include Spc72 in the gamma tubulin complex but does include the protein in the "Spindle Pole Body Components" collection of proteins. Thus interactions between Spc72 and other members of the gamma tubulin complex, while almost certainly "true" co-complex interactions, are scored as false positives when calculating precision versus MIPS. All such experimentally detected inter-complex interactions are absent in the MIPS reference set. Thus the incompleteness of the reference set prevents a high-accuracy experimental data set from achieving perfect precision.
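The false-positive definition above can be sketched in a few lines, using the Tub4 example from the text. The function name is an assumption, and the labels X1 and NOVEL1 are hypothetical placeholders standing in for another reference protein and a protein absent from the reference; they are not real annotations.

```python
from itertools import combinations

def precision_recall(predicted_pairs, reference_complexes):
    """Score predicted pairs against all-by-all pairs from reference complexes."""
    ref_pairs = {frozenset(p) for members in reference_complexes
                 for p in combinations(sorted(set(members)), 2)}
    ref_proteins = set().union(*reference_complexes)
    tp = fp = 0
    for pair in predicted_pairs:
        pair = frozenset(pair)
        if pair in ref_pairs:
            tp += 1
        elif pair <= ref_proteins:
            fp += 1  # both proteins are in the reference, but not as a pair
        # pairs involving a protein outside the reference are left unscored
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / len(ref_pairs) if ref_pairs else 0.0
    return precision, recall

reference = [{"SPC97", "SPC98", "TUB4"}, {"SPC72", "X1"}]
predicted = [("SPC97", "SPC98"), ("SPC97", "TUB4"), ("SPC98", "TUB4"),
             ("SPC72", "SPC97"), ("SPC72", "SPC98"), ("SPC72", "TUB4"),
             ("TUB4", "NOVEL1")]
print(precision_recall(predicted, reference))  # (0.5, 0.75)
```

The three Spc72 pairs lower precision to 0.5 even though they are plausible co-complex interactions, which is the effect described in the paragraph above; the pair involving NOVEL1 is not counted either way.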
We have described a simple yet robust unsupervised method of assigning confidence levels to interactions observed in a large-scale assay, as well as combining data from independent assays into an integrated whole that can be used for further study. We used this method to integrate data from three large-scale affinity purification-mass spectrometry assays in yeast to generate a high-confidence subset of interactions, from which we derived an accurate set of protein complexes. The recall of MIPS co-complex interactions indicates that no more than 46% of the total co-complex interactome in yeast has been assayed by TAP-MS methods (with only 34% in the high-confidence E-2 set). Nonetheless, the high proportion of complexes that correspond to existing annotations and the small number of uncharacterized genes present in our high-confidence data strongly suggest that the community has largely saturated the fraction of the complex-ome that is accessible to the methods (TAP-MS) and conditions (aerobic growth in rich media) that have been explored so far. Therefore, it would likely be fruitful to explore other conditions and growth states to extend the interactome.
Our complex data also support the notion that, in many cases, essentiality is tied not to the protein or gene itself, but to the molecular machine to which that protein belongs. We can clearly separate the majority of complexes into essential and nonessential. The few that are mixed (for example, the SAGA/TAFIID complex) lead to interesting questions about the essentiality of specific interactions. We anticipate that the complex descriptions offered here, as well as the general scoring method, can be used in other functional genomics and systems biology studies.
Data from Ho et al. were taken from Table S1 of the corresponding publication. Interactions from Gavin et al. were taken from Supplementary Table 1 of the corresponding publication. In both cases, bait-prey pairs were generated from the list of purifications, with the bait removed from the prey list if applicable. Interactions from Krogan et al. were taken from the raw LCMS and MALDI purification data. Bait-prey pairs from LCMS purifications with confidence ≥ 99.6 and MALDI purifications with score ≥ 3.4 were included. Matrix-model data sets were generated by considering all prey-prey pairs if both prey were purified from the same bait.
Reference data sets
MIPS filtered data: The MIPS curated complex data were downloaded from MPact. All high-throughput data, as well as the large and small ribosomal subunits, were excluded. An all-by-all set of interactions was generated from each complex and used as a reference to calculate recall/precision curves of experimental data. The co-complex data were used to calculate the Hubert statistic.
GO filtered reference set: The complete yeast GO Cellular Component ontology was downloaded from the Saccharomyces Genome Database on 5 December 2006. Annotations were sorted by the number of genes to which they applied; all annotations equal to or larger than the size of the "small cytoplasmic ribosomal subunit" were discarded. The resulting set of annotations is mostly complexes, with a small number of discrete cellular localizations included. This annotation set was used to calculate GO term enrichment and the Hubert statistic.
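The size-based filter described for the GO reference set can be sketched as follows. The function name and the toy annotation table are illustrative assumptions; the gene counts are invented for the example and do not reflect the real ontology.

```python
def filter_annotations_by_size(term_to_genes, cutoff_term):
    """Keep only terms annotated to strictly fewer genes than the cutoff term."""
    cutoff = len(term_to_genes[cutoff_term])
    return {term: genes for term, genes in term_to_genes.items()
            if len(genes) < cutoff}

# Toy annotation table; gene labels and counts are made up.
toy = {
    "small cytoplasmic ribosomal subunit": {f"g{i}" for i in range(10)},
    "nucleolus": {f"g{i}" for i in range(40)},
    "toy complex term": {"g1", "g2", "g3", "g4"},
}
kept = filter_annotations_by_size(toy, "small cytoplasmic ribosomal subunit")
print(sorted(kept))  # ['toy complex term']
```

The cutoff term itself is discarded along with every larger term, matching the "equal to or larger than" wording, so broad localizations such as "nucleolus" drop out while complex-sized terms remain.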
The MCL program was downloaded from . For each E-value threshold of the PICO network, MCL was run with the following parameter space: -I, 1.8 to 3.0 in 0.2 increments; -C, 0.5 to 1.5 in 0.25 increments; -S, 0 to 7. The Hubert statistic (H) was calculated for each MCL result against the GO filtered reference set and the MCL result with the highest H score was considered the optimal result for that E-value. The -S parameter was found to have no effect on our results.
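The parameter sweep described above can be sketched as a grid enumeration followed by selection of the highest-scoring setting. This sketch does not invoke MCL: the `score` argument is a hypothetical callable that would run the clustering for one (-I, -C, -S) setting and return H, and treating -S as the integers 0 through 7 is an assumption.

```python
from itertools import product

def inclusive_range(start, stop, step):
    """Evenly spaced values from start to stop inclusive, rounded to 2 decimals."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, 2) for i in range(count)]

# (-I, -C, -S) combinations as listed in the text.
GRID = list(product(inclusive_range(1.8, 3.0, 0.2),
                    inclusive_range(0.5, 1.5, 0.25),
                    range(0, 8)))

def best_setting(grid, score):
    """Return the parameter tuple with the highest score (e.g. Hubert statistic)."""
    return max(grid, key=score)

print(len(GRID))  # 280 combinations: 7 x 5 x 8

# Stand-in scoring function for demonstration only.
toy_score = lambda s: -abs(s[0] - 2.4) - abs(s[1] - 1.0) - s[2]
print(best_setting(GRID, toy_score))  # (2.4, 1.0, 0)
```

In a real run the stand-in scoring function would be replaced by one that clusters the network at the given setting and evaluates the result against the GO filtered reference set.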
Calculation of the Hubert statistic, H, was performed as described in . As the matrices must be equal size, the calculation was performed on the potential interaction space defined by the set of proteins present in both the experimental and reference protein sets.
The Simpson coefficient of similarity, Cs, between sets of proteins A and B is:
Cs = (# proteins in A and B)/min(# proteins in A, # proteins in B)
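As a minimal sketch, the coefficient can be computed directly from two protein sets. The function name is an assumption, and raising an error for an empty set is a design choice made here because the ratio is undefined in that case.

```python
def simpson_coefficient(a, b):
    """Shared proteins divided by the size of the smaller set."""
    a, b = set(a), set(b)
    if not a or not b:
        raise ValueError("Simpson coefficient is undefined for an empty set")
    return len(a & b) / min(len(a), len(b))

# Toy sets with hypothetical protein labels.
print(simpson_coefficient({"P1", "P2", "P3"}, {"P2", "P3", "P4", "P5"}))  # 2 shared / 3
print(simpson_coefficient({"P1", "P2"}, {"P1", "P2", "P3", "P4"}))        # 1.0
```

Because the denominator is the smaller set, a complex that is fully contained in a larger one scores 1.0, which makes the measure suited to detecting subset relationships between clusters.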
The list of essential ORFs was downloaded from the Saccharomyces Genome Database. We considered only verified or uncharacterized ORFs.
This work was supported by grants from the N.S.F., N.I.H., Welch Foundation (F1515), and a Packard Fellowship (E.M.M).
1. Hart GT, Ramani AK, Marcotte EM: How complete are current yeast and human protein-interaction networks?. Genome Biol. 2006, 7 (11): 120. 10.1186/gb-2006-7-11-120.
2. Reguly T, Breitkreutz A, Boucher L, Breitkreutz BJ, Hon GC, Myers CL, Parsons A, Friesen H, Oughtred R, Tong A, Stark C, Ho Y, Botstein D, Andrews B, Boone C, Troyanskaya OG, Ideker T, Dolinski K, Batada NN, Tyers M: Comprehensive curation and analysis of global interaction networks in Saccharomyces cerevisiae. J Biol. 2006, 5 (4): 11. 10.1186/jbiol36.
3. Maciag K, Altschuler SJ, Slack MD, Krogan NJ, Emili A, Greenblatt JF, Maniatis T, Wu LF: Systems-level analyses identify extensive coupling among gene expression machines. Mol Syst Biol. 2006, 2: 2006 0003. 10.1038/msb4100045.
4. Krause R, von Mering C, Bork P: A comprehensive set of protein complexes in yeast: mining large scale protein-protein interaction screens. Bioinformatics. 2003, 19 (15): 1901-1908. 10.1093/bioinformatics/btg344.
5. Dezso Z, Oltvai ZN, Barabasi AL: Bioinformatics analysis of experimentally determined protein complexes in the yeast Saccharomyces cerevisiae. Genome Res. 2003, 13 (11): 2450-2454. 10.1101/gr.1073603.
6. Zhang LV, Wong SL, King OD, Roth FP: Predicting co-complexed protein pairs using genomic and proteomic data integration. BMC Bioinformatics. 2004, 5: 38. 10.1186/1471-2105-5-38.
7. Asthana S, King OD, Gibbons FD, Roth FP: Predicting protein complex membership using probabilistic network reliability. Genome Res. 2004, 14 (6): 1170-1175. 10.1101/gr.2203804.
8. Jansen R, Greenbaum D, Gerstein M: Relating whole-genome expression data with protein-protein interactions. Genome Res. 2002, 12 (1): 37-46. 10.1101/gr.205602.
9. de Lichtenberg U, Jensen LJ, Brunak S, Bork P: Dynamic complex formation during the yeast cell cycle. Science. 2005, 307 (5710): 724-727. 10.1126/science.1105103.
10. von Mering C, Krause R, Snel B, Cornell M, Oliver SG, Fields S, Bork P: Comparative assessment of large-scale data sets of protein-protein interactions. Nature. 2002, 417 (6887): 399-403. 10.1038/nature750.
11. Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, Rau C, Jensen LJ, Bastuck S, Dumpelfeld B, Edelmann A, Heurtier MA, Hoffman V, Hoefert C, Klein K, Hudak M, Michon AM, Schelder M, Schirle M, Remor M, Rudi T, Hooper S, Bauer A, Bouwmeester T, Casari G, Drewes G, Neubauer G, Rick JM, Kuster B, Bork P, Russell RB, Superti-Furga G: Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006, 440 (7084): 631-636. 10.1038/nature04532.
12. Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, Li J, Pu S, Datta N, Tikuisis AP, Punna T, Peregrin-Alvarez JM, Shales M, Zhang X, Davey M, Robinson MD, Paccanaro A, Bray JE, Sheung A, Beattie B, Richards DP, Canadien V, Lalev A, Mena F, Wong P, Starostine A, Canete MM, Vlasblom J, Wu S, Orsi C, Collins SR, Chandran S, Haw R, Rilstone JJ, Gandi K, Thompson NJ, Musso G, St Onge P, Ghanny S, Lam MH, Butland G, Altaf-Ul AM, Kanaya S, Shilatifard A, O'Shea E, Weissman JS, Ingles CJ, Hughes TR, Parkinson J, Gerstein M, Wodak SJ, Emili A, Greenblatt JF: Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006, 440 (7084): 637-643. 10.1038/nature04670.
13. Goll J, Uetz P: The elusive yeast interactome. Genome Biol. 2006, 7 (6): 223.
14. Bader GD, Hogue CW: Analyzing yeast protein-protein interaction data obtained from different sources. Nat Biotechnol. 2002, 20 (10): 991-997. 10.1038/nbt1002-991.
15. D'Haeseleer P, Church GM: Estimating and improving protein interaction error rates. Proc IEEE Comput Syst Bioinform Conf. 2004, 216-223.
16. Gagneur J, Krause R, Bouwmeester T, Casari G: Modular decomposition of protein-protein interaction networks. Genome Biol. 2004, 5 (8): R57. 10.1186/gb-2004-5-8-r57.
17. Ho Y, Gruhler A, Heilbut A, Bader GD, Moore L, Adams SL, Millar A, Taylor P, Bennett K, Boutilier K, Yang L, Wolting C, Donaldson I, Schandorff S, Shewnarane J, Vo M, Taggart J, Goudreault M, Muskat B, Alfarano C, Dewar D, Lin Z, Michalickova K, Willems AR, Sassi H, Nielsen PA, Rasmussen KJ, Andersen JR, Johansen LE, Hansen LH, Jespersen H, Podtelejnikov A, Nielsen E, Crawford J, Poulsen V, Sorensen BD, Matthiesen J, Hendrickson RC, Gleeson F, Pawson T, Moran MF, Durocher D, Mann M, Hogue CW, Figeys D, Tyers M: Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature. 2002, 415 (6868): 180-183. 10.1038/415180a.
18. Marcotte CJ, Marcotte EM: Predicting functional linkages from gene fusions with confidence. Appl Bioinformatics. 2002, 1 (2): 93-100.
19. Lee I, Narayanaswamy R, Marcotte EM: Bioinformatic prediction of yeast gene function. Yeast Gene Analysis. Edited by: Stansfield I. 2007, Elsevier Press.
20. Samanta MP, Liang S: Predicting protein functions from redundancies in large-scale protein interaction networks. Proc Natl Acad Sci U S A. 2003, 100 (22): 12579-12583. 10.1073/pnas.2132527100.
21. Schlitt T, Palin K, Rung J, Dietmann S, Lappe M, Ukkonen E, Brazma A: From gene networks to gene function. Genome Res. 2003, 13 (12): 2568-2576. 10.1101/gr.1111403.
22. Gavin AC, Bosche M, Krause R, Grandi P, Marzioch M, Bauer A, Schultz J, Rick JM, Michon AM, Cruciat CM, Remor M, Hofert C, Schelder M, Brajenovic M, Ruffner H, Merino A, Klein K, Hudak M, Dickson D, Rudi T, Gnau V, Bauch A, Bastuck S, Huhse B, Leutwein C, Heurtier MA, Copley RR, Edelmann A, Querfurth E, Rybin V, Drewes G, Raida M, Bouwmeester T, Bork P, Seraphin B, Kuster B, Neubauer G, Superti-Furga G: Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature. 2002, 415 (6868): 141-147. 10.1038/415141a.
23. Guldener U, Munsterkotter M, Oesterheld M, Pagel P, Ruepp A, Mewes HW, Stumpflen V: MPact: the MIPS protein interaction resource on yeast. Nucleic Acids Res. 2006, 34 (Database issue): D436-41. 10.1093/nar/gkj003.
24. Enright AJ, Van Dongen S, Ouzounis CA: An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 2002, 30 (7): 1575-1584. 10.1093/nar/30.7.1575.
25. Brohee S, van Helden J: Evaluation of clustering algorithms for protein-protein interaction networks. BMC Bioinformatics. 2006, 7: 488. 10.1186/1471-2105-7-488.
26. Dhillon IS, Marcotte EM, Roshan U: Diametrical clustering for identifying anti-correlated gene clusters. Bioinformatics. 2003, 19 (13): 1612-1619. 10.1093/bioinformatics/btg209.
27. Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T: Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res. 2003, 13 (11): 2498-2504. 10.1101/gr.1239303.
28. Hong EL BR: "Saccharomyces Genome Database".
29. Collins SR, Kemmeren P, Zhao XC, Greenblatt JF, Spencer F, Holstege FC, Weissman JS, Krogan NJ: Toward a Comprehensive Atlas of the Physical Interactome of Saccharomyces cerevisiae. Mol Cell Proteomics. 2007, 6 (3): 439-450. 10.1074/mcp.M600381-MCP200.
30. Knop M, Schiebel E: Receptors determine the cellular localization of a gamma-tubulin complex and thereby the site of microtubule formation. Embo J. 1998, 17 (14): 3952-3967. 10.1093/emboj/17.14.3952.
31. He X, Zhang J: Why do hubs tend to be essential in protein networks?. PLoS Genet. 2006, 2 (6): e88. 10.1371/journal.pgen.0020088.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.