Identifying the essential components of a specific biological process and detecting the associations among these components in response to various conditions are important for understanding cellular functions. Such components consist of interacting proteins, DNA, and other molecules that form complexes, pathways, and regulatory programs [1–4]. Therefore, a set of genes encoding proteins that are associated by functionally related interactions, such as direct physical interactions between members of a complex, cascading interactions of a pathway, or regulatory interactions between a factor and its targets, forms a functional module to facilitate a specific cellular function [2–4]. To conduct a cellular process, module cooperation is necessary to properly facilitate signal transduction, regulation, and metabolism. This cooperation can be established by direct interactions among components (crosstalk) or through shared partners [5, 6]. To adapt to changing environmental conditions, the formation of functional modules and interactions among these modules are likely to be dynamic and condition-specific. To sustain cellular activities upon changes in the extra- or intracellular environment, specific functional modules and interactions among modules are induced by a series of signaling and regulatory cascades [3, 4, 6–8]. For example, under low-nitrogen conditions, crosstalk is observed between two signaling pathways in Saccharomyces cerevisiae, the cAMP and MAPK pathways, which are both downstream of the small GTPase Ras. These pathways in turn control the cell surface glycoprotein Flo11 and are involved in invasive and filamentous growth [9, 10]. Therefore, discovering dynamically assembling modules, the associations among these modules, and their condition-specific functions is critical for understanding the mechanisms of a biological process.
Large amounts of yeast two-hybrid, DNA microarray, and other high-throughput data are now publicly available [11–14]. These datasets not only provide information related to gene function and direct interactions among genes, but they also enable the use of clustering-based methods to discover functional modules [3, 15–20]. By applying clustering algorithms to different datasets, various types of functional modules, including protein complexes, co-regulated modules, and signaling and metabolic pathways, can be extracted. In addition, with datasets derived from specific experimental conditions, functional modules with special properties, such as evolutionarily conserved complexes and condition-related functional components, can also be found [3, 17, 19, 21]. Based on the identified modules, researchers can use network measurement approaches to further analyze the properties of a module or to compare modules from different datasets to elucidate various biological characteristics [16, 19]. Clustering-based approaches, however, only focus on module identification and do not consider the connectivity between modules. Therefore, these approaches do not readily provide information about associations between modules such as module cooperation.
Recently, several groups have developed approaches to discover coordinated relationships between pairs of modules and to establish more complete frameworks for various cellular processes [5, 22, 23]. One type of approach searches for pathway pairs that show significant crosstalk. By counting the protein-protein interactions between all possible pathway pairs from a database, such as BioCarta, the pathway pairs with a statistically significant number of protein interactions can be identified. Another type of approach aims to select module pairs that are coordinated in their gene expression levels by using data from Gene Ontology (GO) and DNA microarrays [5, 22]. Thus, these methods identify coordinated relationships between modules that are co-regulated by common regulators or are co-expressed under specific conditions. Both types of approaches are suitable for characterizing the properties of module association.
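The interaction-counting idea described above can be illustrated with a minimal sketch. It assumes a hypergeometric null model, which the text does not specify; the cited methods may use a different statistic. The function names (cross_pairs, crosstalk_pvalue) and the gene labels g1 to g6 are hypothetical and serve only as an illustration.

```python
from math import comb


def cross_pairs(module_a, module_b):
    """All unordered gene pairs with one member drawn from each module."""
    return {frozenset((a, b)) for a in module_a for b in module_b if a != b}


def crosstalk_pvalue(interactions, module_a, module_b, universe):
    """Count interactions linking two modules and return (count, p-value).

    Assumed null model (not taken from the source): the observed
    interactions are placed uniformly at random among all gene pairs of
    the universe, so the count of cross-module interactions follows a
    hypergeometric distribution. Self-interactions are ignored.
    """
    edges = {frozenset(e) for e in interactions if len(set(e)) == 2}
    total_pairs = comb(len(universe), 2)      # all possible gene pairs
    total_edges = len(edges)                  # observed interactions
    candidates = cross_pairs(module_a, module_b)
    n = len(candidates)                       # possible cross-module pairs
    k = len(candidates & edges)               # observed cross-module interactions
    # Upper tail P(X >= k) of the hypergeometric distribution.
    tail = sum(
        comb(total_edges, i) * comb(total_pairs - total_edges, n - i)
        for i in range(k, min(n, total_edges) + 1)
    )
    return k, tail / comb(total_pairs, n)


# Toy data with hypothetical gene labels.
universe = {"g1", "g2", "g3", "g4", "g5", "g6"}
interactions = [("g1", "g3"), ("g2", "g4"), ("g5", "g6")]
count, p_value = crosstalk_pvalue(interactions, {"g1", "g2"}, {"g3", "g4"}, universe)
print(count, round(p_value, 4))
```

In practice such a test would be run over every pathway pair in the database, so the resulting p-values would need a multiple-testing correction before any pair is called significant.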
Although the above-mentioned methods can be used to measure correlations between module pairs, they ignore interactions mediated by genes that associate with module pairs. These interactions are direct clues used to interpret the influence, function, and mechanisms of module cooperation and, importantly, to estimate the necessity of the cooperation between a module pair. Moreover, as the modules evaluated by these methods are previously defined gene sets, it is difficult to identify dynamically assembled functional modules and correlations between modules under a specific condition. Therefore, tools still need to be developed to discover and study cooperating module pairs that function in important signal transduction, regulatory, and metabolic reactions under specific conditions.
In this paper, we propose an approach to study module cooperation. We identified cooperating module pairs by searching for functional module pairs that significantly correlate with genes with important functions and genes that mediate communication between functional components of a process. To evaluate our approach, we also analyzed the functions, cooperating genes, and mechanisms of each identified module pair. Using the yeast cell cycle as an example, we identified cooperating module pairs and predicted the mediators and interactions that are important for module cooperation in each phase of the cell cycle. The yeast cell cycle is divided into four phases: G1, S (synthesis), G2, and M (mitosis). During this cycle, a cell duplicates and divides into two daughter cells through a series of regulatory events and checkpoint mechanisms. Cell cycle-specific components dynamically assemble and interact with specific factors to control progression through the cell cycle. For example, in G1 phase, the major regulator Cdc28 combines with G1 cyclins and associates with other G1-specific transcription factors, such as the SBF complex (Swi4/Swi6), to regulate G1/S-specific genes and prepare the cell for DNA replication [24, 25]. In S phase, specific component coordination appears to promote DNA replication, bud emergence, spindle pole body (SPB) duplication, and SPB separation. In G2 and M phases, Cdc28 and B-type cyclins form complexes that induce chromosome condensation, spindle elongation, and nuclear division. In addition, to ensure that the events of each phase are completed, checkpoint mechanisms coordinate multiple pathways to control progression through the cell cycle. Due to its complex regulation and the dynamic interactions of its components, studying the cell cycle requires a systematic approach that analyzes cooperation among functional components.
Rather than considering only one type of data, our approach provides a platform that allows interaction and expression data to be integrated. The expression data provide information about dynamic correlations among genes in the yeast cell cycle, and the interaction data suggest possible interactions among genes. This information can be used to predict genes and interactions that may function in the yeast cell cycle. The advantages of combining heterogeneous data have been demonstrated by studies of functional association prediction. These approaches used a probabilistic model to combine expression correlations and physical interactions between genes measured from different experimental data sets [29–31]. The combined scores were used to establish a gene network to present the functional associations between genes and to predict gene function [29, 31]. To identify functional modules and the cooperating pairs that directly interact with genes essential to the cell cycle, we used a different approach to combine information from protein-protein interactions, ChIP-chip data, and microarrays. We did not use combined association scores between genes to construct the gene network but instead used direct physical interactions to represent links among genes. Information from expression correlation was instead used to measure the essentiality of genes to the cell cycle. Therefore, we were able to design an algorithm that searches for cooperating sub-networks (modules) based on the physical interaction network. In addition, we evaluated the importance of module cooperation and only reported module pairs that significantly influence the cell cycle process. To analyze the architecture and special properties of module cooperation in the cell cycle, the resulting module pairs were further used to construct a cooperative module network (CMN). This cooperative module network presents cell cycle-specific modules and cooperative associations between the modules.
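The separation of roles described above, with physical interactions defining the links and expression data scoring the genes, can be sketched as follows. This is only an illustration under stated assumptions: the text does not give the formula used to measure essentiality at this point, so the score below (the mean Pearson correlation of a gene with its interaction partners) is a placeholder, and the function names and gene labels are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / (sd_x * sd_y)


def build_network(interactions):
    """Adjacency sets built only from direct physical interactions."""
    neighbors = {}
    for a, b in interactions:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return neighbors


def node_scores(neighbors, expression):
    """Placeholder gene score (an assumption, not the paper's measure):
    mean correlation between a gene and its interaction partners."""
    scores = {}
    for gene, partners in neighbors.items():
        values = [
            pearson(expression[gene], expression[p])
            for p in partners
            if gene in expression and p in expression
        ]
        scores[gene] = sum(values) / len(values) if values else 0.0
    return scores


# Toy data with hypothetical gene labels and made-up profiles.
expression = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8], "g3": [4, 3, 2, 1]}
network = build_network([("g1", "g2"), ("g1", "g3")])
scores = node_scores(network, expression)
print(scores)
```

Keeping the expression signal on the nodes rather than folding it into edge weights means the network topology stays purely physical, so any sub-network search operates on observed interactions only.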
To understand the functions and communication mechanisms of each cooperative association, we also predicted genes related to each cooperative association (correlated genes). Such genes could be regulators, signal communicators, regulated genes, or members of a protein complex. Based on interactions among these correlated genes and genes within the modules, we further inferred the functions and effects of the cooperative associations in the cell cycle. Thus, we used a gene set consisting of genes regulated in a specific phase of the cell cycle and regulators of each phase to verify and explore cooperating interactions of the identified module pairs functioning in specific signal transduction, regulation and other activities of the yeast cell cycle. Using this phase-regulated gene set, we predicted phase-related interactions and genes mediating cooperative associations in a specific phase and then discovered dynamic changes in these interactions during the cell cycle. Based on interactions of phase-specific regulators, we constructed relationship graphs for each phase of the cell cycle to identify possible crosstalk among modules through phase-specific regulators and to attempt to explain the roles of transcriptional regulators in controlling the cooperation of and connections between modules. These graphs present a dynamic view of the module interactions in the yeast cell cycle. By comparing graphs, we gained important insights into the changes in associations between the different functional modules.