GBNet: Deciphering regulatory rules in the co-regulated genes using a Gibbs sampler enhanced Bayesian network approach
© Shen et al; licensee BioMed Central Ltd. 2008
Received: 08 July 2008
Accepted: 24 September 2008
Published: 24 September 2008
Combinatorial regulation of transcription factors (TFs) is important in determining the complex gene expression patterns particularly in higher organisms. Deciphering regulatory rules between cooperative TFs is a critical step towards understanding the mechanisms of combinatorial regulation.
We present here a Bayesian network approach called GBNet to search for DNA motifs that may be cooperative in transcriptional regulation and the sequence constraints that these motifs may satisfy. We showed that GBNet outperformed the other available methods in the simulated and the yeast data. We also demonstrated the usefulness of GBNet on learning regulatory rules between YY1, a human TF, and its co-factors. Most of the rules learned by GBNet on YY1 and co-factors were supported by literature. In addition, a spacing constraint between YY1 and E2F was also supported by independent TF binding experiments.
We thus conclude that GBNet is a useful tool for deciphering the "grammar" of transcriptional regulation.
Decoding regulatory interactions between transcription factors (TFs) and their target genes is critical in understanding the complex gene expression patterns in response to extra- or intra-cellular signals. Many computational methods have been developed to identify the cis-regulatory elements recognized by TFs . These DNA motifs have also been determined by experimental measurements . The accumulation of known TF motifs facilitates addressing a more challenging question, understanding the combinatorial regulation of TFs and deciphering the rules of how the TFs cooperate with each other, which is particularly important for studying transcriptional regulation in higher organisms .
The previous efforts have been mainly focused on inferring which TFs may function together [4–13]. However, these studies cannot reveal the regulatory mechanisms of combinatorial regulation, namely whether a TF motif has a positional preference relative to the transcription start site (TSS) or whether the order of the two motifs matters for their cooperation. The importance of such regulatory "grammar" has been observed in numerous studies. For example, the binding site of the repressor Giant relative to those of the Gal4 activators determined transcription of a reporter gene in the embryo of Drosophila melanogaster .
Searching for the sequence constraints between TF motifs is a difficult task. As we will see below, simple enumeration of all possible sequence features and conducting statistical test to evaluate the significance for each of them is computationally expensive for even a modest number of candidate motifs. Alternative methods are thus needed to tackle this problem. Recently, Elemento et al. developed a motif finding algorithm called FIRE that can predict various sequence constraints of motifs and whether pairs of motifs interact with each other . FIRE is based on the mutual information between the sequence features of interest and gene expression. However, because Elemento et al. emphasized on removing false positives, the relative small number of predicted motifs implies that some true motifs might be missed. For example, only 17 DNA motifs and 6 RNA motifs were identified from 78 clusters generated from 173 microarray experiments in yeast . In addition, FIRE only considers consensus sequences with mismatches, which is a relatively simple representation of motifs compared with position weight matrix. More importantly, FIRE cannot consider the joint effects of multiple rules. The rules were tested individually by FIRE and the computational cost would be too high to enumerate all possible combinations of rules (see below). Therefore, for example, synergy between rules cannot be detected by FIRE.
In the present study, we adopted a Bayesian network approach to identify regulatory grammars because Bayesian network explicitly models the nonlinear relationship between sequence rules. Our goal is to find enriched constraints for DNA motifs such as spacing between TF binding sites and positional bias of a TF sites relative to TSS in a group of sequences, often promoter sequences of a set of co-regulated genes. This can be considered as a generalization of motif finding algorithms. It is important to emphasize that we do not aim to predict gene expression based on sequences, which is the goal of the studies of, such as, Beer and Tavazoie  and Yuan et al. .
We implemented a Gibbs sampling procedure to search for optimal Bayesian network structure. We call our method GBNet, G ibbs sampler enhanced B ayesian Net work, and the software is available at http://modem.ucsd.edu/shenli/gbnet.tgz. To demonstrate the strength of our searching strategy, we compared the performance of GBNet with BBNet, in which a greedy searching algorithm is implemented to search for the optimal Bayesian network. We have applied both methods to simulated data as well as yeast and human data. The results showed that Gibbs sampling has much better performance than greedy search in searching for sequence constraints between cooperative TFs. We also demonstrated that numerous sequence features identified by GBNet for human transcription factor YY1 were supported by literature and experimental evidence.
GBNet: a Gibbs-sampler enhanced Bayesian network
Because the number of sequence features grows exponentially with the number of candidate motifs, searching a set of optimal sequence features is not trivial. We employed a Gibbs sampler enhanced global search strategy to tackle this problem (see Methods for details). Six sequence features as defined in  were considered: presence of a motif, distance from transcription start site (TSS), spacing between two motifs, orientation of a motif, presence of a second copy of a motif and order between two motifs. GBNet can therefore be considered as a generalization of sequence motif finding: instead of searching for enriched consensus motifs, enriched combinations of motifs satisfying a specific constraint is being searched.
Validation of GBNet on simulated data
We first validated the performance of GBNet using simulated data. To keep the sequences as natural as possible, we took the sequences from the 114 promoters in the fourth yeast cluster of . We then implanted a spacing constraint between two yeast motifs, distance between PAC and RRPE motifs less than 40 bp, in a portion of genes ranging from 40% to 80% with an interval of 10%. The original instances of PAC and RRPE were removed. These simulated sequences and the weight matrices of the 666 yeast motifs taken from  were input to GBNet for identification of enriched sequence constraints between these motifs. We used the same 1789 background sequences as in .
Sequence constraints learned by BBNet and GBNet in the five simulated datasets
1. Distance to TSS of M604:180
2. Presence of PAC
3. Presence of M599
1. Distance to TSS of M604:140
2. Distance between RRPE and PAC:40
3. Distance to TSS of M599:160
4. Presence of M593
1. Presence of PAC
2. Distance to TSS of M604:180
3. Presence of M599
1. Distance to TSS of M604:200
2. Distance between RRPE and PAC:40
3. Distance to TSS of M599:480
1. Presence of PAC
2. Distance to TSS of M604:180
3. Presence of RRPE
4. Presence of M599
1. Distance between PAC and RRPE:40
2. Distance to TSS of M604:140
3. Distance between M604 and M599:500
1. Presence of PAC
2. Presence of RRPE
3. Distance to TSS of M604:140
1. Distance between PAC and RRPE:40
2. Distance to TSS of M604:140
1. Presence of PAC
2. Presence of RRPE
3. Distance to TSS of M604:140
1. Distance between PAC and RRPE:40
2. Distance between M604 and M599:340
3. Distance to TSS of M604:140
Validation of GBNet on the PAC and RRPE example
We then validated the performance of GBNet using a real dataset. We took the fourth yeast cluster of  in which Beer and Tavazoie found two regulatory rules for PAC and RRPE: 1. PAC (M600) is within 140 bp of ATG; 2. RRPE (M602) is within 240 bp of ATG. When both rules are satisfied, the genes containing the two motifs showed highly correlated expression patterns across a variety of conditions. When neither of these rules was satisfied, the gene expression patterns were indistinguishable from the background .
In , the above two rules were learned from the bootstrap samples but not directly from the original sequences in the fourth yeast cluster. We generated numerous sets of 10 bootstrap samples. The two rules could be simultaneously identified in only 1 to 3 out of 10 bootstrap samples by BBNet and they were not necessarily the most abundant rules learned by BBNet from these samples. Different from Beer and Tavazoie's goal to predict gene expression from sequence, we aim to identify sequence constraints between cooperative TF binding motifs from a group of genes with coherent expression patterns. Therefore, bootstrap is not an option for our purpose. When we applied BBNet and GBNet to the original sequences in the fourth yeast cluster, BBNet correctly found the first rule of PAC but only the presence of RRPE instead of the distance constraint (Table S1); In contrast, GBNet successfully found both rules.
Applying GBNet to the 49 yeast clusters
The above analyses suggested that GBNet can find the rules of combinatorial regulation between TFs. To have a large scale comparison between GBNet and BBNet, we then applied them to the 49 yeast clusters of 2770 genes in  (Table S1). We compared GBNet and BBNet on the following aspects using the original data without bootstrap sampling.
GBNet fits better models to the data than BBNet
A Bayesian score reflects how well a model fits to the data. The rules learned by GBNet gave better Bayesian scores in 47 clusters than those learned by BBNet (Table S1 in the Additional file 1). The sum of Bayesian scores for all 49 clusters is -4394.3 for BBNet and -4306.6 for GBNet. On average, GBNet achieved a better Bayesian score ~1.8/cluster than BBNet. Again, the Bayesian networks learned by GBNet are not more complex than those learned by BBNet. This can be seen by the average number of rules per cluster: 2.1 for BBNet vs. 2.3 for GBNet.
GBNet finds more biologically interesting rules
GBNet searches more thoroughly in the rule space
GBNet searches much more efficiently in the rule space than enumeration
An alternative to developing models like GBNet for learning sequence constraints is enumerating all possible rules and selecting the best scored ones, using either Bayesian score or mutual information like in FIRE. However, the possible number of combinations of rules is so large to make this straightforward approach computationally prohibited. For example, learning spacing constraints among 50 candidate motifs in yeast, one needs to consider 19 function depths (0.05–0.95 with an interval of 0.05) for each motif and 30 possible distances between two motifs (20–600 bps with an interval of 20 bps). In total, (50 × 49/2) × 19 × 19 × 30 = 13,266,750 statistical tests have to be computed for learning the single spacing constraint. For a distance constraint from TSS (positional bias), 50 × 19 × 30 = 28500 tests need to be performed. Therefore, when consider the combination of the above two rules (spacing constraint and positional bias), there are total 13,266,750*28500 = 3.78*1011 possibilities, which makes enumeration infeasible. As a comparison, on average, GBNet calculated 570,000 times of Bayesian scores per cluster that considered combinations of all six types of constraints. The running time of GBNet per cluster was 3 to 4 hours on a desktop computer with a 1.8 GHz CPU, which means enumeration would take >1.99*106 to 2.65*106 hours for only considering the two types of rules mentioned above per cluster. The efficiency difference between GBNet and enumeration becomes more significant when more candidate motifs are considered for learning sequence constrains. This is because GBNet scales better than linearly with the number of candidate motifs (Table S2). As a comparison, the computational cost of enumeration is polynomial to the number of candidate motifs when considering one rule or exponential when considering combination of rules.
Dissecting transcriptional regulatory rules of a human transcription factor YY1
Combinatorial regulation is much more prevalent in higher organisms than in yeast. To demonstrate the usefulness of GBNet, we applied it to studying transcriptional regulation by a human transcription factor (TF) called YY1, which plays essential roles during development [18–20]. For the purpose of comparison, we also analyzed this human dataset using BBNet. Our previous study showed that YY1 mainly binds to the 1.5 kbp regions around the transcription start site . Therefore, we focus on searching for sequence constraints between YY1 and its cofactors in the proximal promoters.
Identifying YY1 target genes and clustering of gene expression profiles
ChIP-chip analysis of YY1 binding has been conducted using a whole-genome promoter array in human HeLa cells . We used a Gibbs sampler based computational algorithm, called GI bbs sampler for finding T ranscription factor TAR get genes (GITTAR)  that integrates sequence motif and ChIP-chip binding information to identify a set of confident YY1 target genes. The intuition behind the GITTAR algorithm is the same as that of MODEM , namely genes containing the YY1 motif and showing significant ChIP-chip ratio are likely to be YY1 targets. GITTAR identified 968 such genes and the average of their log2 ratios is 2.47 ± 0.70, which significantly deviates from the background (0.23 ± 0.65). A 12-bp long motif defined by GITTAR (Fig. S1 in the Additional file 2) was used in the following analyses.
Finding enriched motifs in each cluster
To search for potential YY1 co-factors, we collected 505 motifs of human TFs from the TRANSFAC database (Version 10.2)  and examined their enrichment in each of the five clusters against the genes outside of the cluster but included in the YY1 ChIP-chip study using Fisher's exact test  based on hypergeometric distribution (see Methods). In addition, we also conducted de novo motif finding using BioProspector . It is not surprising that the YY1 motif was always ranked on the top in Fisher's test. Numerous motifs like E2Fs, CREB, ELK1 and NFY were also significantly enriched. In total, a list of 74 motifs was compiled (Table S3) as the candidate motifs for further analysis of combinatorial rules by GBNet and BBNet.
Learning combinatorial regulation between YY1 and its co-factors
Sequence constraints learned by BBNet and GBNet in the five human YY1 clusters. The functional depth for each motif is in parentheses.
Presence of YY1 (0.02), 4.05E-12
Distance between ETS (0.01) and YY1 (0.02):120 bp, 5.32E-13
Presence of YY1 (0.01), 5.43E-10
Distance between WT1 (0.02) and YY1 (0.01):40 bp, 1.09E-10
Presence of YY1 (0.01), 3.94E-114
Presence of E2F (0.2), 1.54E-33
Presence of ELK1 (0.04), 1.13E-20
Distance between YY1 (0.01) and E2F (0.01): 40 bp, 1.67E-121
Distance between ELK1 (0.04) and E2F (0.01): 220 bp, 7.64E-26
Presence of YY1 (0.03), 8.64E-6
Distance between YY1 (0.01) and E2F1 (0.1): 520 bp, 8.82E-9
Presence of YY1 (0.02), 1.90E-6
Distance between YY1 (0.02) and ELK1 (0.01):160 bp, 9.79E-8
Literature evidence to support the rules learned by GBNet
Most of the rules learned by GBNet were supported by literature: YY1 and ETS family proteins have been shown to form a complex that helps to maintain the normal function of human immune system ; Both YY1 and ETS are required for the transcriptional regulation of a variety of cellular processes such as adipocyte differentiation ; YY1 has been shown to physically interact with E2F family proteins for the specificity of E2F function ; Synergistic cooperation has been observed between YY1 and E2F1 for the transcriptional activity of p73, through a mechanism involving a physical interaction ; Two independent groups have verified that YY1 and ELK1 co-ordinate the expression of the SOD gene [32, 33]. In addition, GBNet also identified two new cooperative pairs: YY1-WT1 and E2F-ELK1. It is worth pointing out that GBNet specifies how the above TFs cooperative with each other and provides specific guidance for experimental validation.
Independent E2F ChIP-chip experiments support the YY1-E2F distance constraint
A direct evidence to support the sequence constraints found by GBNet came from a recent study on the binding of E2F family members. Farnham and colleagues recently conducted ChIP-chip experiments on E2F family members, E2F1, E2F4 and E2F6, using the same promoter array that we used for our YY1 study (see Methods and ). They showed that E2F family members mainly bind to promoter regions within 2 kb of transcription start site (TSS) in HeLa cells and their bindings are interchangeable . Among the 2815 human genes that are targets of any of the three E2F family members, 496 are in common with the 968 GITTAR YY1 targets (P-value < 2.5e-167 based on hypergeometric distribution).
Discussion and conclusion
Combinatorial regulation of transcription factors is critical in gene expression control particularly in higher organisms. For the purpose of reconstructing transcription regulatory network, understanding the molecular mechanisms and deciphering the grammar of combinatorial regulation are the natural steps after finding the binding sites of TFs . Identification of cooperative TFs and learning sequence constraints between their motifs can provide great insights into building mechanistic and quantitative models of transcriptional regulation .
We have developed a Bayesian network approach to find regulatory rules enriched in a foreground sequences, for example the promoters of a set of co-regulated genes, compared with the background sequences. This method can be applied to any genome as we showed its success in yeast and human here. We designed a powerful searching strategy in Bayesian network structure learning by employing Gibbs sampling and simulated annealing. Compared with the exhaustive enumeration, GBNet can find the optimal rules much more efficiently. The more candidate motifs under consideration, the more save of computational cost GBNet would achieve over enumeration. Given the improved searching strategy, it is not surprising that GBNet outperforms BBNet that employs greedy search for optimal network structure in all the datasets we have tested, including simulated, yeast and human data.
In the present study, we were focused on the six sequence constraints using in  and analyzing combinatorial regulations in proximal promoters. Obviously, there exist other sequence constraints, particularly those for the interactions between distal enhancers and promoters and those related to other regulatory elements such as silencers. New approaches are emerging to define all the regulatory elements, for example, using chromatin modification patterns  or protein-DNA interaction data to predict enhancers . The accumulation of such knowledge can facilitate GBNet to learn rules that involve other regulatory elements or long-range regulatory interactions such as the looping interaction between distal enhancers and promoters. It is straightforward to search for other types of rules by GBNet without significantly increasing the computational cost.
where N s is network structure, D is data, Γ (·) is the gamma function, N p is the number of parent nodes, log10(K) is a network parameter (see below), q is the number of possible parent states, r + 1 is the number of possible child states, a j = Σ a jk , N j = Σ N jk , N jk is the number of samples for child state k when parent state is j, a jk is prior count. In BBNet, a greedy search algorithm was employed to search for the best network structure: a search stopped when adding a new parent node (sequence constraint) could not further improve the Bayesian score. This procedure is prone to get trapped in local optimum. To improve the searching efficiency, we implemented a Metropolis jumping in GBNet each time when a parent node (sequence constraint) was added or the functional depth of a motif was updated. In addition, simulated annealing was also exploited to search for the global optimum (see Fig. S2 for a comparison between the two search algorithms). A change to the Bayesian network structure was accepted by a probability of , where and N s are network structures after and before the change and T is the temperature. In simulated annealing, T was decreased exponentially as T ← αnT, where n is the number of iterations and α is the rate of change. The searching procedure stopped when there was no change detected at a specific temperature after a number of attempts or a sufficient number of temperature changes had been made. In this work, the initial temperature was set to 5.0 and α = 0.9 for both yeast and human data. The simulation moved to the next temperature if either the number of iterations reached 20 or the number of structure changes reached 500. The maximum number of temperature changes was set to 20. In our tests, GBNet was always able to find the optimum using these parameters.
The background sequences were selected as the following. For the yeast data, the same background as Beer and Tavazoie (2004) was used for a fair comparison. For the human YY1 data, the number of background sequences was set to five times of the size of the cluster under consideration. This background size was heuristically determined to achieve a balance between discrimination and statistical significance. All genes were ranked according to their correlation to the mean expression profile of the cluster in the ascending order and the least correlated or the most anti-correlated genes were selected as the background. The structural parameter, log10 K in Eq.(1), helps avoid overfitting in learning the Bayesian network structure by penalizing the complex network structures . The value of log10 K in  was used for the yeast data and a heuristic value of 5.0 was chosen for the human YY1 data.
Finding enriched co-factors using Fisher's exact test
a + b
c + d
a + c
b + d
where a, b, c, d are the numbers of genes in each category; n = a+b+c+d is the total number of genes under consideration. The same criterion as described above was used to select candidate motifs.
The YY1 ChIP-chip data
The YY1 ChIP-chip data was obtained from our previous study . Briefly, the whole-genome promoter array was designed and synthesized by Nimblegen. 24134 promoters from human genome build 35 (HGS17) were represented on the array. 1500 bp sequence of each promoter (1300 bp upstream and 200 bp downstream of TSS) is covered by 15 oligo probes of 50 bps long. Two replicates of experiments were conducted in HeLa cells. Data were collected and processed as described previously [37, 38].
We thank Michael Beer, Olivie Elemento and Saeed Tavazoie for answering our questions on the details of their methods. We also thank Andrew McCamman for his support. This work is partially supported by a NIH grant to WW (GM072856). LS was partially supported by a postdoctoral fellowship from La Jolla Interfaces in Science.
- Tompa M, Li N, Bailey TL, Church GM, De Moor B, Eskin E, Favorov AV, Frith MC, Fu Y, Kent WJ, et al.: Assessing computational tools for the discovery of transcription factor binding sites. Nat Biotechnol 2005, 23(1):137–144. 10.1038/nbt1053View ArticlePubMedGoogle Scholar
- Mukherjee S, Berger MF, Jona G, Wang XS, Muzzey D, Snyder M, Young RA, Bulyk ML: Rapid analysis of the DNA-binding specificities of transcription factors with DNA microarrays. Nat Genet 2004, 36(12):1331–1339. 10.1038/ng1473PubMed CentralView ArticlePubMedGoogle Scholar
- Levine M, Tjian R: Transcription regulation and animal diversity. Nature 2003, 424: 147–151. 10.1038/nature01763View ArticlePubMedGoogle Scholar
- Yu X, Lin J, Masuda T, Esumi N, Zack DJ, Qian J: Genome-wide prediction and characterization of interactions between transcription factors in Saccharomyces cerevisiae . Nucleic Acids Research 2006, 34(3):917–927. 10.1093/nar/gkj487PubMed CentralView ArticlePubMedGoogle Scholar
- Yu X, Lin J, Zack DJ, Qian J: Computational analysis of tissue-specific combinatorial gene regulation: predicting interaction between transcription factors in human tissues. Nucleic Acids Res 2006, 34(17):4925–4936. 10.1093/nar/gkl595PubMed CentralView ArticlePubMedGoogle Scholar
- Bulyk ML, McGuire AM, Masuda N, Church GM: A motif co-occurrence approach for genome-wide prediction of transcription-factor-binding sites in Escherichia coli. Genome Res 2004, 14(2):201–208. 10.1101/gr.1448004PubMed CentralView ArticlePubMedGoogle Scholar
- Hannenhalli S, Levy S: Predicting transcription factor synergism. Nucleic Acids Res 2002, 30(19):4278–4284. 10.1093/nar/gkf535PubMed CentralView ArticlePubMedGoogle Scholar
- Pilpel Y, Sudarsanam P, Church GM: Identifying regulatory networks by combinatorial analysis of promoter elements. Nat Genet 2001, 29(2):153–159. 10.1038/ng724View ArticlePubMedGoogle Scholar
- Zhu Z, Shendure J, Church GM: Discovering functional transcription-factor combinations in the human cell cycle. Genome Research 2005, 15: 848–855. 10.1101/gr.3394405PubMed CentralView ArticlePubMedGoogle Scholar
- Banerjee N, Zhang MQ: Identifying cooperativity among transcription factors controlling the cell cycle in yeast. Nucleic Acids Res 2003, 31(23):7024–7031. 10.1093/nar/gkg894PubMed CentralView ArticlePubMedGoogle Scholar
- Kato M, Hata N, Banerjee N, Futcher B, Zhang MQ: Identifying combinatorial regulation of transcription factors and binding motifs. Genome Biology 2004., 5(8):Google Scholar
- Das D, Banerjee N, Zhang MQ: Interacting models of cooperative gene regulation. PNAS 2004, 101(46):16234–16239. 10.1073/pnas.0407365101PubMed CentralView ArticlePubMedGoogle Scholar
- Segal E, Shapira M, Regev A, Pe'er D, Botstein D, Koller D, Friedman N: Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nat Genet 2003, 34(2):166–176. 10.1038/ng1165View ArticlePubMedGoogle Scholar
- Kulkarni MM, Arnosti DN: cis-regulatory logic of short-range transcriptional repression in Drosophila melanogaster. Mol Cell Biol 2005, 25(9):3411–3420. 10.1128/MCB.25.9.3411-3420.2005PubMed CentralView ArticlePubMedGoogle Scholar
- Elemento O, Slonim N, Tavazoie S: A universal framework for regulatory element discovery across all genomes and data types. Mol Cell 2007, 28(2):337–350. 10.1016/j.molcel.2007.09.027PubMed CentralView ArticlePubMedGoogle Scholar
- Beer M, Tavazoie S: Predicting gene expression from sequence. Cell 2004, 117: 185–198. 10.1016/S0092-8674(04)00304-6View ArticlePubMedGoogle Scholar
- Yuan Y, Guo L, Shen L, Liu JS: Predicting Gene Expression from Sequence: A Reexamination. PLoS Computational Biology 2007, 3(11):e243. 10.1371/journal.pcbi.0030243PubMed CentralView ArticlePubMedGoogle Scholar
- Austen M, Luscher B, Luscher-Firzlaff JM: Characterization of the Transcriptional Regulator YY1. J Biol Chem 1997, 272(3):1709–1717. 10.1074/jbc.272.3.1709View ArticlePubMedGoogle Scholar
- Shi Y, Lee J-S, Galvin KM: Everything you have ever wanted to know about Yin Yang 1. Biochimica et Biophysica Acta (BBA) – Reviews on Cancer 1997, 1332(2):F49-F66. 10.1016/S0304-419X(96)00044-3View ArticleGoogle Scholar
- Gordon S, Akopyan G, Garban H, Bonavida B: Transcription factor YY1: structure, function, and therapeutic implications in cancer biology. Oncogene 2005, 25(8):1125–1142. 10.1038/sj.onc.1209080View ArticleGoogle Scholar
- Shen L, Wang K, Liu J, Affar EB, Shi Y, Ren B, Wang W: Rewiring of the YY1 transcriptional program in human and mouse. 2008, in press.Google Scholar
- Wang W, Cherry JM, Nochomovitz Y, Jolly E, Botstein D, Li H: Inference of combinatorial regulation in yeast transcriptional networks: A case study of sporulation. Proceedings of the National Academy of Sciences 2005, 102(6):1998–2003. 10.1073/pnas.0405537102View ArticleGoogle Scholar
- Su A, Wiltshire T, Batalov S, Lapp H, Ching K, Block D, Zhang J, Soden R, Hayakawa M, Kreiman G, et al.: HG_U133A/GNF1H and GNF1M Tissue Atlas Datasets. Proc Natl Acad Sci USA 2004, 101(16):6062–6067. 10.1073/pnas.0400782101PubMed CentralView ArticlePubMedGoogle Scholar
- Eisen MB, Spellman PT, Brown PO, Botstein D: Cluster analysis and display of genome-wide expression patterns. Proc Natl Acad Sci USA 1998, 95: 14863–14868. 10.1073/pnas.95.25.14863PubMed CentralView ArticlePubMedGoogle Scholar
- Matys V, Fricke E, Geffers R, Gossling E, Haubrock M, Hehl R, Hornischer K, Karas D, Kel AE, Kel-Margoulis OV, et al.: TRANSFAC: transcriptional regulation, from patterns to profiles. Nucleic Acids Res 2003, 31(1):374–378. 10.1093/nar/gkg108PubMed CentralView ArticlePubMedGoogle Scholar
- Fisher RA: Statistical Methods for research workers. Oliver and Boyd 1954.Google Scholar
- Liu X, Brutlag D, Liu J: BioProspector: discovering conserved DNA motifs in upstream regulatory regions of co-expressed genes. Pac Symp Biocomput: 2001 2001, 127–138.Google Scholar
- Presnell SR, Zhang L, Ramilo CA, Chan HW, Lutz CT: Functional redundancy of transcription factor-binding sites in the killer cell Ig-like receptor (KIR) gene promoter. Int Immunol 2006, 18(8):1221–1232. 10.1093/intimm/dxl043View ArticlePubMedGoogle Scholar
- Patton J, Block S, Coombs C, Martin ME: Identification of functional elements in the murine Gabp alpha/ATP synthase coupling factor 6 bi-directional promoter. Gene 2006, 369: 35–44. 10.1016/j.gene.2005.10.009View ArticlePubMedGoogle Scholar
- Schlisio S, Halperin T, Vidal M, Nevins JR: Interaction of YY1 with E2Fs, mediated by RYBP, provides a mechanism for specificity of E2F function. Embo J 2002, 21(21):5775–5786. 10.1093/emboj/cdf577PubMed CentralView ArticlePubMedGoogle Scholar
- Wu S, Murai S, Kataoka K, Miyagishi M: Cooperative regulation of p73 promoter by Yin Yang 1 and E2F1. Nucleic Acids Symp Ser (Oxf) 2007, 51: 347–348. 10.1093/nass/nrm174View ArticleGoogle Scholar
- Chang MS, Yoo HY, Rho HM: Positive and negative regulatory elements in the upstream region of the rat Cu/Zn-superoxide dismutase gene. Biochem J 1999, 339(Pt 2):335–341. 10.1042/0264-6021:3390335PubMed CentralView ArticlePubMedGoogle Scholar
- Srivastava S, Chandrasekar B, Gu Y, Luo J, Hamid T, Hill BG, Prabhu SD: Downregulation of CuZn-superoxide dismutase contributes to [beta]-adrenergic receptor-mediated oxidative stress in the heart. Cardiovascular Research 2007, 74(3):445–455. 10.1016/j.cardiores.2007.02.016View ArticlePubMedGoogle Scholar
- Xu X, Bieda M, Jin VX, Rabinovich A, Oberley MJ, Green R, Farnham PJ: A comprehensive ChIP chip analysis of E2F1, E2F4, and E2F6 in normal and tumor cells reveals interchangeable roles of E2F family members. Genome Res 2007. gr.6783507Google Scholar
- Heintzman ND, Stuart RK, Hon G, Fu Y, Ching CW, Hawkins RD, Barrera LO, Van Calcar S, Qu C, Ching KA, et al.: Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat Genet 2007, 39(3):311–318. 10.1038/ng1966View ArticlePubMedGoogle Scholar
- Hallikas O, Palin K, Sinjushina N, Rautiainen R, Partanen J, Ukkonen E, Taipale J: Genome-wide prediction of mammalian enhancers based on analysis of transcription-factor binding affinity. Cell 2006, 124(1):47–59. 10.1016/j.cell.2005.10.042View ArticlePubMedGoogle Scholar
- Kim TH, Barrera LO, Zheng M, Qu C, Singer MA, Richmond TA, Wu Y, Green RD, Ren B: A high-resolution map of active promoters in the human genome. Nature 2005, 436(7052):876–880. 10.1038/nature03877PubMed CentralView ArticlePubMedGoogle Scholar
- Kim TH, Barrera LO, Ren B: ChIP-chip for Genome-Wide Analysis of Protein Binding in Mammalian Cells. Current Protocols in Molecular Biology 2007, 79: 21–21.Google Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.