Figure 2 | BMC Bioinformatics

Figure 2

From: BRNI: Modular analysis of transcriptional regulatory programs

Flow of the integrated analysis. (a-d) Learning a biochemically based regulation model. The input for model learning consists of transcription rates derived from mRNA levels (a). A biochemical model of TF binding and dissociation (b) is used to describe the transcription rate of a target gene. The binding and dissociation kinetics of each transcription factor (orange and green ovals) to the target gene promoter (left panel) are governed by affinity parameters (γ1 and γ2, respectively). These kinetics result in a distribution of promoter states within the cell population (middle panel). Each promoter state is associated with a distinct transcription rate (αa through αd, right panel). These regulation functions are used within a probabilistic graphical model (c), where the observed transcription rates of a target gene (G, blue oval) are explained using the hidden active protein levels of the regulators (R1 and R2, pink ovals). In practice we learn a modular model (d), where the genes belonging to a single module (square nodes) share the same set of affinity and transcription rate parameters {γ, α}. The model topology describes which regulators control each of the modules, and which genes are members of each module. In addition, the regulator activity profiles (right) and all kinetic parameters are inferred.

(e) An ensemble learning approach. From the original set of genes (G, barrel), m subsets (G1 through Gm) are randomly sampled, each containing some fraction (e.g. 80%) of the genes. A modular regulation model is learned for each subset as in (d). The resulting ensemble of models is integrated into a unified consensus model (Methods). First, regulators are mapped between different runs based on the similarity of their time profiles (e.g. red profiles on right panel). Next, core gene modules are defined based on sets of genes that frequently co-occur in the same module.

(f) Learning a motif-based regulation model. Subsets of genes are defined either by members of a module, or by targets of a regulator in the unified model. The promoters of these gene subsets are searched for novel cis-regulatory motifs using four different algorithms. The resulting redundant collection of motifs is clustered and merged to generate a non-redundant library of motifs. The promoters of all genes are then scanned against this library, and enrichments of gene sets for particular motifs are computed.
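
To make the regulation function of panel (b) concrete, the following is a minimal sketch of how a transcription rate could be computed as a weighted average over promoter states. It is not the authors' implementation. It assumes that each transcription factor binds independently and at equilibrium, so that the probability of factor i being bound is γi·Ri / (1 + γi·Ri), where Ri is the active regulator level. The caption does not state this functional form. The function names and all numeric values are hypothetical and chosen only for illustration.

```python
from itertools import product


def promoter_state_probs(gammas, activities):
    """Return the probability of every bound/unbound promoter state.

    ASSUMPTION (not stated in the caption): each TF binds independently
    and at equilibrium, with P(bound) = gamma * R / (1 + gamma * R).
    A state is a tuple of 0/1 flags, one per TF (1 = bound).
    """
    p_bound = [g * r / (1.0 + g * r) for g, r in zip(gammas, activities)]
    probs = {}
    for state in product((0, 1), repeat=len(p_bound)):
        p = 1.0
        for is_bound, pb in zip(state, p_bound):
            p *= pb if is_bound else (1.0 - pb)
        probs[state] = p
    return probs


def transcription_rate(gammas, activities, alphas):
    """Average the per-state transcription rates over the state distribution."""
    probs = promoter_state_probs(gammas, activities)
    return sum(p * alphas[state] for state, p in probs.items())


# Hypothetical parameters for two regulators (illustrative values only).
gammas = [1.0, 1.0]          # affinity parameters, gamma1 and gamma2
activities = [1.0, 1.0]      # hidden active protein levels, R1 and R2
alphas = {                   # one rate per promoter state, as in alpha_a..alpha_d
    (0, 0): 0.1,             # promoter unbound
    (1, 0): 1.0,             # only TF 1 bound
    (0, 1): 0.5,             # only TF 2 bound
    (1, 1): 2.0,             # both TFs bound
}

rate = transcription_rate(gammas, activities, alphas)
```

With two regulators there are four promoter states, matching the four rates αa through αd in the caption. In a modular model, all genes in a module would share the same `gammas` and `alphas`, so only the regulator activities vary across time points.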
