A flood-based information flow analysis and network minimization method for gene regulatory networks
 Andreas Pavlogiannis^{1},
 Vadim Mozhayskiy^{1, 2} and
 Ilias Tagkopoulos^{1, 2}
DOI: 10.1186/1471-2105-14-137
© Pavlogiannis et al.; licensee BioMed Central Ltd. 2013
Received: 7 June 2012
Accepted: 19 March 2013
Published: 24 April 2013
Abstract
Background
Biological networks tend to have high interconnectivity, complex topologies and multiple types of interactions, which makes it difficult to identify the subnetworks involved in condition-specific responses. In addition, we generally lack scalable methods that can reveal the information flow in gene regulatory and biochemical pathways. Such methods would help us to identify key participants and paths under a specific environmental and cellular context.
Results
This paper introduces the theory of network flooding, which aims to address the problem of network minimization and regulatory information flow in gene regulatory networks. Given a regulatory biological network, a set of source (input) nodes and optionally a set of sink (output) nodes, our task is to find (a) the minimal subnetwork that encodes the regulatory program involving all input and output nodes and (b) the information flow from the source to the sink nodes of the network. Here, we describe a novel, scalable, network traversal algorithm and we assess its potential to achieve significant network size reduction in both synthetic and E. coli networks. Scalability and sensitivity analysis show that the proposed method scales well with the size of the network, and is robust to noise and missing data.
Conclusions
The method of network flooding proves to be a useful, practical approach to information flow analysis in gene regulatory networks. Further extension of the proposed theory has the potential to lead to a unifying framework for simultaneous network minimization and information flow analysis across various “omics” levels.
Keywords
Network flood, Network flux, Information flow, Gene regulatory networks, Network minimization
Background
In the last decade we have witnessed an explosion of biological data available for all branches of the Tree of Life. Significant advances in the biotechnological and computational realm have enabled new ways of acquiring, representing, analyzing, and integrating many heterogeneous and seemingly disparate sources of data. This has led to the development of numerous databases that contain validated and putative associations between DNA, proteins or metabolites, and of various inference algorithms for network reconstruction. Still, in many cases, we lack the methods to extract from this plethora of data the information that is specific to a cellular mechanism, a biological behavior, or a complex phenotype. The need for such methods will become increasingly obvious, given the projected accumulation of data in the coming years.
Here, we address the problem of network minimization and regulatory information flow in gene regulatory networks. Given a gene regulatory network, a set of source (input) nodes and a set of sink (output) nodes, our task is to find (a) the minimal subnetwork that encodes the regulatory program involving all input and output nodes and (b) the information flow from the source to the sink nodes of the network. If no sink nodes are specified, the task reduces to identifying the underlying pathways and nodes that receive the information propagated from the source node(s). In this context, source nodes can be thought of as cellular components that are sensitive to environmental fluctuations and that can propagate this information to their downstream targets. In bacteria, this set includes intracellular and transmembrane receptors that participate in complex behaviors such as chemotaxis and quorum sensing, transcription factors and response-specific proteins that are (in)activated by external environmental stimuli, and sigma factors that initiate system-wide regulatory responses to environmental changes such as heat shock and nutrient limitation. In vertebrates, this list expands further to include proteins involved in signal propagation in the nervous system, tissues, and organs. Similarly, sink nodes represent downstream targets of interest. Examples include enzymes that participate in metabolic reactions and proteins that are responsible for complex traits, such as stress-response proteins, motility genes, and genes involved in aerobic respiration.
The theory of network floods that we introduce here is a fundamental extension of network flow theory to networks where (a) interactions can be negative and (b) flow is replicated instead of conserved, as is the case in regulatory networks. Network flow theory [1-3] has traditionally been applied in other disciplines, including multiprocessor scheduling [4], transportation [5], and sociology [6]. Despite the early availability of efficient methods in the field [7], network flow theory has not been applied in this biological context. It cannot be applied directly to biological systems in general, or to gene regulatory systems in particular, because the interactions and the network properties of biological networks differ in nature from those of the networks in other applications. One of the most striking differences is that network flow is not conserved at each node; in other words, the sum of all incoming flows is not equal to the sum of all outgoing flows. In fact, most regulatory networks exhibit flow replication, where the sum of the incoming flow is replicated in each of the outgoing interactions. This network characteristic captures the process of transcriptional activation of a gene that itself has multiple downstream targets. In that case, each outgoing regulatory edge from that node can carry activating or inhibitory information that does not have to conform to any flow conservation rule. As an example of this flow replication property, imagine a transcriptional regulator (e.g. the AraC activator protein) that can bind to two distinct promoters in the genome, which drive two different genes. In that case, upregulation of the araC gene will result in the simultaneous upregulation of the downstream targets (the two distinct genes). In this typical case there is no conservation rule of any sort; for example, the number of AraC copies does not have to be equal to the number of copies of the two genes.
Furthermore, given saturating levels of regulatory protein, the effect of the regulation will be the simultaneous upregulation of all downstream target genes and thus the replication of the information flow to each distinct path. The vast majority of signal transduction network analysis methods focus on topological features [8], such as motifs and binary interactions between nodes in the network [9]. Other approaches use Boolean theory to infer hidden regulatory pathways [10] or compute the minimal set of nodes that can perform signal transduction independently [11]. Although these methods provide valuable insight, they cannot capture the quantitative relationships between nodes that are critical for elucidating the network dynamics, and in which the weights of the individual edges play a critical role. In a recent study, the information flow of acyclic, activation-only, hierarchical networks was studied using continuous expression models [12]. Other relevant prior work includes the application of elementary modes in signaling and regulatory networks for functional analysis [13], shortest path algorithms for biological interaction paths [14, 15], application of Petri net based analysis to signal transduction pathways [16], partitioning biological data with transitivity clustering [17, 18], and measuring information flow through random walks while ignoring inhibitory links [19]. In contrast to methods that mostly target clustering or motif finding in biological data, network flooding can elucidate the regulatory information flow while taking into account regulation weight and sign, an important challenge in systems biology [9, 20], and can perform hypothesis-specific network minimization towards transforming data and networks into knowledge. Although network-based approaches have been developed in the past, mainly for metabolic networks [21-23], they are not suitable for cases where both positive and negative regulation is present and flow conservation does not hold.
Note that network flooding is very different from a simple superposition of the negative and positive regulatory weights for each node, as it takes into account the amount of regulatory information in the preceding paths. For instance, nodes 5 and 6 have exactly the same regulatory influences (both direction and strength). However, while node 6 will be activated and propagate the signal to downstream nodes, node 5 will be inactive, due to the large information flow from node 1 to node 2 (compared to the small information flow from node 1 to node 4), which in this case has an inhibitory effect. The outcome of this process is thus highly dependent on both the network topology and the corresponding weights. The problem becomes more challenging in the presence of loops and cycles, where simple network traversals (such as depth-first or breadth-first search) are not applicable. In addition, network flooding is fundamentally different from network flows, as it introduces negative regulatory interactions and does not guarantee conservation of flow at each node, as would be the case in metabolic processes. The latter is a necessary addition to realistically capture regulatory interactions, as regulatory information is usually replicated when it passes through a node with multiple outputs.
Methods
In this section, we define the problem of gene regulatory network (GRN) minimization and we introduce the network flooding method along with an algorithmic implementation. Before we describe the method of network flooding, we first define some terms that we will use in its description.
Definitions
Flood networks and network floods
First, we define the capacity c(u, v) of the edge (u, v) between nodes u and v as the maximum amount of information that can pass from node u to node v. A flood network G = (V, E) is a directed graph in which every edge (u, v) ∈ E with u, v ∈ V has a nonzero capacity c(u, v) ∈ R, and c(u, v) = 0 when (u, v) ∉ E (i.e. zero capacity where no edge exists between two nodes). In the case of gene regulatory networks, the nodes represent genes and the edges establish the regulatory interactions between them. The capacity of each edge represents the weight and direction of each regulatory interaction, and can be either positive or negative. We distinguish a source node s ∈ V, and for simplicity we assume that a path exists from s to all other nodes in G.
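As a minimal sketch of this definition, a flood network can be stored as a mapping from directed edges to signed capacities. The class name `FloodNetwork`, its methods, and the toy capacities below are illustrative assumptions and do not come from the authors' implementation.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]


class FloodNetwork:
    """Directed graph with signed, nonzero edge capacities (illustrative helper)."""

    def __init__(self, capacities: Dict[Edge, float]) -> None:
        for edge, c in capacities.items():
            if c == 0:
                raise ValueError(f"edge {edge} must have a nonzero capacity")
        self.capacities = dict(capacities)
        self.nodes: Set[str] = {n for edge in capacities for n in edge}

    def capacity(self, u: str, v: str) -> float:
        # c(u, v) = 0 whenever (u, v) is not an edge of the network.
        return self.capacities.get((u, v), 0.0)

    def successors(self, u: str) -> List[str]:
        return sorted(v for (a, v) in self.capacities if a == u)

    def reachable_from(self, s: str) -> Set[str]:
        # Nodes that can be reached from s by following directed edges.
        seen, stack = {s}, [s]
        while stack:
            for v in self.successors(stack.pop()):
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        return seen


# Toy network (made-up weights): s activates a and b, a inhibits c, b activates c.
net = FloodNetwork({("s", "a"): 1.0, ("s", "b"): 0.5,
                    ("a", "c"): -0.75, ("b", "c"): 0.25})
print(net.capacity("a", "c"), net.capacity("c", "a"))
print(net.reachable_from("s") == net.nodes)
```

The final check corresponds to the simplifying assumption that a path exists from the source s to every other node.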
Since feedback and high interconnectivity are common in gene regulatory networks specifically, and in biological networks in general, we have to devise a method that accounts for all possible walks in a network, which are not necessarily simple. For this reason we “unwind” or “traverse” all the walks of the network that are essential, a notion that is formally defined as follows:
Essential walk: Given a network G = (V, E), a walk P = (x_{1}, x_{2}, …, x_{n}) on G with (x_{i}, x_{i+1}) ∈ E is essential if, between any two successive appearances x_{i}, x_{j} of any node v ∈ V, there exists at least one node that does not appear in the walk P′ = (x_{1}, x_{2}, …, x_{i}).
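The definition can be checked mechanically. The sketch below tests whether a walk is essential and enumerates, breadth-first, the essential walks from a source that cannot be extended any further without violating the definition. The function names and the toy adjacency list are hypothetical and serve only as an illustration.

```python
from collections import deque
from typing import Dict, List, Sequence, Tuple


def is_essential(walk: Sequence[str]) -> bool:
    """True if every revisit of a node is separated from the previous visit
    by at least one node that is new with respect to the walk up to that visit."""
    last_seen: Dict[str, int] = {}
    for j, node in enumerate(walk):
        if node in last_seen:
            i = last_seen[node]
            prefix = set(walk[: i + 1])           # P' = (x_1, ..., x_i)
            between = walk[i + 1 : j]             # nodes strictly between the two visits
            if not any(x not in prefix for x in between):
                return False
        last_seen[node] = j
    return True


def maximal_essential_walks(adj: Dict[str, List[str]], source: str) -> List[Tuple[str, ...]]:
    """Breadth-first enumeration of essential walks that admit no essential extension."""
    finished: List[Tuple[str, ...]] = []
    queue = deque([(source,)])
    while queue:
        walk = queue.popleft()
        extensions = [walk + (v,) for v in adj.get(walk[-1], [])
                      if is_essential(walk + (v,))]
        if extensions:
            queue.extend(extensions)
        else:
            finished.append(walk)
    return finished


# Toy graph with a two-node feedback loop a <-> b and a dead end c.
adj = {"s": ["a", "c"], "a": ["b"], "b": ["a"]}
print(is_essential(("s", "a", "b", "a")))        # the loop introduces the new node b
print(is_essential(("s", "a", "b", "a", "b")))   # second lap adds nothing new
print(maximal_essential_walks(adj, "s"))
```

The enumeration terminates because every repeated visit of a node has to be paid for by a node that appears for the first time, and a finite graph has only finitely many such first appearances.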
An essential walk allows multiple appearances of some nodes, as long as each cycle introduces at least one new node to the essential walk. An essential walk is called saturated if its expansion through any node will make it non-essential. An essential traversal of G, starting from the source node, is the set P^{s} of all saturated walks starting from the source node s, generated in a breadth-first manner. For a given edge (u, v) we denote as P^{s}_{(u,v)} = {(s, …, u, v, …)} the subset of all saturated walks within the essential traversal that include the edge (u, v). Note that the essential traversal captures feedback loops of arbitrary size. Given the above, the network flood f(u, v) of an edge (u, v) corresponds to the amount of information that flows through that edge, and it can be calculated by any function f: E → R, provided that it is subject to the following three constraints:
Capacity Constraint: For all edges (u, v) we require that |f(u, v)| ≤ |c(u, v)|, i.e. the magnitude of the flood cannot exceed the magnitude of the edge capacity.
Polarity Constraint: For all edges (u, v) we require that f(u, v)c(u, v) ≥ 0.
Essential Walk Constraint: We denote with f(P) the flood f(u, v) to an edge (u, v) that is carried through a walk P = (s, …, u, v), which starts from the source node s. The following inductive construction of a set of essential walks D defines the essential walk constraint on f(u, v), for all edges (u, v). Base case: Initially D = ∪_{(s,u) ∈ E} {(s, u)}, and we require that f((s, u)) = c(s, u). Inductive step: Let P be a walk in D, and P′ its expansion through an edge (u, v). The essential walk constraint states that the network flood in edge (u, v) is given by f(u, v) = f(P′) = max(0, min(c(u, v), ∑_{Q ∈ D : Q = (s, …, w, u)} f(Q))), after which D is updated to D ∪ {P′}.
The capacity constraint restricts the magnitude of the flood that can run through an edge. The polarity constraint guarantees that the running flood has the same sign as the edge capacity, to preserve its regulatory function. Finally, the essential walk constraint requires that each non-saturated walk P entering a node u carries out of u the same flood that it brings into u, and that the magnitude of the total flood in any edge (u, v) is determined by the algebraic sum of the flood carried by the set $P^{s}_{(u,v)} \subseteq Q$. In general, the function itself can take any form (e.g. sigmoid, polynomial), similar to the kernel functions in classification, although here we define its magnitude to be the linear sum of all incoming floods from essential walks, with the edge capacity value being its upper bound (capacity constraint), and its sign to be the same as that of the edge capacity (polarity constraint). The above definitions can be naturally extended to networks with more than one distinguished source node.
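A small numerical sketch may help to see how the three constraints interact on a single edge. It follows the verbal description above: the magnitude is the linear sum of the incoming floods, bounded below by zero and above by the magnitude of the capacity, and the sign is that of the capacity. Using the absolute value of the capacity as the bound is an assumption made here so that inhibitory edges are handled; the function names and numbers are hypothetical.

```python
from typing import Iterable


def edge_flood(capacity: float, incoming: Iterable[float]) -> float:
    """Flood on an edge (u, v), given the floods that walks bring into u."""
    total = sum(incoming)                              # algebraic sum of incoming floods
    magnitude = max(0.0, min(abs(capacity), total))    # capacity bound (assumed on |c|)
    return magnitude if capacity > 0 else -magnitude   # polarity follows the edge sign


def satisfies_capacity(flood: float, capacity: float) -> bool:
    return abs(flood) <= abs(capacity)


def satisfies_polarity(flood: float, capacity: float) -> bool:
    return flood * capacity >= 0


# Node u receives an activating flood of 0.75 and an inhibitory flood of -0.25.
incoming = [0.75, -0.25]
print(edge_flood(1.0, incoming))            # activating edge, not saturated
print(edge_flood(-0.25, incoming))          # inhibitory edge, saturated at its capacity
print(edge_flood(1.0, [0.25, -0.75]))       # net inhibition of u: the edge stays inactive
```

Because the same incoming sum is used for every outgoing edge of u, the sketch also illustrates flow replication: the two outgoing edges above do not share the incoming flood, each receives it in full.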
Environmental signals and source nodes: Let S = {s_{1}, s_{2}, …} be a set of continuous variables that encode environmental signals (e.g. heat, pressure, light, chemicals), each of which can be sensed by the organism through a nonempty set of nodes $\{u_{s_i} \in V\}$, namely the receptor nodes for the corresponding environmental signal. An environmental signal s_{i} serves as a source node in a flood network, and is defined as “active” when it has a positive value. An edge or a node is defined as “active” when it has a nonzero flood. Floods can be positive or negative, which corresponds to activation or inhibition of the target node (i.e. downstream gene), respectively. Depending on which environmental signals are active, different pathways in the regulatory network are activated as a response to the current environmental state.
Network minimization problem
Regulatory network minimization through flood analysis
Network transformation (phase one)
 1. Introduction of the signal nodes. Every signal in S_{act} is mapped to a signal node s ∈ V. The set of all signal nodes also serves as the set of distinguished nodes S ⊂ V.
 2. An additional basal node b ∈ V is introduced. This node captures the basal (i.e. base) expression of a gene, which depends on the "leakiness" of the upstream promoter and the concentration of the transcription factors that are bound to the promoter and regulate the gene's expression.
 3. To capture the saturation of the expression of a node v, a saturation gadget is introduced. For every node v in the original GRN, two nodes v_{in}, v_{out} are introduced together with an edge (v_{in}, v_{out}), the capacity of which, c(v_{in}, v_{out}), is a positive number that limits the maximum flood through the node.
 4. Introduction of the signal and basal regulatory links. For every receptor node v in the original GRN that is regulated by an external signal s, the edge (s, v_{out}) is added to E. The capacity c(s, v_{out}) is set equal to the corresponding signal regulation weight w(s, v_{out}) from signal s to node v. Moreover, when information on the basal expression level of a node v is available, an edge (b, v_{out}) is added, with the capacity c(b, v_{out}) set equal to the basal expression of v, b(v).
The basal node b serves as an additional source node that is always active.
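The transformation steps above can be sketched as a function that maps a signed GRN onto the edge set of a flood network. Two points are assumptions rather than statements of the text: a regulatory edge (u, v) of the original GRN is wired as (u_out, v_in), and nodes without a known saturation level receive a hypothetical default. Signal and basal links are attached to v_out, as described in step 4. All names are illustrative.

```python
from typing import Dict, Optional, Tuple

Edge = Tuple[str, str]


def to_flood_network(
    grn_edges: Dict[Edge, float],          # {(regulator, target): signed weight}
    saturation: Dict[str, float],          # {gene: positive saturation level}
    signal_links: Dict[Edge, float],       # {(signal, receptor gene): weight}
    basal: Optional[Dict[str, float]] = None,   # {gene: basal expression}
    default_saturation: float = 1.0,       # hypothetical fallback value
) -> Dict[Edge, float]:
    flood: Dict[Edge, float] = {}
    genes = {n for edge in grn_edges for n in edge} | set(saturation)

    # Step 3: saturation gadget, one (v_in, v_out) edge per gene.
    for v in sorted(genes):
        limit = saturation.get(v, default_saturation)
        if limit <= 0:
            raise ValueError(f"saturation of {v} must be positive")
        flood[(f"{v}_in", f"{v}_out")] = limit

    # Assumed wiring of the original regulatory interactions.
    for (u, v), weight in grn_edges.items():
        flood[(f"{u}_out", f"{v}_in")] = weight

    # Step 4: signal links and basal links, attached to v_out as in the text.
    for (signal, v), weight in signal_links.items():
        flood[(signal, f"{v}_out")] = weight
    for v, level in (basal or {}).items():
        flood[("basal", f"{v}_out")] = level
    return flood


flood_edges = to_flood_network(
    grn_edges={("g1", "g2"): -0.5},
    saturation={"g1": 1.0, "g2": 2.0},
    signal_links={("heat", "g1"): 0.8},
    basal={"g2": 0.1},
)
for edge, capacity in sorted(flood_edges.items()):
    print(edge, capacity)
```

Reversing the transformation after minimization then amounts to collapsing each surviving (v_in, v_out) pair back into the single gene v.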
Network flooding (phase two)
In this step we calculate all network floods in the transformed flood network by applying an essential traversal to it. Algorithm 1 computes the floods in a flood network G = (V, E) with a set of distinct source nodes S. The process starts from the nodes in S, and is based on repeated expansions of essential walks until they become saturated. Each walk P carries some flood, and upon its expansion through an edge (u, v) the following take place: (a) the magnitude of the flood to be stored in that edge (flood_{total}) is determined by adding the flood carried by P to the existing flood on the edge, (b) any excess of the incoming flood caused by the saturation limit is stored in a matrix (Excess), and (c) the flood change (flood_{delta}) is then propagated along the walk, after being polarized by the sign of the edge. The time complexity of the essential traversal is highly dependent on the network topology; while in the worst case it can scale exponentially with the number of nodes, scalability analysis shows that our method scales well as the graph size increases.
Algorithm 1
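The following is a simplified sketch of steps (a) to (c) described above, written from the prose description only; it should not be read as a reproduction of Algorithm 1. It assumes that each source emits a fixed amount of flood, that the capacity bound applies to the magnitude of the flood, and that walks whose flood change is zero are dropped. All function and variable names are hypothetical.

```python
from collections import deque
from typing import Dict, Sequence, Tuple

Edge = Tuple[str, str]


def is_essential(walk: Sequence[str]) -> bool:
    # Each revisit of a node must be preceded by a node that is new to the walk.
    last_seen: Dict[str, int] = {}
    for j, node in enumerate(walk):
        if node in last_seen:
            i = last_seen[node]
            prefix = set(walk[: i + 1])
            if not any(x not in prefix for x in walk[i + 1 : j]):
                return False
        last_seen[node] = j
    return True


def flood(capacity: Dict[Edge, float], sources: Dict[str, float]):
    """Return (signed flood per edge, excess per edge) after an essential traversal."""
    adj: Dict[str, list] = {}
    for (u, v) in capacity:
        adj.setdefault(u, []).append(v)
    raw = {e: 0.0 for e in capacity}       # unclamped sum of floods brought to each edge
    stored = {e: 0.0 for e in capacity}    # clamped flood magnitude on each edge
    excess = {e: 0.0 for e in capacity}
    queue = deque(((s,), amount) for s, amount in sources.items())
    while queue:
        walk, carried = queue.popleft()
        u = walk[-1]
        for v in sorted(adj.get(u, [])):
            new_walk = walk + (v,)
            if not is_essential(new_walk):
                continue                   # the walk is saturated in this direction
            e, limit = (u, v), abs(capacity[(u, v)])
            raw[e] += carried                              # (a) flood_total
            magnitude = min(limit, max(0.0, raw[e]))
            excess[e] = max(0.0, raw[e] - limit)           # (b) excess over saturation
            delta = magnitude - stored[e]                  # (c) flood_delta
            stored[e] = magnitude
            if delta != 0.0:
                polarized = delta if capacity[e] > 0 else -delta
                queue.append((new_walk, polarized))
    floods = {e: (m if capacity[e] > 0 else -m) for e, m in stored.items()}
    return floods, excess


# Toy network: c receives a strong inhibitory and a weak activating flood.
capacity = {("s", "a"): 1.0, ("s", "b"): 0.5, ("a", "c"): -0.75,
            ("b", "c"): 0.25, ("c", "d"): 1.0}
floods, excess = flood(capacity, {"s": 1.0})
print(floods)
print(excess)
```

In this toy run the inhibitory flood reaching c through a outweighs the activating flood through b, so the edge (c, d) ends with zero flood, in the same spirit as the inactive node discussed in the Background.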
Threshold imposition and reverse network transformation
Now that we have calculated all floods, we can impose a lower bound on the flood magnitude that we will allow. Conservatively, this value is zero (i.e. even a small amount of information flow is sufficient to add an edge to the minimized network), but any threshold t can be used, so that only edges with flood magnitude greater than t are retained (i.e. |f(u, v)| > t), where negative flood values are also allowed. If a set of output (sink) nodes A is also supplied, this step additionally discards the edges that become disconnected from A once the threshold is imposed. Transformation of the minimized network to its GRN counterpart is achieved by simply reversing the steps performed in phase one. The network flooding method is deterministic for any given threshold.
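A minimal sketch of the threshold step follows, under the assumption that "disconnected from A" means that no sink in A can be reached from the head of the edge within the thresholded network. The function name and the flood values are made up for illustration.

```python
from typing import Dict, Iterable, Optional, Set, Tuple

Edge = Tuple[str, str]


def minimize(floods: Dict[Edge, float], threshold: float = 0.0,
             sinks: Optional[Iterable[str]] = None) -> Dict[Edge, float]:
    """Keep edges with |flood| > threshold; optionally keep only edges leading to a sink."""
    kept = {e: f for e, f in floods.items() if abs(f) > threshold}
    if sinks is None:
        return kept

    # Reverse search from the sinks over the edges that survived the threshold.
    predecessors: Dict[str, Set[str]] = {}
    for (u, v) in kept:
        predecessors.setdefault(v, set()).add(u)
    reaches_sink = set(sinks)
    stack = list(reaches_sink)
    while stack:
        v = stack.pop()
        for u in predecessors.get(v, ()):
            if u not in reaches_sink:
                reaches_sink.add(u)
                stack.append(u)
    return {(u, v): f for (u, v), f in kept.items() if v in reaches_sink}


example_floods = {("s", "a"): 1.0, ("a", "c"): -0.75, ("s", "b"): 0.5,
                  ("b", "c"): 0.25, ("c", "d"): 0.0, ("b", "e"): 0.1}
print(minimize(example_floods))                               # drops only the zero-flood edge
print(minimize(example_floods, threshold=0.3, sinks={"c"}))   # keeps the path s -> a -> c
```

Note that the inhibitory edge (a, c) survives the threshold because the comparison is made on the flood magnitude.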
Results
Ideally, performance evaluation of the network flood theory requires complete regulatory networks where all nodes and links are present, together with link directionality and a signed weight. In addition, the quantitative expression model for each node should be known, and the phenotypic change after the network reduction should be measurable, to allow an informative comparison between the original and the minimized network. Since no such dataset currently exists, we first adopted an approach similar to the one used for benchmarking gene network reconstruction algorithms, constructing synthetic datasets of in silico organisms [24]. We used a multiscale microbial evolution simulator to create a complete synthetic dataset with the information mentioned above, in order to comprehensively evaluate the proposed methods. Our results show that our method scales well, is robust to noise and missing data, and does not require full network knowledge. We then evaluated our method with experimental data for the bacterium E. coli, a well-studied model organism. The source code, sample data files and a brief tutorial on the network flooding algorithm are provided in Additional file 1.
Synthetic datasets
We used the EVE (Evolution in Variable Environments) simulator to create a synthetic dataset and applied the network flood algorithm to the resulting networks. The EVE simulator has multiple levels of abstraction that range from molecular species, gene regulatory and biochemical networks, to organisms and environment. Each organism has its own distinct gene regulatory and biochemical network that can be depicted as a directed weighted graph. The network comprises a number of “triplets” (three nodes): Gene/mRNA, Protein, and Modified Protein. The Promoter/Gene/RNA node captures gene regulation and transcription, while the Protein and Modified Protein nodes capture translation and post-translational modification (acetylation, phosphorylation, etc.), respectively. In other words, the triplets capture the “central dogma” of molecular biology. Organisms undergo stochastic evolution, and their gene regulatory and biochemical networks change in size and topology in order to adapt to the synthetic environments. EVE has been used to test the hypothesis of anticipatory behavior in bacteria [25], to investigate Horizontal Gene Transfer dynamics [26], the distribution of fitness effects during evolution [27], and facilitated variation in microbial communities [28], and has been documented elsewhere [29]. Here, 64 populations of 256 organisms each were evolved at low and high mutation rates (lmr/hmr) for 5,000 generations in dynamic environments where the relationship between two environmental signals and the presence of nutrients follows either an AND or an XOR gate (more about the environmental structure can be found in the supplementary online material of [25]). This resulted in a dataset of 47,698 organisms (after the complete set was filtered for organisms of high fitness) evolved in a total of four environments, with complete information on their network connectivity, kinetic parameters, expression, evolutionary trajectory, and fitness.
Mean network minimization and mean running time comparison in synthetic datasets
Dataset | Mean initial size, links | Mean minimization, % of removed links (Flood) | Mean minimization, % of removed links (Best heuristic) | Mean fitness decrease in flood-minimized networks | Mean running time, sec (Flood) | Mean running time, sec (Best heuristic)
AND-lmr | 11.61 | 52.3 | 66.9 | 0.8% | 0.0019 | 11502.0
AND-hmr | 26.57 | 41.1 | 83.3 | 1.0% | 0.0023 | 12147.0
XOR-lmr | 52.20 | 34.0 | 79.7 | 1.9% | 0.0036 | 13794.0
XOR-hmr | 55.68 | 33.9 | 81.1 | 2.0% | 0.0039 | 13986.0
Scalability and sensitivity analysis
Escherichia coli regulatory network
Scenarios used in the flood minimization of E. coli regulatory network
Scenario | Inputs | Reporter gene selection | Genes in a subnetwork | Reporter genes | Total flood in a network with the flood threshold equal to 0.00 | 0.35 | 0.70
Stationary phase | σ^{38} | Stationary phase specific genes under control of σ^{38} (22) | 1,254 | 10 | 185 | 50 | 33
Exponential growth / GO groups | σ^{70}, σ^{54}, σ^{28} | Genes based on GO terms (amino acid synthesis, translation, protein folding, protein modification, glycolysis, tricarboxylic acid cycle) | 1,584 | 168 | 242 | 49 | 48
Exponential growth / expression data | σ^{70}, σ^{54}, σ^{28} | Genes expressed in the exponential phase (microarray expression data (23)) | 1,584 | 54 | 232 | 96 | 40
Heat shock / GO groups | σ^{38}, σ^{54}, σ^{32} | Genes expressed in the exponential phase (microarray expression data (23)) | 1,314 | 13 | 173 | 46 | 33
Transition phase / expression data | σ^{70}, σ^{38} | Genes expressed in the transition from the exponential to the stationary phase (microarray expression data (23)) | 1,595 | 34 | 241 | 102 | 52
Conclusions
In this paper, we have presented the method of network flooding, which aims to minimize regulatory networks in order to capture core regulatory patterns and information flow for specific biological conditions. We introduced a scalable, robust, graph-based algorithmic implementation that can achieve substantial network size reduction without disrupting core regulatory pathways in synthetic datasets. When network flooding was applied to the reconstructed E. coli regulatory network, it was able to reduce its size while producing meaningful (in terms of the biological processes involved) and statistically significant (in terms of differentially expressed genes and GO terms) results. In addition, network topology alone is sufficient for network flooding to operate in the absence of other data sources, and the method copes well with missing information and unknown relationships. There are numerous extensions to our work that can prove useful for biological network analysis and pathway extraction. The presented method can be used to ask questions regarding the maximum information and (regulatory) control that can be achieved by any given node or set of nodes; the importance of a single node or subnetwork as manifested by the amount of information flow that it channels, which is quite a different metric from its connectivity or regulatory strength; and the degree of information multiplexing that can be achieved in various organisms, a possible proxy for regulatory complexity. We intend to apply the method of network flooding to reconstructed mammalian networks, with respect to both the regulation of core mechanisms [9] and miRNA-based regulation [15]. Although we have so far focused mainly on regulatory networks, this work can be extended to protein-protein interaction (PPI), signal transduction and metabolic networks.
This entails extending the network flood theory so that nodes and edges belonging to distinct network types are handled differently, as the associations between nodes usually differ (for example, a link between two nodes in a metabolic network usually signifies conversion, not regulation). Taking into account the high degree of interconnection between multiple scales of biological organization, this extension may lead to a unifying framework for simultaneous network minimization and information flow analysis across various “omics” levels that is more than the sum of its parts.
Abbreviations
GRN: Gene regulatory network
GO: Gene ontology
PPI: Protein-protein interaction
lmr: Low mutation rate
hmr: High mutation rate.
Declarations
Acknowledgements
We would like to thank the members of the Tagkopoulos lab for their feedback and useful discussions. This work was supported by the University of California opportunity fund and NSF award to IT.
Authors’ Affiliations
References
1. Ahuja R, Magnanti T, Orlin J: Network flows: theory, algorithms, and applications. 1993, Upper Saddle River, NJ: Prentice Hall.
2. Ford L: Network flow theory. RAND. 1956, 923: 113.
3. West D: Introduction to graph theory. 2001, Upper Saddle River, NJ: Prentice Hall.
4. Stone HS: Multiprocessor scheduling with aid of network flow algorithms. IEEE Trans Softw Eng. 1977, 3 (1): 85-93.
5. Nemhauser G, Wolsey L: Integer and combinatorial optimization. 1988, New York: Wiley.
6. Borgatti SP: Centrality and network flow. Social Networks. 2005, 27 (1): 55-71. 10.1016/j.socnet.2004.11.008.
7. Edmonds J, Karp RM: Theoretical improvements in algorithmic efficiency for network flow problems. J ACM. 1972, 19 (2): 248-264. 10.1145/321694.321699.
8. Nassiri I, Masoudi-Nejad A, Jalili M, Moeini A: Nonparametric Simulation of Signal Transduction Networks with Semi-Synchronized Update. PLoS One. 2012, 7 (6): e39643. 10.1371/journal.pone.0039643.
9. Ma'ayan A, Jenkins SL, Neves S, Hasseldine A, Grace E, Dubin-Thalere B, Eungdamrong NJ, Weng GZ, Ram PT, Rice JJ, et al: Formation of regulatory patterns during signal propagation in a mammalian cellular network. Science. 2005, 309 (5737): 1078-1083. 10.1126/science.1108876.
10. Saez-Rodriguez J, Simeoni L, Lindquist JA, Hemenway R, Bommhardt U, Arndt B, Haus UU, Weismantel R, Gilles ED, Klamt S, et al: A logical model provides insights into T cell receptor signaling. PLoS Computational Biology. 2007, 3 (8): e163. 10.1371/journal.pcbi.0030163.
11. Wang RS, Albert R: Elementary signaling modes predict the essentiality of signal transduction network components. BMC Syst Biol. 2011, 5 (1): 1-14. 10.1186/1752-0509-5-1.
12. Bloechl F, Wittmann DM, Theis FJ: Effective Parameters Determining the Information Flow in Hierarchical Biological Systems. Bull Math Biol. 2011, 73 (4): 706-725. 10.1007/s11538-010-9604-6.
13. Klamt S, Saez-Rodriguez J, Lindquist JA, Simeoni L, Gilles ED: A methodology for the structural and functional analysis of signaling and regulatory networks. BMC Bioinformatics. 2006, 7: 1-26. 10.1186/1471-2105-7-1.
14. Klamt S, von Kamp A: Computing paths and cycles in biological interaction graphs. BMC Bioinformatics. 2009, 10 (1): 1-11. 10.1186/1471-2105-10-1.
15. Cui Q, Yu Z, Purisima EO, Wang E: Principles of microRNA regulation of a human cellular signaling network. Mol Syst Biol. 2006, 2 (46):
16. Sackmann A, Heiner M, Koch I: Application of Petri net based analysis techniques to signal transduction pathways. BMC Bioinformatics. 2006, 7 (1): 1-17. 10.1186/1471-2105-7-1.
17. Wittkop T, Emig D, Lange S, Rahmann S, Albrecht M, Morris JH, Boecker S, Stoye J, Baumbach J: Partitioning biological data with transitivity clustering. Nature Methods. 2010, 7 (6): 19
18. Boecker S, Briesemeister S, Klau GW: Exact Algorithms for Cluster Editing: Evaluation and Experiments. Algorithmica. 2011, 60 (2): 289-302.
19. Kim YA, Przytycki JH, Wuchty S, Przytycka TM: Modeling information flow in biological networks. Phys Biol. 2011, 8 (3): 035012. 10.1088/1478-3975/8/3/035012.
20. Neves SR, Tsokas P, Sarkar A, Grace EA, Rangamani P, Taubenfeld SM, Alberini CM, Schaff JC, Blitzer RD, Moraru II, et al: Cell shape and negative links in regulatory motifs together control spatial information flow in signaling networks. Cell. 2008, 133 (4): 666-680. 10.1016/j.cell.2008.04.025.
21. Yeger-Lotem E, Riva L, Su LJ, Gitler AD, Cashikar AG, King OD, Auluck PK, Geddie ML, Valastyan JS, Karger DR, et al: Bridging high-throughput genetic and transcriptional data reveals cellular responses to alpha-synuclein toxicity. Nat Genet. 2009, 41 (3): 316-323. 10.1038/ng.337.
22. Dasika MS, Burgard A, Maranas CD: A computational framework for the topological analysis and targeted disruption of signal transduction networks. Biophys J. 2006, 91 (1): 382-398. 10.1529/biophysj.105.069724.
23. Kholodenko B, Yaffe MB, Kolch W: Computational Approaches for Analyzing Information Flow in Biological Networks. Sci Signal. 2012, 5 (220): rel
24. Marbach D, Prill RJ, Schaffter T, Mattiussi C, Floreano D, Stolovitzky G: Revealing strengths and weaknesses of methods for gene network inference. Proc Natl Acad Sci U S A. 2010, 107 (14): 6286-6291. 10.1073/pnas.0913357107.
25. Tagkopoulos I, Liu YC, Tavazoie S: Predictive behavior within microbial genetic networks. Science. 2008, 320 (5881): 1313-1317. 10.1126/science.1154456.
26. Mozhayskiy V, Tagkopoulos I: In Silico Evolution of Multi-Scale Microbial Systems in the Presence of Mobile Genetic Elements and Horizontal Gene Transfer. Lect Notes in Comput Sc. 2011, 6674: 262. 10.1007/978-3-642-21260-4_26.
27. Mozhayskiy V, Tagkopoulos I: Horizontal gene transfer dynamics and distribution of fitness effects during microbial in silico evolution. BMC Bioinformatics. 2012, 13: S10.
28. Mozhayskiy V, Tagkopoulos I: Guided evolution of in silico microbial populations in complex environments accelerates evolutionary rates through a stepwise adaptation. BMC Bioinforma. 2012, 13.
29. Mozhayskiy V, Miller R, Ma KL, Tagkopoulos I: A Scalable Multiscale Framework for Parallel Simulation and Visualization of Microbial Evolution. 2011, Salt Lake City, UT: TeraGrid'11.
30. Keseler IM, Collado-Vides J, Santos-Zavaleta A, Peralta-Gil M, Gama-Castro S, Muniz-Rascado L, Bonavides-Martinez C, Paley S, Krummenacker M, Altman T, et al: EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic Acids Res. 2011, 39: D583-D590. 10.1093/nar/gkq1143.
31. Gama-Castro S, Salgado H, Peralta-Gil M, Santos-Zavaleta A, Muniz-Rascado L, Solano-Lira H, Jimenez-Jacinto V, Weiss V, Garcia-Sotelo JS, Lopez-Fuentes A, et al: RegulonDB version 7.0: transcriptional regulation of Escherichia coli K-12 integrated within genetic sensory response units (Gensor Units). Nucleic Acids Research. 2011, 39: D98-D105. 10.1093/nar/gkq1110.
32. Sharma UK, Chatterji D: Transcriptional switching in Escherichia coli during stress and starvation by modulation of Sigma 70 activity. FEMS Microbiology Reviews. 2010, 34 (5): 646-657.
33. Wei Y, Lee JM, Richmond C, Blattner FR, Rafalski JA, LaRossa RA: High-density microarray-mediated gene expression profiling of Escherichia coli. J Bacteriol. 2001, 183 (2): 545-556. 10.1128/JB.183.2.545-556.2001.
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.