Hierarchical decomposition of dynamically evolving regulatory networks

Background Gene regulatory networks describe the interplay between genes and their products. These networks control almost every biological activity in the cell through interactions. The hierarchy of genes in these networks as defined by their interactions gives important insights into how these functions are governed. Accurately determining the hierarchy of genes is however a computationally difficult problem. This problem is further complicated by the fact that an intrinsic characteristic of regulatory networks is that the wiring of interactions can change over time. Determining how the hierarchy in the gene regulatory networks changes with dynamically evolving network topology remains to be an unsolved challenge. Results In this study, we develop a new method, named D-HIDEN (Dynamic-HIerarchical DEcomposition of Networks) to find the hierarchy of the genes in dynamically evolving gene regulatory network topologies. Unlike earlier methods, which recompute the hierarchy from scratch when the network topology changes, our method adapts the hierarchy based on the wiring of the interactions only for the nodes which have the potential to move in the hierarchy. Conclusions We compare D-HIDEN to five currently available hierarchical decomposition methods on synthetic and real gene regulatory networks. Our experiments demonstrate that D-HIDEN significantly outperforms existing methods in running time, accuracy, or both. Furthermore, our method is robust against dynamic changes in hierarchy. Our experiments on human gene regulatory networks suggest that our method may be used to reconstruct hierarchy in gene regulatory networks.

the highest possible hierarchy level. Figure 1 illustrates this concept. Figure 1(a) shows a hypothetical network. Figure 1(b) illustrates the same network after the nodes are assigned to three hierarchy levels.
Correct assignment of the hierarchy levels to the genes in the regulatory networks is essential for comprehensive analysis of the gene regulatory networks, particularly for understanding the impact of external perturbations and diseases on the transcription and abundance of genes and gene products. While the experimental approaches can be used to identify the genes and their interactions, finding the hierarchy between these genes remains to be a computational task.
Some of the key approaches for finding the hierarchy of genes in regulatory networks computationally include BFS [11], Vertex-Sort [12], HINO [13], HIDEN, and Divide and Conquer HIDEN (DC-HIDEN) [14]. The BFS method uses breadth-first search technique to assign hierarchies to genes in a network [11]. Although this method is applicable to any gene regulatory network, it fails to correctly assign hierarchy levels for networks that contain cycles. Vertex-Sort uses a topological sort algorithm to assign hierarchies to genes in the given network [12]. This method does not assign a single hierarchy level for each gene; it instead assigns a range of possible levels for genes. HINO improves the BFS method by introducing two correction steps, called "downgrade" and "upgrade". Downgrade step places each node at the lowest level where that vertex and its regulating nodes are assigned to. Upgrade step assigns nodes to the next higher level if they regulate other nodes which are located at the same level and no other node regulates them. Despite this improvement, its accuracy remains very low. HIDEN [14] formulates the problem using integer linear programming. In comparison to BFS, Vertex-Sort and HINO, HIDEN produces more accurate hierarchy assignments. However, this comes at the price of significantly increased computational cost. As a result, HIDEN does not scale to large gene regulatory networks. To remedy this shortcoming, the same study, proposes a divide and conquer strategy, named DC-HIDEN. This strategy computes a set of local solutions for randomly selected subnetworks of the given network and then combines them. As a result of this localization, this method scales to large networks, but its accuracy levels are lower in comparison to that of HIDEN. In summary, all of these aforementioned approaches are either grossly inaccurate or take enormous amount of time as the network size grows. As we explain next, such enormous cost leads to another challenge that makes these methods impractical.
An inherent characteristic of gene regulatory networks is that their topologies are dynamic. New interactions can appear or existing interactions can become infeasible over a period of time [21]. For example, many genes are involved in regulatory networks that control differentiation in different cell types during early development. The same network after nodes are assigned to three hierarchy levels with optimal hierarchical assignment. Dashed arrows indicate conflicting edges. There are two conflicting edges. Thus the penalty of this decomposition is 2. (c) The network obtained by mutating the network in (a) after removing edge (2,4) and inserting edge (4,3). (d) The optimal hierarchical decomposition of the network in (c) into three hierarchy levels.
The interactions between these genes vary throughout development and in different cell types. One reaction that exists at one time point may be lost at a later time point, or a reaction that exists in one cell type may be missing in another cell type. These dynamic changes in the regulatory interactions reflect the changes in the needs of the system and are key for correct development of the organism. After the organism develops into its adult form, cell differentiation is minimal and most of the regulatory interactions that are involved in these regulatory pathways are lost. However, the dynamic rewiring of the gene regulatory networks for other cellular functions is still needed to efficiently respond to environmental changes [22][23][24]. Clearly, such topological changes in the regulatory networks can alter the position of a subset of genes in the hierarchical decomposition, yet we do not know which genes will change or how they will change their hierarchies [15][16][17][18][19][20]. Determining such topological changes is key for understanding how gene regulatory networks function, how the development works for different organisms and finding cures for diseases such as cancer.
To the best of our knowledge, all the methods in the literature targeted at computing the hierarchical decomposition of biological networks ignore the dynamic nature of the network topologies. As a result, they cannot handle dynamic changes in the gene regulatory networks. An obvious way to apply these methods on dynamic network topologies is to recompute the hierarchical decomposition from scratch each time the network topology changes, even when the change is miniscule. This is, however, not practical as finding the hierarchical decomposition is a computationally expensive task. New methods that can quickly update the hierarchical decomposition from the existing hierarchy when network topology is updated slightly are needed.
Our contributions In this paper, we develop a new method, namely Dynamic Hierarchical Decomposition of Regulatory Networks (D-HIDEN) which finds the hierarchy in gene regulatory networks. Unlike existing methods, D-HIDEN can handle networks whose topologies change dynamically. The idea behind D-HIDEN is that small changes in the topology of a network alters the hierarchies of only a small number of nodes of that networks. When the network topology changes, instead of recomputing the hierarchy from scratch for the entire network, D-HIDEN computes the hierarchy levels of a small part of the underlying network that is most likely to move in the hierarchy. We formulate this problem as an integer linear programming problem. The challenge here is to predict which subnetwork yields the variables in the resulting mixed integer linear programming. D-HIDEN tackles this problem by learning a function that describes the probability that a given gene in the network will move in the hierarchy, based on a given set of alterations in the network topology (i.e., insertions and deletion of interactions). Our extensive experimental results on both synthetic and real networks demonstrate that D-HIDEN achieves much better performance (accuracy and running time) than five currently available methods. Applications of our method to the human gene regulatory networks could successfully reconstruct the hierarchy in these networks. Our analysis on the human gene regulatory networks evolving due to cell differentiation reveals changes in the hierarchy of the regulatory roles of genes.
The implementation of the methods we developed in this paper and the datasets we used in our experiments are available at http://bioinformatics.cise.ufl.edu/dhiden. The rest of the paper is organized as follows. In the Methods section, we formally define the problem and describe the D-HIDEN algorithm. In the Results section, we present the results of our method. In the Discussion and Conclusion section we summarize our findings.

Methods
In this section, we first present the key terms that are essential to describe our method. We then explain our method in detail.

Terms and definitions
A gene regulatory network describes the regulatory interactions between genes and their products. Mathematically, we model a gene regulatory network using a directed graph denoted with G = (V , E). In this notation, V denotes the set of nodes where each node corresponds to a unique gene. E ⊆ V × V denotes the set of directed edges where each edge corresponds to an interaction. In the rest of this paper, we will use the term graph to denote directed network unless otherwise stated explicitly. We start by defining the inversion operation on graphs and how we use it to enrich a graph.

Definition 1 (Inversion of a graph). Given a graph
Definition 2 (Direction enriched graph). Consider a graph G = (V , E) and its inverse G i = (V , E i ). We say We say that a graph G = (V , E) is connected if its direction enriched graph contains at least one path from all the nodes of V to all the other nodes of V . In the rest of this paper, unless otherwise specified, we assume G to be directed and connected. It is worth noting that our method does not rely on this assumption. We only make this assumption to simplify the method description. We defer the discussion on disconnected graphs to the end of this section. Verbally, the hierarchical decomposition assigns a level, which is described by an integer, to the nodes of the graph. In this hierarchy, larger numbers indicate a higher level in the hierarchy, and thus, M is the highest level of hierarchy. The Figure 1(b) shows an example of hierarchical decomposition where M = 3.
The hierarchical decomposition of a graph describes the relative role of the genes as the regulators or the regulated ones in that graph. Ideally, if a gene denoted by node u regulates another gene denoted by node v (i.e., there is a directed edge from u to v in the corresponding graph), then we would like to place node u at a higher level in the hierarchy than node v. The following two definitions capture this. For instance, in Figure 1(b), the edge (4, 5) is a conflicting edge because H(4) = H (5). Similarly, the edge (5, 3) is also conflicting as H(5) < H(3).

Definition 5 (Penalty of hierarchical decomposition).
Consider a graph G = (V , E), and its hierarchical decomposition, H : The penalty of H on the graph G is the number of conflicting edges in G based on the hierarchy function H. Formally, it is computed as: For instance, in Figure 1(b), penalty of the given hierarchy is two since there are two conflicting edges. We say that a hierarchical decomposition of a given graph into M hierarchical levels is optimal if it yields the least penalty among all possible hierarchical decompositions.
Definition 6 (Dynamically evolving graphs). Assume that we are provided with a positive integer τ . Also assume that we are given sequences of graphs We say that G is a sequence of dynamically evolving graphs with respect to τ if both of the following two criteria are satisfied for consecutive graphs G i−1 and G i in G, ∀i(1 < i ≤ K), In Definition 6 above, the first condition enforces that the set of nodes should be preserved (we will relax this constraint later in this section). The second condition requires that only a limited amount of edges can be altered (i.e., inserted or removed) from one graph to obtain the next graph in the sequence.
When the threshold τ in Definition 6 is small, instead of considering the K graphs in the dynamically evolving sequence G as K independent graphs, we view them as single graph whose topology is gradually changing from G 1 to G 2 , then to G 3 , and so on until we reach G K .
Notice that as the graph topology evolves as provided in the sequence G, the hierarchy level assigned to the nodes of the graph by the optimal hierarchical decomposition can also change. This is because as new edges are inserted or existing ones are removed, the role of the gene (i.e., node) incident to those edges as the regulator of other genes can change. For instance, in Figure 1(a), if we remove edge (2, 4) and insert edge (4, 3), we obtain the graph in Figure 1(c). Figure 1(d) presents the optimal hierarchical decomposition of the resulting graph. Notice that the hierarchical decomposition in Figure 1(d) differs from that in Figure 1(b). Following from this observation, we define two sets of nodes based on the graphs corresponding to two consecutive graphs of a given dynamically evolving graphs next.

Definition 7 (Dynamic and complementary nodes).
Consider two consecutive graphs G i−1 and G i in a sequence of dynamically evolving graphs. Let us denote their optimal hierarchical decomposition with H i−1 and H i respectively.
We define the subset of nodes whose hierarchy levels change as G i−1 evolves to G i as the dynamic node set and denote it with D i . Formally We define the nodes which are not in D i but have direct connections to at least one of the nodes in D i as the complementary set and denote it with C i . Formally We are now ready to define the problem considered in this paper formally.
Formal problem definition Given a sequence of dynamically evolving graphs G = (G 1 , G 2 , ..., G K ), our aim is to find an optimal hierarchy assignment for all graphs in G. In other words, for each graph G i in G, we would like to find H i such that The naive solution to the problem above is to compute the hierarchical decomposition for each graph G i in G independently using existing algorithms, such as HIDEN [14]. This is however not practical as the size of the graph (number of nodes and edges) and the number of graphs in G increases, the cost of finding the optimal hierarchical decomposition grows rapidly.

Our algorithm: Dynamic-HIDEN
Here, we describe our method, named Dynamic-HIDEN (D-HIDEN), which computes the hierarchial decompositions of the graphs in a given sequence of dynamically evolving graphs G = (G 1 , . . . , G K ). The central idea behind our method is our conjecture that if the topology of the underlying graph does not change significantly from one graph in the sequence to the next, then their hierarchical decomposition also does not change dramatically. Following from this conjecture, our method computes the hierarchical decomposition of the first graph G 1 similar to the HIDEN method. To compute the hierarchical decomposition for the remaining graphs in G, say G i with 1 < i ≤ K, it exploits two kinds of information: the part of the topology that is different between consecutive graphs G i−1 and G i into consideration as well as the hierarchical decomposition of G i−1 denoted with H i−1 . Figure 2 illustrates this idea. We elaborate on our algorithm next.

Integer linear programming formulation of hierarchical decomposition
D-HIDEN computes the hierarchical decomposition of each graph in a sequence of dynamically evolving graphs G using Integer Linear Programming (ILP). Unlike, existing methods that also use ILP, it creates variables only for a small set of nodes and edges. As we will explain later in detail, these variables correspond to the nodes that will change their position in the hierarchical decomposition with a high probability. We present the ILP formulation next by focusing on the ith graph in the sequence denoted by graph Suppose that we are given the hierarchy assignment H i−1 for the graph G i−1 , and our aim is to compute a new hierarchy assignment H i for the graph G i . We start by constructing an induced subgraph S i = (V s i , E s i ) using the dynamic and complementary node sets D i and C i as follows: Figure 3 illustrates construction of subgraph S i . Notice that knowing the sets V s i and E s i requires knowing the hierarchical decomposition H i of G i , which is actually the problem we are trying to solve here. We discuss how we predict the sets D i and C i to construct the graph S i prior to computing H i later in this section. We continue explaining the ILP formulation.  We keep the hierarchy assignments of all the nodes in V i − D i the same as the previous graph's hierarchy assign- We only compute the hierarchy assignments for the nodes in D i by solving the following optimization problem.
Minimize : Where: M is the maximum number of hierarchy levels This formulation minimizes the number of conflicting edges, where uv can take either 0 (if (u, v) is not a con- is not a conflicting edge), the condition holds no matter uv takes 0 or 1. But in order to minimize (u,v)∈E s i uv , the uv will take 0. The second constraint requires uv to be either 0 or 1. The third constraint assigns nodes in D i to one of the levels from 1 to M. The last constraint says that level assignment for nodes in C i will be taken from H i−1 .
The problem above has |D i | + |E s i | variables. Since the topologies of G i−1 and G i are highly similar (see Definition 6), it is expected that their corresponding assignments H i−1 and H i also share high similarity. Thus, we expect that the cardinality of the dynamic set D i to be very small as compared to that of V i . In other words, very small number of nodes are likely to change their hierarchy assignments as graphs evolve gradually. As a result, the number of variables in our ILP formulation is dramatically smaller than the standard ILP solution, such as HIDEN.

Dealing with node insertion and deletion So far we have formulated our D-HIDEN algorithm for solving a sequence of dynamic graphs
. According to Definition 6 however, the set of nodes may also slightly change as graph evolves. That is, new nodes can be inserted or existing nodes can be removed in a graph in G to obtain the next graph in the sequence. We solve this problem as follows. Consider two consecutive graphs G i−1 and G i in G such that V i−1 = V i . To compute the hierarchy of G i , we do not need to define any variable for the nodes that are removed from V i−1 (i.e. the nodes in V i−1 − V i ). Since we do not know the initial assignment of newly introduced nodes (i.e., nodes in V i − V i−1 ), we include all of these new nodes in the dynamic node set while creating the ILP constraints.
Dealing with disconnected graphs Another assumption we have previously made is that the graph is connected. For disjointed graphs, we apply D-HIDEN on each of the individual components to obtain hierarchical decompositions. The rationale behind this strategy is the following: as there is no edge between any pair of disjointed components, the decomposition is optimal if and only if the decompositions of these components are optimal.
So, what is the challenge? The ILP formulation we provide above is brief and varies slightly from the traditional ILP solutions for the same problem. So, the obvious question is: What is the challenge here?
The crux of our algorithm lies in creating the dynamic set of nodes D i , which will be detailed in this section.
Recall that this set can only be precisely known after the decomposition H i is actually computed. This information is however not known at this stage. It is actually what is computed at the end of this stage. To deal with this cyclic dependency, we predict the set D i prior to computing H i . Let us denote our predicted set with D p i . The performance and the accuracy of our algorithm depends on how well D p i represents D i . Thus, the central challenge lies in predicting the dynamic set precisely. In the following, we describe how we address this challenge.
OUR PREDICTION STRATEGY IN A NUTSHELL. In order to predict whether a node will change its hierarchy level as the graph evolves from G i−1 to G i , we learn a probabilistic model that predicts whether a given node will change its hierarchy level based on its neighboring nodes. We do this by learning from a large collection of graphs with random topological perturbations. The rationale behind this local modelling strategy is that the penalty arising from the hierarchy level of a node is determined by its neighboring nodes. This is because the penalty associated with this node involves conflicting edges between itself and its neighbors. We explain our method in detail next. DETAILED ALGORITHM. Consider two consecutive graphs G i−1 and G i in a sequence of dynamically evolving graphs, with H i−1 given. We start by defining the following sets which partition the nodes of the evolving graph into three classes based on their movements in the hierarchy.
Conceptually, φ down , φ same and φ up contain the set of nodes which move down, stay at the same level, or move up in the hierarchy respectively. Next, we define six more sets describing the six possible relationships between neighboring nodes based on their relative positions in the hierarchy. Figure 4 illustrates the relationships denoted with ψ 1 , ψ 2 , ψ 3 , ψ 4 , ψ 5 , ψ 6 .
Using the sets above, we define two more sets: Next, we define a function f : This function tells the probability that for a given node v, the probability that this node moves down (v ∈ φ down ), stays at the same level (v ∈ φ same ), or moves up (v ∈ φ up ). Let us denote the set of neighboring nodes of a given node v with In order to compute the function f (v, φ), we first need to define two other functions. The first one, denoted with P ψ (v ∈ φ|u ∈ φ ), is the probability that v ∈ φ on the condition that the neighbor node u ∈ φ and (v, u) ∈ ψ where ψ ∈ . The second one is the prior of the given node v, denoted as f 0 (v, φ).
The prior function f 0 represents the probability of movement in hierarchy for node involved in edge operations as denoted by set φ. We explain how we compute the function P ψ and f 0 in detail later in this section. Using these definitions, we formulate f (v, φ) as The first term in this equation is the normalized average contribution from the neighboring nodes. The second term is the prior of the given node v. The parameter α is a real number in the [ 0, 1] interval. It controls the relative contributions of these two terms. Next, we define a vector x of size |V | × | | to describe the relationship of each node in V with each class in . To simplify our notation, we describe x in doubly indexed form as same, up] and i = 1, . . . , | | We further define a matrix A and vector b such that Using these definitions, we rewrite Equation (1) as We compute the solution x * to Equation (2) as After solving Equation (2), the probabilities of node v to move down, stay the same and move up in the hierarchy are given by x * v,down , x * v,same and x * v,up , respectively. Recall that, in order to evaluate Equation (1), we need to compute the prior function f 0 and the function P ψ that captures the conditional probability of the relative location of a node in the hierarchy conditioned on the hierarchy of its neighboring nodes. In the following, we explain how we compute these two functions in detail.

Computation of the prior function
The computation of f 0 involves two parts. The first part trains a model using a large number of random graphs. The second part predicts the value of f 0 for a given graph.
Algorithm 1 presents the first part. Briefly, for training, the algorithm takes two consecutive graphs G 1 and G 2 in the dynamically evolving sequence along with their hierarchical decompositions H 1 and H 2 as input. It then iterates over all the edges (lines 5-27) and identifies the ones that are not common to the two graphs (lines 6-13). These are the edges that have dynamically changed (i.e., inserted or removed) in the input graphs. For all such edges (u, v), it then counts the number of times we observe that these nodes have moved up, moved down, or stayed at the same level (lines 14-15). Next, it records the same statistics for the neighboring nodes of u and v (lines [16][17][18][19][20][21][22][23][24][25]. Finally, it normalizes these counts with the total number of nodes in each class to find the fraction of nodes that move up, move down, or stay at the same level for each class (lines [28][29][30][31][32][33][34][35]. Algorithm 2 presents the second part. This algorithm takes the two consecutive graphs G 1 and G 2 along with the hierarchical decompositions H 1 of G 1 , and the model M learned in Algorithm 1. It starts by initializing the prior function, so that all the nodes stay at the same level with 100% probability (lines 5-7). This defines the base case, where there is no change in the graph topology. It then iterates over all the edges (lines 9-31) and identifies the ones that are not common to the two graphs (lines [11][12][13][14][15][16][17]. continue 13: end if 14: src[E,MOVEMENT-OF-NODE(H 1 (u), H 2 (u))] ++ (see Table 1) 15: dst[E,MOVEMENT-OF-NODE(H 1 (v), H 2 (v))]++ 16: for each neighbor node w of u do 17: Table 1) 19: src_neighbor[E,M,P]++ 20: end for 21: for each neighbor node w of v do for each m ∈ M do 23: for each neighbor node w of u do 24: end for 26: for each neighbor node w of v do 27: 28: end for 29: end for 30: end for 31: end for 32: for i = 1 to |V | do 33: f 0 (i, :) = f 0 (i, :)/sum(f 0 (i, :)) 34: end for 35: Return f 0 conditioned on each other. Algorithm 3 shows a pseudocode that describes how we learn this function.
The algorithm takes a graph G 1 , its hierarchical decomposition H 1 , and a potentially large integer N denoting the number of random hierarchy perturbations (i.e., number of times, the hierarchy of nodes will be altered) to be used for training as input. It iteratively alters the position of a node N times (line 8). At each iteration, it randomly picks a node and moves it to a different level in the hierarchy (lines 9-10). After this alteration, it recomputes the hierarchy of the rest of the nodes by fixing the location of this node at the newly selected location (line 11). We denote this new hierarchy with the function H 2 . It then records the movement of the randomly selected node as well as those of all of its neighbors by comparing H 1 and H 2 (lines [12][13][14][15][16][17][18][19][20][21][22][23][24]. It records the number of times each of the relative movement class is observed over all these neighbors (lines [25][26]. Finally, it returns the distribution of the fraction of times each class is observed as the function P ψ . Fix node v at level h, and solve the ILP problem, resulting in hierarchy H 2 . 12: if h < H 1 (v) then 13: m 2 ="move down" 14: else 15: m 2 ="move up" 16: end if 17: for each neighbor node w of v do 18: if H 2 (w) < H 1 (w) then 19: ψ =NEIGHBOR-RELATIONSHIP(v, w, E 1 , H 1 ) (see Table 2) 26: J(m 1 , m 2 , ψ)++ 27: end for 28: end for 29: Return JOINT-PROBABILITY-NORMALIZATION(J) (see Table 2)

NEIGHBOR-RELATIONSHIP(v, w, E, H) JOINT-PROBABILITY-NORMALIZATION(J)
Inputs: Joint probability J M = {"move down","stay same","move up"} for φ ∈ {"move down","move up"} do for ψ = 1 to 6 do count end for end for end for for ψ = 1 to 6 do P ψ ("move down"|"stay same") = 0 P ψ ("stay same"|"stay same") = 1 P ψ ("move down"|"stay same") = 0 end for Return P ψ How do we choose the dynamic node set? So far, we have described how we compute the probability distribution for each node in G i−1 to move up, down, or remain at the same level in the hierarchy based on the topological differences between G i−1 and G i . Recall that the aim of D-HIDEN is to recompute the hierarchy levels for only a small set of nodes (called dynamic node set) that can move in the hierarchy rather than recompute it for the entire graph. The final question we need to answer to complete the description of our method is how we construct the dynamic node set. To find the node in this set, we first sort all the nodes in G i in ascending order of their probability to stay unchanged in the hierarchy. More specifically, we sort all v ∈ V i in increasing order of x * v,same . The reason behind this ranking strategy is that nodes with higher ranks have higher probability to change their hierarchy levels, and are more preferable to be a part of the dynamic node set.

Results
In this section, we evaluate the performance of the D-HIDEN method extensively on both synthetic and real datasets. We compare the D-HIDEN method to five currently available methods that can only deal with a static network topology: BFS, Vertex-Sort, HINO, HIDEN, and DC-HIDEN. To the best of our knowledge, our method is the first one to address the hierarchical decomposition problem for dynamically evolving networks. To use them for a sequence of dynamically evolving networks G = (G 1 , G 2 , . . . , G K ), we run each of these competing methods independently on each G i in G. We compute the performance of these methods in terms of the resulting penalty and the running time.

Datasets
We use both synthetically generated and real gene regulatory network datasets in our experiments.

SYNTHETIC DATA SETS
We randomly generate scale-free gene regulatory networks following the Barabasi-Albert model [10] for varying number of nodes and densities (i.e., number of edges per node). Using this model, we create the first network, G 1 , in the sequence of dynamically evolving networks for each sequence, G = (G 1 , G 2 , . . . , G K ). Let us denote each network G i in this sequence with G i = (V i , E i ). In order to construct the subsequent networks G i (1 < i ≤ K) in each sequence, we mutated the topology of G i−1 using the degree preserving edge shuffling method [25]. This technique is frequently used in the literature to alter network topology while ensuring that all nodes maintain their degrees. Briefly, each mutation step of this technique randomly picks two edges, such that the edges do not share a node. It then swaps the end points of those edges. For instance, assume that the two randomly selected edges are (u 1 , v 1 ) and (u 2 , v 2 ). The mutation step replaces these two edges with (u 1 , v 2 ) and (u 2 , v 1 ). Given a mutation rate r ≥ 0, we perform r×|E i−1 | 2 mutation steps on G i−1 to generate G i . We use r = 0.1 as the mutation rate in our experiments (i.e. at most 10% of the edges are swapped) unless otherwise specified.

REAL DATA SETS
We use the human gene regulatory network dataset of Neph et al. 2012 [21]. This dataset contains the regulatory information of the transcription factors for 41 different cell and tissue types. The number of nodes in these networks vary from 493 to 533, and the number of interactions range from 16,461 to 17,320. Among these 41 networks, we select four dynamically evolving sequences, each containing three cell types that are known to follow each other throughout the development stages. The first one contains embryonic stem cells, hematopoietic stem cells and erythroid (K562) cells. The second contains embryonic stem cells, hematopoietic stem cells and Th1 T-Lymphocyte. The third contains embryonic stem cells, hematopoietic stem cells and B-Lymphocyte (CD20+). The last one contains embryonic stem cells, skeletal myoblast and skeletal muscle cells. Figure 5 illustrates these four sequences.

Quality measures used
We used various quantifiable measures to evaluate the success and the limitations of our method and the competing methods.
• We report the number of conflicting edges (see Definitions 4 and 5) as the penalty of a given hierarchical decomposition. Smaller values of this measure indicates better results. • Recall that our method aims to accelerate the hierarchical decomposition in dynamically evolving networks by predicting the nodes that are likely to move in the hierarchy. Thus, after the first network in the sequence, errors in prediction may increase the penalty of the resulting decomposition. We report the accuracy of our method in terms of how it compares to the exhaustive method, HIDEN, when HIDEN uses all the nodes and edges as variables for all networks in the sequence. Let us denote the penalties of the methods HIDEN and D-HIDEN with penalty(HIDEN) and penalty(D-HIDEN) respectively. We compute the accuracy as 1 + penalty(HIDEN) 1 + penalty (D-HIDEN) .
Accuracy takes a value in the (0, 1] interval. Larger values indicate better results. It is worth mentioning that the mixed integer linear programming solution of the hierarchical decomposition problem can have time complexity exponential in the number of variables used. Therefore, the true optimal result can only be computed for very small networks which are not considered in this paper.

Implementation and sytem details
We implemented the D-HIDEN, HIDEN, DC-HIDEN, BFS, Vertex-Sort and HINO algorithms using MATLAB. We conducted all the experiments on a hexa core 4.5-GHz CPU (Intel 3960x) 64 GB-memory computer, running the Linux operating system.

Evaluation of large scale networks
We first compare our method against the state-of-theart methods in the literature, namely BFS [11], Vertex-Sort [12], HINO [13] and DC-HIDEN [14] for large scale networks. We omit the HIDEN method in this experiment as it does not scale to networks with over 100 nodes.
To observe the performance trend as network size varies, we run experiments on different network sizes, ranging from 100 to 1000. For each network size, we randomly create 50 initial networks with edge density 2. Then, we mutate these networks with mutation rate 0.1 to get the corresponding mutated networks in the sequence of dynamically evolving networks. In order to apply our method, we first compute the initial hierarchical decomposition of the first network at each dynamically evolving sequence using the DC-HIDEN method. We then compare the performance of hierarchical decomposition on the mutated networks. Note that for each sequence of networks, the initial decomposition needs to be computed only once. We set the size of dynamic node set for our method (see Definition 7) to 50 for all networks. To ensure that the results are reliable, we repeat this process to create 50 such mutated networks for each network size and report their average penalty and running time ( Figure 6). D-HIDEN yields significantly less penalty than all the competing methods for all network sizes (Figure 6(a)). Thus, it is the most accurate among all competing methods. Furthermore, the gap between the accuracy of D-HIDEN and the rest of the methods grow consistently as network size increases. These results are highly encouraging especially given the fact that D-HIDEN considers updating the hierarchy levels of only a small subset of nodes while all the other methods consider updating all the nodes.
To determine the computational cost of these methods next we compare the running times of the two most accurate methods, D-HIDEN and DC-HIDEN, as the remaining methods have extremely low accuracies. Figure 6(b) shows that D-HIDEN is orders of magnitude faster than DC-HIDEN. As the network size grows, DC-HIDEN's running time increases exponentially while D-HIDEN's running time increases linearly. As the network size grows to 1000 nodes, the penalty of the hierarchical decomposition of our method is 15% less than DC-HIDEN and it is also about 100 times faster. These results suggest that as the network size increases, our method becomes even more advantageous.

Evaluation of D-HIDEN under various network characteristics
So far in our experiments, we have demonstrated the superiority of D-HIDEN against other scalable methods.
Here, we focus further on our method and compare it against HIDEN. The HIDEN method is exhaustive as it minimizes the penalty while allowing all node and edges to be variables. On the other hand, D-HIDEN creates variables for only a small fraction of the nodes and edges. Therefore, HIDEN yields the smallest possible penalty among all the hierarchical assignments (using the ILP method). Thus, the purpose of the next set of experiments is to determine how close our solutions are to those of HIDEN. We use synthetically generated scale-free networks of different characteristics in these experiments. Due to explosive increase in the running time of the HIDEN method, we limit our experiments to the largest possible network sizes and densities we could run HIDEN on our system. We focus on three key parameters which affects the performance of hierarchical decomposition in this experiment: number of nodes, network density, and maximum number of hierarchy levels allowed. While evaluating each of these parameters, we fix all the others (Table 3). For each parameter setting, we first randomly generate an initial network. We then mutate that network with mutation rate 0.1 to get subsequent evolving networks. We compute the hierarchy assignment of the first network in each sequence using HIDEN. For the remaining networks in the sequence, we use D-HIDEN to compute the hierarchy incrementally. In order to observe the impact of the size of the dynamic node set, we vary the size of this set from 10% to 80% of the total number of nodes in the given network. Notice that smaller sizes for this set yield fewer variables in the ILP formulation of D-HIDEN. Inversely, large sizes imply conservative predictions for D-HIDEN allowing for more nodes to alter their hierarchy levels. We also run HIDEN on each network. For each parameter setting, we repeat this experiment 500 times and measure the average penalties and running times.

Impact of network density
First, we explore the impact of network density. We fix the number of nodes to 50 and the number of allowable hierarchy levels to 5. We experiment for densities 1,2 and 3. Note that for larger densities, HIDEN did not run till completion due to exponential increase in the running time.  Each row corresponds to one experiment where we fix the value of two parameters and vary one.
For high density networks the accuracy of D-HIDEN is very high even if we use a small fraction of nodes as the dynamic node set (Figure 7(a)). For low density networks, although the accuracy is low for very small dynamic node set, it improves quickly as the size of dynamic node set increases. The running time of our method is significantly faster to that of HIDEN for dense networks (Figure 7 (b)). Especially for small dynamic node sets, the speed up of our method over HIDEN becomes more dramatic. For example at density equal to 3, when the size of the dynamic node set is 40% of the node set, our method is almost two orders of magnitude faster than HIDEN. These results are highly encouraging since our method can obtain high accuracies using small dynamic node sets.
Collectively these results suggest that our method is highly preferable, especially when the network has medium to high density.

Impact of number of nodes
Next, we consider the impact of number of nodes. We fix the density to 2 and the number of allowable hierarchy levels to 5. We experiment for the number of nodes 30,50 and 70. D-HIDEN can achieve a very high accuracy regardless of the network size (Figure 8(a)). For example, when the dynamic node set includes only 20% of nodes, D-HIDEN can achieve 90% accuracy. This suggests that our method can be applied to large networks without incurring noticeable decrease in accuracy. Figure 8(b) reflects that as the number of nodes increases, the running time of HIDEN increases dramatically while that of D-HIDEN grows gradually. Interestingly, as the size of the dynamic node set increases, the advantage in the running time of D-HIDEN does not noticeably decrease initially (e.g. ratio from 0.1 to 0.5), which allows us to set a higher ratio for D-HIDEN to achieve better accuracy without too much additional cost in running time. In summary, these results suggest that our method is favorable, especially when the size of networks gets larger.

Impact of the number of allowed hierarchy levels
Finally, we focus on the impact of maximum hierarchy level. We fix the density to 2 and the number of nodes to 50. We experiment for the number of allowed hierarchy levels 3,5 and 10. We observe a very high accuracy of D-HIDEN (Figure 9(a)). As the number of allowed hierarchy levels decreases, the accuracy increases consistently, because the number of nodes that need to change their hierarchy levels also decreases. Based on these results, we can conclude that D-HIDEN consistently achieves extremely high accuracy, especially for small number of allowed hierarchy levels. Figure 9(b) suggests that the running time is relatively insensitive to the number of allowed hierarchy levels.

Evaluation of the impact of the number of dynamic steps
So far, we have analyzed the performance of our method for only one dynamic step. In other words, in the sequence of dynamically evolving networks G = (G 1 , . . . , G K ), K was set to 2. In this experiment, we evaluate how the performance of our method is affected by the length of this sequence (i.e. K). The purpose is to explore how the error propagates as the deviation between the topology of the subsequent networks from that of the initial network grows with increasing value of K.

Evaluation on large networks
In this experiment, we generate synthetic network with 500 nodes and density equal to 2 to use as the initial network G 1 in the sequence. We then create 25 subsequent networks G 1 , G 2 , . . . , G 25 . To construct each G i (1 < i ≤ 25), we mutate G i−1 with 0.1 mutation rate. We repeat this process 1000 times to construct 1000 such independent sequences of dynamically evolving networks. Because these networks are too large for the HIDEN method, we compare our method to the most accurate competing method (DC-HIDEN) which can solve networks of this size.
We observe that as the number of mutation step increases, the advantage of our method over the DC-HIDEN enlarges (Figure 10(a)). Initially at mutation step 1, our method has about 20% less penalty than the DC-HIDEN, and when the mutation step reaches 24, our method has 68% less penalty than the DC-HIDEN. This reveals that the accuracy of our method relative to that of DC-HIDEN gets better on continuously changing networks with growing number of networks in the sequence. Though the initial hierarchy assignments of base networks were inaccurate (as they were computed with DC-HIDEN), at each mutation step our method has an opportunity to identify the nodes whose hierarchy levels should be updated, and sets them as variables instead of  constants. Thus, it corrects the hierarchy of new set of nodes each dynamic step.

Evaluation on small networks
Next, we use networks with a small number of nodes to compare the performance of our method against the exhaustive HIDEN method. More specifically, we construct networks with 50 nodes and density equal to 2. Similar to the previous experiment, we generate sequences of dynamically evolving networks with 25 networks in each sequence, and use 0.1 mutation rate while altering the topology of consecutive networks. We repeat this process 1000 times and report the average results.
We observe that as the number of mutation steps increase, the advantage of HIDEN over our method enlarges (Figure 10 (b)). This reveals that HIDEN is intrinsically better than our method in terms of accuracy only (note that again the computational cost of HIDEN is exponentially higher than our method as it exhausitively considers all nodes in all networks. Thus, for networks with over 100 nodes, it becomes intractable). The reason behind this trend is also similar to the previous experiment but this time the initial assignment of base networks were computed by HIDEN. Interestingly the curve converges quickly around step 10, where the penalty of our method is only about 25% higher than HIDEN. This is very promising, because even after mutating the initial network 24 times with 10% mutation at each step, the error incurred by our method does not grow significantly. This suggests that our method does not diverge from the optimal solution greatly even for a large number of dynamic steps.

Results on real data
To assess the applicability of our method to real biological regulatory networks, in this section we analyze four dynamically evolving cell lineages each containing three cell types. In the first three cell lineages embryonic stem cells are developed into hematopoietic stem cells, and the hematopoietic stem cells are developed into erythroid, Tlymphocyte and B-lymphocyte respectively. In the fourth cell lineage, embryonic stem cells develop into skeletal myoblast and eventually to skeletal muscle.

Penalty comparison on real data
In the above sections, we have demonstrated the superiority of D-HIDEN against the state of the art methods BFS, Vertex-Sort, HINO and DC-HIDEN on synthetic datasets. To show the applicability of our method to real gene regulatory networks, here we compare the accuracy of the D-HIDEN method to these four methods on a real dataset (described above). In this comparison, we focus on the gene regulatory networks of the seven cell types in our dataset: hESC, CD34+, HSMM, K562, Th1, CD20+ and SKMC. To apply our method; we first find the initial hierarchical decomposition of the hESC gene regulatory network (first network in the chain) using DC-HIDEN method and then determine the hierarchy assignments of the genes in the rest of the cells using D-HIDEN. In the final step, we compute the penalties for each celltype based on the resulting hierarchy assignments. We apply the other methods as follows; we first find the hierarchical decomposition of all the networks in different cell types using the specific method and then compute the penalties for each network based on the resulting assignments.
Our results demonstrate that D-HIDEN yields significantly less penalty than all the competing methods ( Figure 11). Thus it is the most accurate among the competing methods on gene regulatory networks of the seven cell types. We observe that the penalty score of the HINO, Vertex Sort and BFS methods are usually at least 3 fold more than the D-HIDEN method. Similarly, D-HIDEN method is more accurate than the DC-HIDEN method. The superior accuracy of D-HIDEN method on real datasets strengthens our analysis on the synthetic datasets, and suggests that our method can be used to create highly accurate hierarchical decomposition of real datasets.

Biological derivations on real data
Here, we apply our method to four dynamically evolving cell linages described above. The experimental results reveal the changing hierarchy of gene regulation in these four lineages of embryonic stem cells (Tables 4 and 5). Similar trends can be observed for each cell lineage. In the lineage of embryonic stem cells (H7-hESC) that develop into erythroid cells (K562), most genes (146 genes) are expressed at level five the highest level in the genetic hierarchy at the embryonic stem cell stage (Table 4). At the erythroid stage, most genes (156 genes) are at level one, which is the lowest level in the hierarchy. In addition, the highest number of genes (533 genes) is observed at the embryonic stem cell stage and this number gradually decreases in each of the following two developmental stages. These two observations are also valid for the cell lineages that develop into T-lymphocyte (Th1) and Blymphocyte (CD20+) cells. In the cell lineage that develop into skeletal muscle cells (SKMC) the former trend is also observed but the latter is not, in that the number of genes observed at the final stage, skeletal muscle cell, is greater than the number observed in the intermediate, skeletal myoblast (HSMM) stage. It is important to emphasize that as these stem cells develop, more genes move down in the hierarchy (Table 5) in comparison to moving up, which suggests that in general the role of genes as key regulators become less important since they regulate fewer downstream target genes. We observe that most of the genes do not change their hierarchy level, which suggests that the network rewiring is substantial, but it does not involve most of the genes. It is also interesting to note that genes do not fluctuate in their rankings very often. The changes in the hierarchy levels of all of these genes should be analyzed in detail to understand their changing roles in different cell lines.
For example the MYOG gene moves up in the hierarchy for skeletal muscle cell lineage. Myogenin (MYOG) is the second muscle-specific gene to be expressed in embryonic stem cells in the process of muscle cell development [26]. In a study in myoblasts Brunetti and Goldfine found MYOG to be important in muscle cell development [27]. Mice with a mutation in MYOG could develop myoblasts but the myoblasts could not further form functional myofibers; i.e. MYOG is important for later stages of muscle cell differentiation [28]. In addition, Hasty et al. 1993 found that when MYOG is knocked out in mice embryos, the embryos develop but die after birth [29]. These results are consistent with this gene moving up in the genetic hierarchy, as it becomes important later in the muscle cell development. One gene that moves down in the muscle cell lineage is MEF2C because of its importance in early myoblast development. In embryonic stem cells, this gene is used as a marker of cardiac cells and its expression is found to be increased with greater Figure 11 Comparison of the penalties arising from the hierarchical decomposition obtained by five methods on real datasets. In the figure, x-axis represents the seven cell types and y-axis represents the penalty score.  [30]. In myoblasts, this gene is an important indicator of early muscle cell differentiation activity because of its importance in regulating skeletal muscle differentiation [31,32]. Taken together, genes that move up or down in the hierarchy seem to have a change in their function and relative importance. The literature supports this conclusion, although more research should be performed to assess the exact functions of these genes in these specific cell types at different stages in cell development.
STAT5 is another gene that moves up the hierarchy in the erythroid cell lineage. When active STAT5 is expressed in embryonic stem cells, the cells are more likely to undergo formation of hematopoietic stem cells (CD34+) [33]. STAT5 is suggested to be important for cell proliferation in hematopoietic stem cells. Overexpression of STAT5 in hematopoietic stem cells resulted in greater cell growth of these cells through the PI3kinase/AKT pathway [34]. Furthermore, if STAT5 is over expressed in hematopoietic stem cells (CD34+) there is increased renewal of these cells and the hematopoietic stem cells are much more likely to acquire erythroid cell fate [35]. Other researchers have found that in erythroid cells STAT5 is in its active, phosphorylated state and when STAT5 expression is knocked out, these cells exhibit reduced growth in vitro [36]. One gene that moves down in the erythroid cell lineage is GFI1. In embryonic stem cells, GFI1 and GFI1B are downstream targets of RUNX signaling and when RUNX is knocked out, GFI1 along with GFI1B promote hematopoiesis as in the studies with RUNX-/-, the cells display increased expression of the hematopoietic cell marker CD41 as well as a rounded shape [37]. This study suggests that in normal embryonic stem cell development, RUNX expression causes GFI1 expression to be reduced. Since GFI1 may prevent proliferation of hematopoietic stem cells [38], and GFI1 is a downstream target of C/EBPα, which prevents cell proliferation when GFI1 levels are low not all cells will become hematopoietic [39]. Thus the roles for the STAT5 and GFI1 genes are changing through development and it is likely that the hierarchy levels of these genes are consequently changing in the gene regulatory network. Our analysis suggests that the STAT5 gene moves up in hierarchy, and GFI1 moves down in the hierarchy. The literature cited above is not clear about the relative importance of these genes in different cell types. So, our method's predictions suggest potentially new experiments to elucidate the changing role of these genes in different cell types. Clearly, a thorough experimental study is needed to find the relative position of these genes in the regulatory hierarchy in different cell types.

Discussion and conclusion
Biological systems are tightly controlled, and more importantly, they are affected by dynamic changes in the gene regulatory networks. Because of that, understanding the dynamic changes in gene regulatory networks is critical for a true comprehension of biological systems. One particularly important organizational feature of the gene regulatory networks is the hierarchy of the interactions. Hierarchy in gene regulatory networks describes the flow of control in biological systems. Major experimental and computational efforts are performed to establish the gene regulatory networks; however finding hierarchy in these networks is very challenging. There have been studies to discover the hierarchy in gene regulatory networks, however these studies mainly focused on static networks. In this study, we presented a novel method named D-HIDEN for discovering hierarchy in dynamic gene regulatory networks. To the best of our knowledge, this is the first method which can incrementally update the hierarchy in a sequence of dynamically evolving networks. D-HIDEN formulates the hierarchy level assignment problem as a local mixed integer linear programming problem. We compared the D-HIDEN method to five currently available methods, namely BFS, Vertex-Sort, HINO, HIDEN and DC-HIDEN, on synthetic and real gene regulatory networks. Our analysis on these networks demonstrated that the D-HIDEN method outperforms the other five methods in terms of minimizing the conflicting edges in hierarchy. In addition to the superior accuracy level of D-HIDEN methods, the running time for this method was also faster than the DC-HIDEN method, which is the next best method in terms of accuracy. Application of the D-HIDEN method to the human gene regulatory networks shows that human genes' hierarchy levels change dynamically. These changes in the hierarchy levels are not random and reflect functional changes in the network. This method could be applied to other biological datasets such as in cancer biology or neuronal development.