Open Access

Exact reconstruction of gene regulatory networks using compressive sensing

BMC Bioinformatics201415:400

https://doi.org/10.1186/s12859-014-0400-4

Received: 18 April 2014

Accepted: 27 November 2014

Published: 14 December 2014

Abstract

Background

We consider the problem of reconstructing a gene regulatory network structure from limited time series gene expression data, without any a priori knowledge of connectivity. We assume that the network is sparse, meaning the connectivity among genes is much less than full connectivity. We develop a method for network reconstruction based on compressive sensing, which takes advantage of the network’s sparseness.

Results

For the case in which all genes are accessible for measurement, and there is no measurement noise, we show that our method can be used to exactly reconstruct the network. For the more general problem, in which hidden genes exist and all measurements are contaminated by noise, we show that our method leads to reliable reconstruction. In both cases, coherence of the model is used to assess the ability to reconstruct the network and to design new experiments. We demonstrate that it is possible to use the coherence distribution to guide biological experiment design effectively. By collecting a more informative dataset, the proposed method helps reduce the cost of experiments. For each problem, a set of numerical examples is presented.

Conclusions

The method provides a guarantee on how well the inferred graph structure represents the underlying system, reveals deficiencies in the data and model, and suggests experimental directions to remedy the deficiencies.

Keywords

Gene regulatory networks Inference Compressive sensing

Introduction

Mathematical modeling of biological signaling pathways can provide an intuitive understanding of their behavior [1]–[3]. However, since typically only incomplete knowledge of the network structure exists and the system dynamics is known to be sufficiently complex, the challenge has become to show that the identified networks and corresponding mathematical models are enough to adequately represent the underlying system. In the last years, many data-driven mathematical tools have been developed and applied to reconstruct graph representations of gene regulatory networks (GRNs) from data. These include Bayesian networks, regression, correlation, mutual information and system-based approaches [4]–[10]. Also, these approaches either focus on static or on time series data. The latter approach has the advantage of being able to identify dynamic relationships between genes.

However, data-driven reconstruction of the network structure itself remains in general a difficult problem; nonlinearities in the system dynamics and measurement noise make this problem even more challenging. For linear time invariant (LTI) systems, there exist necessary and sufficient conditions for network reconstruction [11]. However, for time-varying or nonlinear systems, there has not been as yet any statistical guarantee on how well the inferred model represents the underlying system [12]–[15]. The recent work [16] addresses the problem of data-driven network reconstruction, together with measurement noise and unmodelled nonlinear dynamics, yet this work points out that these complications impose a limit on the reconstruction, and with strong nonlinear terms the method fails. Additionally, identifying whether important nodes in the graph structure are missing, how many are missing, and where these nodes are in the interconnection structure remains a challenging problem.

Moreover, in order to continue to have an impact in systems biology, identification of the graph topology from data should be able to reveal deficiencies in the model and suggest new experimental directions [17]. For example, Steinke et al. [18] presented a Bayesian method for identifying a gene regulatory network from micro-array measurements in perturbation experiments and showed how to use optimal design to minimize the number of measurements.

Since biological regulatory networks are known to be sparse [19]–[23], meaning that most genes interact with only a small number of genes compared with the total number in the network, many methods [12]–[15],[24]–[27] take advantage of the sparsity. The methods typically use l 1-norm optimization, which leads to a sparse representation of the network and improves the ability to find the actual network structure. Even though many methods [14],[15] show that the reconstruction results are fairly good, the methods cannot guarantee exact recovery. This stems from the fact that the so-called incoherence condition is typically not satisfied for the matrix Ω in a linear measurement model Y=Ω q, where q is the signal to be reconstructed, Y is the measurement, and Ω is known as the sensing matrix. In this paper, we construct Y and Ω from time series protein or gene expression data with candidate basis functions or kinetic features, and q reflects the (unknown) underlying GRN structure to be reconstructed. Roughly, incoherence is a measure of the correlation between columns of the sensing matrix. Since the incoherence condition of the sensing matrix provides a metric of performance, this is one of the motivating factors for the use of compressive sensing (CS) [28] in GRN reconstruction. CS is a signal processing technique for efficiently acquiring and reconstructing a signal by taking advantage of the signal’s sparsity and allowing the entire signal to be determined from relatively few measurements under a certain condition, i.e., it requires that the incoherence condition to be satisfied. In the Human Epidermal Growth Factor Receptor2 (HER2) positive breast cancer signaling pathway that we studied in [29],[30], time series data sets consist of only 8 time point measurements of 20 protein signals, and we would like to use this limited data to identify a graph structure which could have 20×20 or 400 edges.

In this paper, we consider reconstruction of biochemical reaction networks (protein-protein interaction) or GRNs (including mRNA, transcription factors). Biological processes in cells are properly performed by gene regulation, signal transduction and interactions between proteins and thus, the network structure that we consider is an abstraction of the system’s biochemical reaction dynamics, describing the manifold ways in which one substance affects all the others to which it is connected. Thus, we are interested in directed graph representations of signaling pathways and propose models of biochemical reaction networks or GRNs as a set of basis functions or features that best explain a set of time series observations such as protein expression or gene expression time series data. We develop a new algorithm for GRN reconstruction based on CS. First, we focus on sparse graph structures using limited time series data with all nodes accessible and no measurement noise. We test the network reconstruction algorithm on a simulated biological pathway in which the structure is known a priori. We explain the limitation of the proposed method’s performance when the dataset naturally has high coherence and propose a way to overcome this limitation by designing additional effective experiments. Thus, one big part of the whole study is the method for proposing new experiments. As systems biology matures, experimental study design becomes more important in systems biology rather than simply building computational models of experimental data. Thus, the proposed method can be useful for experimental design. By revealing deficiencies in the data and model, the method can suggest experimental directions to remedy the deficiencies. By doing this, we can minimize the number of experiments and reduce the cost of experiments.

Next, the proposed algorithm is extended to a more general problem: we consider partially corrupted data with data inconsistencies due to model mismatch, and measurement noise affecting all the data. Typically, data inconsistencies may result from missing nodes in the model; or in some cases arbitrary data corruption may result from human errors such as mislabeling or the improper use of markers or antibodies. The question is whether one can still recover the graph structure reliably under these conditions. Inspired by a robust error correction method [31], the exact recovery of the graph structure can be guaranteed under suitable conditions on the node dynamics, provided that hidden nodes can affect relatively few nodes in the graph structure. Also, a set of numerical examples is provided to demonstrate the method, including some from an RPPA (Reverse Phase Protein Array) dataset [32] collected from HER2 positive breast cancer cell lines.

In this paper, the main contributions are the following.
  • The CS framework uses the coherence of the sensing matrix as a performance index, which allows us to assess and optimize mathematically network reconstruction.

  • Coherence also provides a guideline for optimizing experiment design for network reconstruction.

  • By utilizing an error correction method in conjunction with the CS framework, network reconstruction may be performed even when there are hidden nodes and measurement noise.

Background

Overview: compressive sensing

Consider measurements Y R m of a signal q R n :
Y = Ω q
(1)

where Ω R m × n is called the sensing matrix.

One key question [33] is how many measurements m are needed to exactly recover the original signal q from Ω:
  • If m>n and Ω is a full rank matrix, then the problem is overdetermined. If m=n and Ω is a full rank matrix, the problem is determined and may be solved uniquely for q.

  • If m<n, the problem is underdetermined even if Ω has full rank. We can restrict q R n to the subspace which satisfies Y=Ω q. However, q cannot be determined uniquely.

For the underdetermined case, the least squares solution q = arg min q q l 2 = Ω ( ΩΩ ) 1 Y is typically used as the “best guess” in many applications. However, if q is known to be sparse, meaning that many of its components are zero, one might expect that fewer than n measurements are needed to recover it. It is thus of interest to obtain a good estimator for underdetermined problems in which q is assumed to be s-sparse, meaning that s of the elements of q are nonzero. In principle, the theory of compressive sensing (CS) asserts that the number of measurements needed to recover q R n is a number proportional to the compressed size of the signal (s), rather than the uncompressed size (n) [34]. To be able to recover q, CS relies on two properties: sparsity, which pertains to the signals of interest, and incoherence, which pertains to the sensing matrix.

Proposition 1.

[33]. Suppose that any 2s columns of an m×n matrix Ω are linearly independent (this is a reasonable assumption if m≥2s). Then, any s-sparse signal q R n can be reconstructed uniquely from Ω q.

The proof [33] of the above proposition also shows how to reconstruct an s-sparse signal q R n from the measurement Y=Ω q where q is the unique sparsest solution to Y=Ω q:
q = arg min q q l 0 subject to Y = Ω q
(2)
and q l 0 : = i = 1 n I ( q i 0 ) is the cardinality of q. However, the l 0-minimization is computationally intractable (NP-hard in general). Recent breakthroughs enable approximating l 0-optimization by using l 1-minimization which is a convex optimization problem and can be solved in a simple but effective way by linear programming:
q = arg min q q l 1 subject to Y = Ω q
(3)

The l 1-minimization in Equation (3) requires mild oversampling, more specifically, mc·μ(Φ,Ψ)2·s logn for some positive constant c where Ω can be decomposed as a product of a sparsity basis Ψ, and an orthogonal measurement system Φ [28]. We have m measurement in the Φ domain (sensing modality) under bases Ψ (signal model) and μ represents the coherence defined as follows:

Definition 1.

[28]. The coherence of a matrix Ω is given by
μ ( Ω ) = max j < k | Ω j , Ω k | Ω j 2 Ω k 2

where Ω j and Ω k denote columns of Ω . In plain English, the coherence measures that the largest correlation between any two columns in Ω .

Several theoretical results [35] ensure that the l 1-minimization guarantees exact recovery whenever the sensing matrix Ω is sufficiently incoherent. For example, we can say that Ω is incoherent if μ is small. Coherence is a key property in the compressed sensing framework because if two columns are closely correlated, it would be very difficult to distinguish whether the influence from the components in the sparse signal comes from one or the other (recall that the measurement Y(=Ω q) is a linear combination of each columns of the sensing matrix Ω with the components in q as coefficients). Also, numerical experiments suggest that in practice, most s-sparse signals are in fact recovered exactly once m≥4s [28]. Here, “exact recovery” means that we find the sparsest solution (q) such that Y=Ω q.

Therefore, if the sensing matrix Ω satisfies the incoherence condition (mc·μ(Φ,Ψ)2·s logn, or in practice, m≥4s), a sufficiently sparse signal q can be exactly recovered from the limited dataset without any prior knowledge of the number of nonzero elements, their locations, and their values. On the other hand, if the condition is not satisfied, exact recovery cannot be guaranteed [17],[34]. However, it is possible to use the property of coherence to guide biological experiment design, basically to collect a more informative dataset. As we will discuss in this paper, this can be done by inhibiting or stimulating certain genes to manipulate the gene expression.

CS can help reconstruct GRNs

In graph theory, a digraph can be represented by G=(V,E) where V and E represent nodes and edges respectively. For GRNs, each node represents a gene and each edge represents an influence map which models how genes affect each other. For example, the interactions could be how genes inhibit or stimulate each other. Since the connectivities of GRNs are typically unknown, often the best we can do is to select a set of possible candidate functions encoding possible unknown connectivities between genes.

In this section, we formulate a data-driven network identification problem into CS framework: first, we define a dynamical model of gene regulatory network. Then, we encode system dynamics into the sensing matrix (Ω ) and denote unknown connectivities between genes by q, a signal to be recovered.

A dynamical model of gene regulatory networks (GRNs)

We consider a dynamical system described by:
x ̇ = f ( x ) + u + g ( x h ) + w
(4)

where x R n denotes the concentrations of the rate-limiting species which can be measured in experiments; x ̇ = x ̇ 1 x ̇ 2 x ̇ n R n is a vector whose elements are the change in concentrations of the n species over time; f ( · ) : R n R n represents biochemical reactions, such as those governed by mass action kinetics, Michaelis-Menten, or Hill kinetics. Thus, f(·) can include functions of known form such as product of monomials, monotonically increasing or decreasing Hill functions, simple linear terms, and constant terms [24]. u R n denotes the control input which could represent inhibitions and stimulations; For example, if we consider protein-protein interaction, u represents drug-induced perturbation such as inhibition and stimulation. For a GRN, we consider the use of siRNA as a control input to the system for gene knockdown; g ( · ) : R n h R n represents influence from hidden nodes x h R n h , which cannot be measured in experiments; n h is the number of hidden nodes and unknown; and w R n represents energy-bounded process noise or measurement noise. Here x, x ̇ and u is assumed to be known where x can be measured in experiments and x ̇ may not be measured directly in experiments but we could calculate these quantities by interpolating x and using numerical derivatives. In [36], the authors pointed out that although data on time derivative can be difficult to obtain especially in the presence of noise, it is possible to estimate the gene expressions relatively accurately by repeating measurement with careful instrumentation and statistics [37].

Since we do not know whether important nodes in the gene regulatory network are missing, how many missing nodes there are, and how they affect system dynamics (i.e., x h , n h and g(x h )), we denote a vector of (unknown) influence from hidden nodes’ dynamics by v ( g ( x h ) ) ; Also, without loss of generality, since we know u, we define y as follows:
y x ̇ u = f ( x ) + v + w
(5)

Formulating a dynamical system as a GRN

The nonlinear function f(x) can be decomposed into a linear sum of scalar basis functions, f b , i ( x ) R , where we select the set of possible candidate basis functions:
where N is the number of possible candidate basis functions and q ij are unknown parameters which reflect underlying structure, i.e., influence of f b,i(x) on the j-th state ( x ̇ j ). Typically, we may choose a larger set than necessary, and allow the CS method to indicate the importance of each function, as we shall describe. Thus the Equation (5) can be written as follows:
y = S q F b ( x ) + v + w
(7)

where S q = q 11 q N 1 q 1 n q Nn = q 1 q n R n × N reflects the (unknown) underlying GRN structure, q i = q 1 i q 2 i q Ni R N and F b ( x ) = f b , 1 ( x ) , , f b , N ( x ) R N is the vector field which includes possible candidate basis functions. In this way, any biochemical reactions can be represented by a linear map S q and a function F b (x) where S q encodes the underlying graph structure and F b (x) includes all possible candidate functions that could be included in the biochemical reactions.

In practice, we can construct F b (x) by selecting the most commonly used candidate basis functions to model GRNs or protein-protein interaction, for example, all monomials, binomials, other combinations and Hill function.

Example 1.

Consider the simple nonlinear ordinary differential equations (ODEs):
y 1 = x ̇ 1 = γ 1 x 1 + k 1 x 2 n act 1 + x 2 n act y 2 = x ̇ 2 = γ 2 x 2 + k 2 1 1 + x 3 n ihb y 3 = x ̇ 3 = γ 3 x 3 + k 3 1 1 + x 1 n ihb
(8)
where x i denotes the concentration of the i-th species, x ̇ i is the change in concentration of the i-th species, the γ i denotes protein decay rate, the k i denotes the maximum promoter/inhibitor strength. Here, there is no input (u=0), no hidden node (v=0) and no process noise (w=0). Also, n act represents positively cooperative binding (activation) and n ihb represents negative cooperative binding (inhibition). The set of ODEs corresponds to a topology where gene 1 is activated by gene 2, gene 2 is inhibited by gene 3, and gene 3 is inhibited by gene 1 as shown in Figure 1(A). We can write Equation (8) as follows:
y = y 1 y 2 y 3 = x ̇ 1 x ̇ 2 x ̇ 3 = γ 1 0 0 0 k 1 0 0 γ 2 0 0 0 k 2 0 0 γ 3 k 3 0 0 S q x 1 x 2 x 3 1 1 + x 1 n ihb x 2 n act 1 + x 2 n act 1 1 + x 3 n ihb = S q F b ( x )
where S q R 3 × 6 represents the influence map.
Figure 1

A graph representation of nonlinear ODEs. (A) (Example 1, n=3, S q R 3 × 6 ) : among 18(=3×6) components, only 6 components are non-zero (B) (Example 1 with hidden node and measurement noise) there exists hidden node x h affecting x 1 and process noise w.

We can also consider a version of Equation ( 8 ) in which there exists a hidden node ( x h ) affecting ( x 1 ) as shown in Figure 1 (B), as well as process noise.
y = x ̇ 1 x ̇ 2 x ̇ 3 = S q F b ( x ) + v + w

where v= [ v 1,0,0]and w= [ w 1,w 2,w 3].

Formulating GRN into the CS framework

Suppose the time series data are sampled from a real experimental system at discrete time points t k . By taking the transpose of both sides of Equation (7), we obtain
y ( t k ) = F b ( x ( t k ) ) q + v ( t k ) + w ( t k )
(9)
where q = S q = q 1 q 2 q n R N × n . Assuming that M successive data points are sampled, then define:
y y 1 y n = y 1 ( t 1 ) y n ( t 1 ) y 1 ( t 2 ) y n ( t 2 ) y 1 ( t M ) y n ( t M ) R M × n Φ f b , 1 ( x ( t 1 ) ) f b , N ( x ( t 1 ) ) f b , 1 ( x ( t 2 ) ) f b , N ( x ( t 2 ) ) f b , 1 ( x ( t M ) ) f b , N ( x ( t M ) ) = F b ( x ( t 1 ) ) F b ( x ( t 2 ) ) F b ( x ( t M ) ) R M × N v 1 v n v ( t 1 ) v ( t 2 ) v ( t M ) R M × n w 1 w n w ( t 1 ) w ( t 2 ) w ( t M ) R M × n
(10)
This leads to n independent equations:
y i = Φ q i + v i + w i ( i = 1 , , n )
(11)

where y i = y i ( t 1 ) , y i ( t 2 ) , , y i ( t M ) R M represents the M successive data points, Φ R M × N consists of N possible candidate bases which are functions of given time series data x and q i R N represents the unknown influence map corresponding to the i-th species. Since a biochemical reaction network is typically sparse, as a consequence, q i is sparse and we have NM for Φ because we assume the limited time series data and may choose a larger set of basis functions than necessary.

Although we formulate n independent linear regression problems in Equation (11), we consider n independent equations in Equation (11) together by stacking y i as follows:
Y = y 1 y 2 y n = Φ O O O Φ O O O Φ q 1 q 2 q n + v 1 v 2 v n + w 1 w 2 w n Ω q 1 q 2 q n + v + w
(12)

Now, Equation (12) is in the form of the CS formulation in Equation (1) where Y R n · M is the measurement, Ω R n · M × n · N is the sensing matrix which consists of basis functions for the given time series data, v v 1 ; v 2 ; ; v n R n · M represents possibly large corruption from hidden nodes and w w 1 ; w 2 ; ; w n R n · M represents process or measurement noise.

In this paper, we make the following assumptions:

Assumption 1.

We consider two cases:
  • (ideal case) We assume that we can measure all states x (i.e., there is no hidden node (x h ), thus v=0) and there is no measurement noise (w=0). Also, there are enough columns in Φ in Equation (11) to represent the underlying system in Equation (6).

  • (extension) We consider hidden node (x h ) where v(=g(x h )) is assumed to be sparse, and process noise (w≠0). Here, we can also consider the case where the columns (also known as the dictionaries in the CS literature [28]) of Φ may not be able to represent the underlying system. In this case, we consider the influence from the missing dictionaries as v.

We first consider the ideal case for simplicity in explaining the main results, and then extend the proposed method to the more general case.

Results and discussion

Formulating GRN identification problem into CS

Many existing algorithms [16],[24] consider the n independent linear regression problems in Equation (11) separately. Since the columns of matrix Φ are composed of time series data as in Equation (10), it is difficult to a priori guarantee low correlation and sometimes Φ even suffers rank deficiency. Also, Pan et al. [24] pointed out that correlation between the columns of Φ is usually high (μ(Φ ) is close to 1).

Intuitively, if two columns of the sensing matrix are highly correlated, it is hard to distinguish the corresponding components in the sparse signal q (note that the measurement Y is a linear combination of each column of the sensing matrix with the components in q as coefficients). In order to deal with high coherence in Equation (11), many method combines CS with different techniques such as Bayesian formulation [24], Kalman filter [25], and Granger causality [26]. Also, it is a well known problem in the lasso formulation of network inference: if there is high coherence in the sensing matrix, one can use an elastic net which combines the l 1 and l 2 norms. Although each reconstruction result [24]–[26] might be the optimal solution in the sense of its formulation (i.e., maximum likelihood), the identified graph may not represent the underlying GRN. In other words, if the data set is not informative enough to fully explore the underlying system, while the identified graph structure based on the given data set may be an optimal solution of the particular optimization problem, it may not represent the true system.

In this paper, our goal is to get the smallest data informative enough to recover the underlying graph structure exactly. Since the proposed method maintains the CS framework by reducing coherence of the sensing matrix, the method is fundamentally different than any other methods which make use of different techniques in conjunction with the l 1 optimization [24]–[26] which leads to a sparse representation of the network. Hence, we use all the properties of CS in order to access the ability to exactly reconstruct the underlying graph structure, reveal deficiencies in the data and model, and design new experiments to remedy the deficiencies if necessary.

While maintaining the CS framework, in order to deal with high coherence, we formulate Equation (12) instead of Equation (11) since we have strongly uncorrelated columns in Ω . In other words, since Ω has many independent columns, we have more degrees of freedom to reduce coherence of Ω . We will show that by using a transformation, the components of the sensing matrix can be made more uniformly distributed so that we could reduce coherence.

Moreover, since each q i has different degrees of sparsity in general, if we consider n independent equations in Equation (11), Φ should satisfy the incoherence condition M≥2 maxi(‖q i 0) stated in Proposition 1. On the other hand, Ω only needs to satisfy the condition M · n 2 · i = 1 n q i l 0 . Since the averaged sparsity = i = 1 n q i l 0 n max i q i l 0 is smaller than max i q i l 0 , we can reduce the required number of samples (M). Also, in case of rank deficiency, we can simply remove the corresponding rows (note that the rank deficiency is more likely caused by the row since NM).

Optimal design of sensing matrix

Reducing coherence by transformation

In order to reduce coherence, first we rearrange Ω with respect to a spatial information, and then we consider the transformation in order to reduce coherence (see Method:Rearranging the sensing matrix for mathematical details):
Z z ̄ ( 1 ) z ̄ ( 2 ) z ̄ ( M ) = Ψ 1 s ( I n Φ 1 , : ) Ψ 2 s ( I n Φ 2 , : ) Ψ M s ( I n Φ M , : ) q 1 q 2 q n v ̄ ( 1 ) v ̄ ( 2 ) v ̄ ( M ) + w ̄ ( 1 ) w ̄ ( 2 ) w ̄ ( M ) Ω ̄ q 1 q 2 q n + v ̄ + w ̄
(13)
where represents the Kronecker producta, Ω ̄ R M · n × N · n represents the rearranged sensing matrix multiplied by the transformation Ψ j s , and z ̄ ( j ) , v ̄ and w ̄ are defined in Equation (20). We want to find Ψ j s ( j = 1 , , M ) to minimize
min μ ( Ω ̄ ) where Ω ̄ = ψ 11 s Φ 1 , : ψ 12 s Φ 1 , : ψ 1 n s Φ 1 , : ψ 21 s Φ 2 , : ψ 22 s Φ 2 , : ψ 2 n s Φ 2 , : ψ M 1 s Φ M , : ψ M 2 s Φ M , : ψ Mn s Φ M , : has full row rank
(14)

where ψ ij s R n represents the j-th column of Ψ i s . In this paper, we propose a heuristic approach and a novel way to find Ψ i s by solving the optimization iteratively to reduce coherence (see Method: Section ‘Randomly chosen matrix Ψ for details).

Example 2.
(reducing coherence for a linear system) consider a simple linear system ( x ̇ = Ax ) where n=5,N=5,M=4,s=10 (note that n·M≥2s), and the elements of A are randomly chosen such that there is no isolated node. Figure 2 shows the sensing matrix for both Ω (top left) and the transformed sensing matrix (top right, denoted by Ω Ψ ). By reducing coherence, the components of the transformed sensing matrix (top right) are more uniformly distributed and the coherence is reduced by up to 0.6 although μ(Ω ) is close to 1 (bottom left) in Figure 2(A). Also, Figure 2(B) shows the result of the inferred graph structure based on given time series data without any a priori information where the x-axis represents indices of the influence map (i.e., the 1 st ,2 nd ,…,n th rows of influence map S q ; note that for a linear system, S q =A). Here, there are 5 states (n=5) in a linear system so the influence map A has 25 elements. Although L1 and L2b norm minimizations fail to recover the exact signal, CS in Equation (17) (see Recovery of gene regulatory networks for details) recovers the exact signal (bottom right) by reducing coherence of the sensing matrix. Note that (L1) solves the n independent equations in Equation (11) without reducing coherence and (L2) solves Equation (12) with l 2-regularization.
Figure 2

Reducing coherence for a linear system. (Example 2) (A) the sensing matrix Ω R 20 × 25 , without (top left) and with transformation Ψ (top right) and the corresponding coherence distribution (bottom) (B) reconstruction result (left) L1 (middle) L2 with Ω (right) l 1 optimization with Ω Ψ (CS) where x-axis represents indices of q R 25 and y-axis represents coefficient of q (a.u). By reducing coherence of the sensing matrix with transformation, we can recover the exact structure in (B) (right). However, both L1 and L2 fail to recover the exact structure.

Example 3.
(reducing coherence for a nonlinear system, n=3, N=9, M=5, s=6) consider simple nonlinear ODEs as follows:
x ̇ 1 = γ 1 x 1 + k + 12 x 2 n act 1 + x 2 n act x ̇ 2 = γ 2 x 2 + k 23 1 1 + x 3 n ihb x ̇ 3 = γ 3 x 3 + k + 13 x 1 n act 1 + x 1 n act
(15)
where (γ 1,γ 2,γ 3)=(−0.3,−0.25,−0.35), k +12=1.2, k +13=0.9, k −23=2.2, n act =4 (activation Hill coefficient) and n ihb =4 (inhibition Hill coefficient). The set of ODEs corresponds to a topology (x 2x 1x 3x 2) as shown in Figure 3(A). Figure 3(B) shows that we can reduce the coherence by up to 0.8 and (C) shows that only CS recovers the exact graph structure, and L2 regularization does not encourage sparsity but distributes the coefficients to be more similar to each other.
Figure 3

Reducing coherence for a nonlinear system. (Example 3) (A) Time series of x 1, x 2, x 3 for model in Equation (15) (B) the sensing matrix without (top left) and with transformation Ψ matrix (top right) and the corresponding coherence distribution (bottom) (C) reconstruction results. By constructing the sensing matrix Ω R n · M × n · N where n=3,N=9,M=5 and s=6, we recover the s-sparse signal in (B) (right). However, both L1 and L2 fail to recover the exact structure q.

Designing effective experiments

Consider the case where the sensing matrix is not incoherent. If the coherence condition ( M · n > ( Ω ̄ ) 2 · s log ( N · n ) ) is not satisfied, exact recovery cannot be guaranteed [34]. We use the transformation ( Ψ i s or P = P P in Equation (24); see Method section ‘Reducing coherence by transformation’) in order to reduce the coherence but obviously, sometimes we might have inherent limits to how much the coherence can be reduced. There are possible reasons:
  • Since we solve the relaxed problem in Equation (24) iteratively, P might be sub-optimal.

  • If the time series data of two different gene expressions, x i and x j are highly correlated, it might be difficult to reduce coherence. In this case, we need to design a new experiment to remedy deficiencies in the data.

As we mentioned, the incoherence of the sensing matrix can be used not only as a good metric to guarantee exact recovery but also as a guideline for designing new experiments. For example, from the coherence distribution, we can identify which columns of the sensing matrix have high coherence, i.e., f b,i(x) and f b,j(x) in Equation (6). Intuitively, in order to reduce ambiguities from the highly correlated columns of the sensing matrix, we should perturb either f b,i(x) or f b,j(x). Thus, it is possible to use this property of coherence to guide biological experiment design, to collect a more informative dataset. By doing this, we can minimize the number of experiments and reduce the cost of experiments.

Example 4.
(limitation of reducing coherence by Ψ , linear dynamics, n=5,N=5,M=4,s=10) In Figure 4, Exp#1 represents the original experimental data set which has the limitation of reducing coherence by Ψ . Since we consider linear dynamics, i.e., f b,i(x)=x i and f b,j(x)=x j , we found that x 2 and x 4 cause high coherence as shown in Figure 4 (circle marker). Thus, in order to reduce this high coherence, we should perturb either x 2 or x 4.
Figure 4

Guideline for designing new experiment based on coherence distribution. (Example 4) coherence distribution of the original Ω based on Exp#1 show that x 2 and x 4 cause high coherence. Thus, in order to reduce this high coherence, we should perturb either x 2 or x 4. We design two different experiment sets; for Exp#2, we perturb x 3 and for Exp#3, we perturb x 2. As we expected, Exp#3 is a more effective experiment to reduce the coherence between x 2 and x 4 and we recover the exact graph structure as shown in Figure 5. On the other hand, the coherence of Exp#2 remains almost the same as that of Exp#1 and we fail to recover the exact graph structure in Figure 5.

Figure 5

Design effective experiment for a linear system. (Example 4) Reconstruction results based on different experiments: Exp#1 represents the original experimental dataset which has limitation to reduce coherence; Exp#2 represents non-effective experimental dataset; and Exp#3 represents the effective experimental dataset (A) the sensing matrix and reconstruction result (B) coherence distribution comparison. This example demonstrates that if there is a limitation to reduce coherence in the original dataset, we can also design a more effective experiments based on the coherence distribution. By revealing deficiency in the data, the proposed method can help reduce the cost of experiments.

To show the effectiveness of the new experiment, we design two different experiment sets and compare the reconstruction results with each other; for Exp#2, we perturb x 3 and for Exp#3, we perturb x 2. As we mentioned earlier, intuitively, we expect that Exp#3 to be a more informative experiment to identify the graph structure since we would like to reduce the coherence between x 2 and x 4. As we expected, in Exp#3, we can reduce the coherence more than that of Exp#2 as shown in Figure 4, and recover the exact graph structure as shown in Figure 5. On the other hand, the coherence of Exp#2 remains almost the same as that of Exp#1 and we fail to recover the exact graph structure. This numerical example illustrates that by using the property of coherence, we can guide biological experiment design more effectively.

Example 5.
(limitation of reducing coherence by Ψ , nonlinear dynamics, n=3, N=9, M=5, s=9) consider the following set of ODEs:
x ̇ 1 = γ 1 x 1 + k + 12 x 2 n act 1 + x 2 n act + k + 13 x 3 n act 1 + x 3 n act x ̇ 2 = γ 2 x 2 + k 21 1 1 + x 1 n ihb + k 23 1 1 + x 3 n ihb x ̇ 3 = γ 3 x 3 + k 31 1 1 + x 1 n ihb + k 32 1 1 + x 2 n ihb
where (γ 1,γ 2,γ 3)=(−0.25,−0.23,−0.26), k +12=1.2, k +13=1.25, k −21=2.8, k −23=2.1, k −31=2.7, k −32=1.8, n act =4 (activation Hill coefficient) and n ihb =4 (inhibition Hill coefficient). The corresponding topology is shown in Figure 6(B) (note that we intentionally choose the symmetric structure and similar parameters). The reconstruction error using Exp#1 data is shown in Figure 6(A) (left, bottom) and the reconstruction error illustrates difficulties of resolving ambiguities from x 2 and x 3. This can be captured by the coherence distribution of the sensing matrix based on the Exp#1 dataset; the correlation between the columns corresponding to x 1 and x 2 is close to the correlation between the columns corresponding to x 1 and x 3. Based on the coherence distribution, we design two trials; for Exp#2, we perturb x 1 and for Exp#3, we perturb x 2. As we expected, Exp#2 is not an effective experiment in terms of information. On the other hand, by using Exp#3, we can reduce both the maximum coherence and the averaged coherence, and reconstruct the exact graph structure as shown in Figure 6(A).
Figure 6

Design effective experiment for a nonlinear system. (Example 5) Reconstruction result based on different experiment (nonlinear case): Exp#1 represents the original experimental dataset which has limitation to reduce coherence; Exp#2 represents non-effective experimental dataset (inhibit x 1); and Exp#3 represents the effective experimental dataset (inhibit x 2) (A) the time series of x 1, x 2, x 3 for each experiment (top), the reconstruction error (bottom left) and coherence comparison for each experiment (bottom right) (B) the corresponding topology (C) coherence distribution of the sensing matrix. In Exp#3, we reduce the coherence below 0.8.

Both Examples 4 and 5 illustrate that if the transformed sensing matrix is not incoherent enough to guarantee exact recovery, we can design a new experiment based on the distribution of coherence. Also, we show that the proposed experiment can help to reduce coherence more and thus reconstruct the exact graph structure. For a fair comparison, we use the same number of time points as M here. However, in practice, we can also stack all the experimental data sets together if we assume that the linear map S q and the set of basis functions F b (x) does not change for different experiment:
Z 1 Z k = Ω ̄ 1 Ω ̄ k q 1 q 2 q n + v ̄ 1 v ̄ k + w ̄ 1 w ̄ k
(16)

where the subscript Z i , Ω ̄ i , v ̄ i , w ̄ i represents the i-th experiment. As the number of measurements increase (M), one may be able to reduce the coherence. However, one can reduce the coherence only if the additional measurements provide us more useful information. As a trivial example, one could stack exactly the same data on top of the first, and increase M to 2M, however the coherence is exactly the same as that of the original dataset.

Recovery of gene regulatory networks

In this section, we present reconstruction of the exact graph structure and show how the condition for exact recovery will be used. First, we consider the ideal case where there are no hidden nodes and no measurement noise. Second, we extend the ideal case to the more general case.

Reconstructing gene regulatory networks (ideal case)

In Equation (13), q represents the s-sparse network structure which we want to reconstruct from the time series gene expression by solving the l 1-norm optimization:
min q l 1 s.t. Z ~ = Ω ~ q ( CS )
(17)

where Z ~ P Z , Ω ~ P Ω ̄ and is the optimal transformation in Equation (24).

Proposition 2.

If the sensing matrix Ω ~ constructed from time series data, multiplied by the optimal transformation, , has 2s linearly independent columns, then any s-sparse network structure q can be reconstructed uniquely from Z ~ = Ω ~ q .

Proof.

(Suppose not), then there are two s-sparse graph structures q 1,q 2 with Ω ~ q 1 = Ω ~ q 2 (or Ω ~ ( q 1 q 2 ) = 0 ). However, q 1q 2 is 2s-sparse, so there is a linear dependence between 2s columns of Ω ~ (contradiction).

The requirement of 2s linearly independent columns in Proposition2 may be translated to an incoherence condition on the sensing matrix. That is, if the unknown s-sparse signal q is reasonably sparse, it is possible to recover q under the incoherence condition on the sensing matrix. Although the sensing matrix consists of redundant dictionaries, the coherence of the sensing matrix can be reduced. In a heuristic way, we multiply the redundant dictionaries Φ k,: by a randomly chosen matrix Ψ k s at each time step k and iterate this step until the coherence is decreased. Or, we can find the optimal transformation (or P) in Equation (24) to reduce the coherence. In the previous numerical examples, we illustrated that the coherence of the sensing matrix is decreased by transformation and showed the exact reconstruction of graph structure.

Example 6.
(Statistics) Here, we compare the success rate of the proposed method with other methods such as L1 and L2. Figure 7 shows statistics of 50 trials for a simple linear case (for each trial, we randomly generate the influence map). Here, we count the number of successes of each method when any of the methods recover the exact structure. By reducing coherence, we can improve the success rate as shown in Figure 7(A). Also, L2-regularization does not encourage sparsity but distributes the coefficients to be more similar to each other as shown in Figure 7(B).
Figure 7

Comparison L 1, L 2, and CS . (Example 6, linear case) statistics of success rate and error among 50 trials: (A) success rate (blue bar) and reconstruction error (red bar) (B) component-wise reconstruction error. By reducing coherence, we can improve the success rate as shown in Figure 7(A). Also, since L2-regularization does not encourage sparsity but distributes the coefficients to be more similar to each other, component-wise errors are distributed as shown in (B).

Graph reconstruction with hidden nodes

The main contributions of the proposed method in the previous section is the conversion of the problem of inferring graph structure into the CS framework. Then, we demonstrate that one could recover sparse graph structures from only a few measurements. However, for practical use, the proposed method needs to be able to deal with both sparsely corrupted signals (v=g(x h )) and measurement noise (w) in Equation (5).

In general, the assumption of accessibility or observability of all nodes [17] is not satisfied. Thus, we focus on the case in which the hidden node affects observable nodes directly as shown in Figure 1(B). Also, without loss of generality, the hidden node dynamics could be any arbitrary dynamic model. Or, even if there is no hidden node, a small portion of the biological experiment dataset could be in practice contaminated by large error resulting from, for example, mislabeling, or improper use of markers or antibodies. Moreover, all biological datasets are contaminated by at least a small amount of noise from measurement devices. Therefore, the proposed method should be robust. We note this goes beyond the results in [11] due to the consideration of hidden node dynamics with measurement noise.

Here, the question is whether it is still possible to reconstruct the graph structure reliably when measurements are corrupted. Since hidden nodes and measurement noise are considered, the number of time points is assumed to be greater than that of the previous case [17] (i.e., no corruption and no measurement noise). Thus, the number of rows of the sensing matrix is assumed to be greater than the number of columns. If the number of the time points M(<N) is limited, then we can stack z ̄ ( j ) with different Ψ k s or including different dataset as described in Equation (16).

In CS literature [31], two decoding strategies for recovering the signal from a corrupted measurement are introduced, where the corruption includes both a possible sparse vector of large errors and a vector of small error affecting all the entries. It is shown that two decoding schemes allow the recovery of the signal with nearly the same accuracy as if no sparse large errors occurred. Our contribution is converting the problem of inferring the graph structure with hidden nodes into the highly robust error correction method framework [31] (see Method: Section ‘Two-step refinements’) and showing how this can improve the reliability of reconstruction.

Example 7.
(arbitrary corruption with no measurement noise) Consider nonlinear ODEs as follows:
x ̇ 1 = γ 1 x 1 + α 13 h act ( x 3 ) + v 1 x ̇ 2 = γ 2 x 2 + α 21 h act ( x 1 ) + v 2 x ̇ 3 = γ 3 x 3 + α 34 h act ( x 4 ) + β 32 h ihb ( x 2 ) + v 3 x ̇ 4 = γ 4 x 4 + β 41 h ihb ( x 1 ) + v 4
(18)
where h act ,h ihb represents Hill functions for activation and inhibition respectively, and v i represents arbitrary corruption shown in Figure 8(A) and assumed to be sparse (at each time step, we choose c a r d(v(j)) = 1). The magnitude of v i are about 50% of the magnitude of x ̇ . Since we consider arbitrary corruption, we need more time points (M > N). By using two-step refinements, first we estimate sparse large corruption v as shown in Figure 8(B) (top) and then, we reconstruct q (bottom).
Figure 8

Arbitrary corruption with no measurement noise. (Example 7) Reconstruction with corrupted signal. (A) time series of x,y, v (B) reconstruction results of v and q where each circle represents sampled time points (n=4, M=25, N=12, s=9). By using two-step refinements, first we estimate sparse large corruption v and then, we reconstruct q. Note that since we consider arbitrary corruption, we have more time points (M>N).

In practice, a specific node is corrupted by a hidden node and a small portion of the dataset can be largely corrupted by human error. Also, since we choose the set of possible candidate basis functions of the sensing matrix in Equation (6), the columns of the sensing matrix may not be able to represent the underlying system (i.e., missing dictionaries). Then, we can consider the influence from these missing dictionaries as v.

Example 8.
(Arbitrary corruption with measurement noise) Recall a model in Equation (18) with different parameters and consider sparse large corruption v and small magnitude noise w (1% of the magnitude of x ̇ ). Figure 9 shows the time series data and reconstruction result.
Figure 9

Arbitrary corruption with measurement noise. (Example 8) Reconstruction with corrupted signal. (A) time series of x,y, v (B) reconstruction results of v and q where each circle represents sample time points (n=4, M=40, N=12, s=9). Here we consider arbitrary large corruption with measurement noise. By choosing the parameters ε 1,ε 2 properly in Equation (30), we can recover the graph structure q within the small error bound.

Geometric view

In Equation (17), since we assume all nodes are accessible and perfect measurement (meaning that there is no hidden node, v=0 and no measurement noise, w=0), we can solve Equation (27) directly without filtering out the unmodelled dynamics in Equation (26) (i.e., Z = Ω ̄ q in Equation (27)). If there exist hidden nodes or measurement noise, we can still provide an unambiguous indication of the existence of these corruption (e0).

The intuition is that Z ( = Ω ̄ q + Ψ ̄ e ) can be decomposed as the superposition of an arbitrary element in V(=ZQ Q e) and of an element in V (=Q Q e) as shown in Figure 10. In other words, Zcan be decomposed as the superposition of modelled dynamics and anomalies caused by hidden node or unmodelled dynamics. This geometric view enables us to understand how we could reveal deficiencies in our model:
  • Z ̄ = Q Z = 0 : there is no hidden node Z R ( Ω ̄ )
    Figure 10

    Geometric view of two-step refinement. A geometric view of two consecutive l 1-norm optimizations. By using two-step refinements, first we estimate sparse large corruption v and then, we reconstruct q. For the ideal situation with no unobservable species (genes) or noise, we directly reconstruct q without estimating v.

  • Z ̄ 0 : Z cannot be represented by Ω ̄ q so there might be hidden nodes or our dictionaries in the sensing matrix Ω ̄ are not sufficient to represent Z (revealing deficiencies in our model or dictionaries).

HER2 Overexpressed breast cancer

We apply the proposed algorithm to study a breast cancer signaling pathway by reconstructing the graph structure using an RPPA dataset [32] as shown in Figure 11 (see also Additional file 1: Figure S1, S2, and S3 for details: each figure presents the RPPA dataset and the result of graph reconstruction compared with L1, L2-optimization). Here, we choose small networks which are composed of 3 nodes and known to be sparsely connected, i.e., PI 3 K→PDKAkt and PDKAktmTOR in order to satisfy our assumption such that the influence on observable nodes from a hidden node should be sparse (i.e., v is sparse). The graph structures identified by the proposed method are consistent with the current understanding of the networks, whereas those found using L1- and L2-optimizations fail to reconstruct the known structure as shown in Figure 12.
Figure 11

HER2+ overexpressed breast cancer. CS reconstruction result using Reverse Phase Protein Array data (SKBR 3 cell line, Serum [32]). See Additional file 1: Figure S1, S2 and S3 for further details (red: activation, blue: inhibition). Here, we choose small networks which are composed of 3 nodes and known to be sparsely connected in order to satisfy our assumption such that the influence on observable nodes from a hidden node should be sparse. The identified graph structures are consistent with the current understanding of the networks.

Figure 12

Sub networks inferred for the HER2/3 signaling network from Reverse Phase Protein Array data. The columns show the networks inferred by L1-optimization, L2-optimization, and CS. The network structures identified by CS agree with the current understanding of the network, whereas those found using L1 and L2 optimization do not. See Additional file 1: Figure S1, S2 and S3 for further details (red: activation, blue: inhibition).

Also, an abstract model of the breast cancer signal pathway proposed by M. Moasser [38] is considered, as shown in Additional file 1: Figure S3(B) where PHLPP isoforms are a pair of protein phosphatases, PHLPP 1 and PHLPP 2, which are important regulators of Akt serine-threonine kinases (Akt 1, Akt 2, Akt 3) and conventional protein kinase C (PKC) isoforms. PHLPP may act as a tumor suppressor in several types of cancer due to its ability to block growth factor-induced signaling in cancer cells [39]. PHLPP dephosphorylates Ser 4 7 3 (the hydrophobic motif) in Akt, thus partially inactivating the kinase [40]. Unfortunately, in our RPPA dataset, we do not have PHLPP so we simply consider three nodes (Ak t p T308,Ak t p S473 and mTOR). Figure 11(right) shows the result of the proposed method using the RPPA dataset. The reconstructed graph structure matches up to the known structure (Additional file 1: Figure S3(B)). Specifically, our result can capture the partial inactivating characteristics of PHLPP (i.e., mTOR(→PHLPP)Ak t p S473).

Discussion

Many network inference algorithms use l 1-norm optimization to find the actual network structure, but they cannot guarantee exact recovery. In this paper, we propose a novel approach to reconstruct GRNs based on compressive sensing with a dynamic system model. We demonstrate that the incoherence condition is essential for exact recovery. This is not properly considered in inference methods based on l 1-norm optimization. Also, we illustrate how the incoherence condition can be used to design new experiments effectively. Finally, we consider a more general setting, in which hidden genes exist and all measurements are contaminated by noise, and we show that the proposed method leads to reliable reconstruction.

We compare all the results with l 1- and l 2-norm optimization, since l 1-norm optimization is widely used in network inference and many inference methods combine the l 1- and l 2-norms (for example, elastic net). We highlight the improvement by the proposed method compared with these commonly used methods. Our use of the method to propose new experiments, and its extension to a more general setting goes beyond the results in any network inference methods.

From the lessons learned from the numerical study using both synthetic datasets and real RPPA datasets of HER2 overexpressed breast cancer signaling pathways, the proposed method can be applied to sub-networks in order to satisfy our assumption about the number of hidden nodes relative to the network size, and has great potential for reconstructing GRN by designing effective experiments. Note that the proposed method is not limited to the network size itself but rather by this assumption and the incoherence condition. In the ideal case with no unobservable species (genes) and no noise, the method can be applied to entire networks if the incoherence condition is satisfied (i.e., n·M≥2s, practically n·M≥4s) where n is the number of nodes, M is the number of time points and s represents the sparsity of the network structure. For example, suppose we only have one time point measurement (M=1) and then, the proposed method cannot be applicable for any network (n>2 where n is the number of nodes) since it is impossible to satisfy the incoherence condition. As an example, if we consider n nodes which are simply connected (i.e., s=n−1), then n·M<2s. With a reasonable number of time points (M), the incoherence condition could be satisfied. Similarly, in the practical case with unobservable species (genes) and measurement noise, in order to guarantee the exact recovery of the graph structure, we should satisfy Assumption 1 for a given network (hidden node only affects relatively few nodes in the given network) and the incoherence condition. Suppose we choose a small sub-network where a hidden node or unobservable node affects all nodes in the chosen sub-network: then, it is impossible to infer the exact structure of the sub-network. However, if a hidden node only affects relatively few nodes in the given network, we can still reconstruct the exact structure of the sub-network by inferring the hidden node influence first. Therefore, the proposed method is not limited to the network size itself but rather by Assumption 1 and the incoherence condition. In practice, we can work on a sub-network at a time and integrate the identified sub-networks.

Finally, we evaluate performance with respect to the size of the problem under typical parameters (the number of genes, the number of kinetic features and connectivity) in (Additional file 1: Table S1) which provides us a brief guideline of application, e.g., for an n-genes network with certain kinetic features, how many time points/how much resolution are appropriate for the reconstruction.

In this paper, we do not consider any a priori knowledge of connectivity. However, since we may have partial information of connectivity, we can also use prior knowledge of connectivity in the form of known nonzero elements in q allowing fewer experiments to be performed. Also, here we only consider protein expression or gene expression levels but we can also consider many other types of useful information that can be incorporated into the network reconstruction process, such as sequence motifs and direct binding measurements (e.g., ChIP-chip and ChIP-seq). Since it has been shown that ChIP-seq signals of Histone modification are more correlated with transcription factor motifs at promoter sites in comparison to RNA level, time series Histone modification ChIP-seq could provide a more reliable inference of GRNs in comparison to method based on expression level. We are interested in continuing this research direction for large-scale networks.

Conclusion

We proposed a method for reconstructing sparse graph structures based on time series gene expression data without any a priori information. We demonstrated that the proposed method can reconstruct graph structure reliably. Also, we illustrated that coherence in the sensing matrix can be used as a guideline for designing effective experiments.

Second, the proposed method is extended to the cases in which dynamics is corrupted by hidden nodes and the measurement is corrupted by human error in addition to the measurement noise. Using a two-step refinement procedure, we demonstrate good performance for the reconstruction of graph structure. A set of numerical examples is implemented to illustrate the method and its performance. Also, a biological example of HER2 overexpressed breast cancer using an RPPA dataset is studied. We are currently applying our method to recover the HER2 signaling pathway, where a significant part of the network is currently unknown.

Method

Reducing coherence by transformation

Rearranging the sensing matrix

Define z(j) as a vector of each component of y i at the j-th time point:
z ( j ) y 1 e j y 2 e j y n e j = y 1 ( t j ) y 2 ( t j ) y n ( t j ) = Φ j , : q 1 Φ j , : q 2 Φ j , : q n + v 1 ( t j ) v 2 ( t j ) v n ( t j ) + w 1 ( t j ) w 2 ( t j ) w n ( t j ) = ( I n Φ j , : ) q 1 q 2 q n + v ( j ) + w ( j )
(19)
where Φ j,: represents the j-th row of Φ . Consider Ψ j s R n × n and multiply Equation (19) by Ψ j s :
z ̄ ( j ) Ψ j s z ( j ) = Ψ j s ( I n Φ j , : ) q 1 q 2 q n + v ̄ ( j ) + w ̄ ( j )
(20)
where v ̄ ( j ) Ψ j s v ( j ) , w ̄ ( j ) Ψ j s w ( j ) and rank Ψ i s ( I n Φ i , : ) = rank Ψ i s since rank(I n Φ i,:)=rank(I n )·rank(Φ i,:)=n. By stacking z ̄ ( j ) ,
Z z ̄ ( 1 ) z ̄ ( 2 ) z ̄ ( M ) = Ψ 1 s ( I n Φ 1 , : ) Ψ 2 s ( I n Φ 2 , : ) Ψ M s ( I n Φ M , : ) q 1 q 2 q n + v ̄ ( 1 ) v ̄ ( 2 ) v ̄ ( M ) + w ̄ ( 1 ) w ̄ ( 2 ) w ̄ ( M ) Ω ̄ q 1 q 2 q n + v ̄ + w ̄
(21)

Randomly chosen matrix Ψ

The optimization problem in Equation (14) is not trivial because of the constraint. One simple and heuristic approach is that we select Ψ i s by (normalized) randomly chosen matrix with independent identically distributed (i.i.d.) random variable where Ω ̄ has full row rank, calculate μ ( Ω ̄ ) and run this with several times to reduce coherence. Since the randomly chosen matrix, Ψ i s , spreads out the component of Φ i,: uniformly, we can reduce coherence.

Finding the optimal transformation

This heuristic approach may not be enough to reduce the coherence. Consider a nonsingular matrix P R M · n × M · n and Ω ~ = P Ω ̄ where Ω ̄ is constructed by heuristic way, i.e., randomly chosen matrix.
μ ~ i , j Ω ~ i , Ω ~ j Ω ~ i 2 Ω ~ j 2 = P Ω ̄ i , P Ω ̄ j P Ω ̄ i 2 P Ω ̄ j 2 Ω ̄ i P Ω ̄ j Ω ̄ i P Ω ̄ i Ω ̄ j P Ω ̄ j
(22)
where P P P = P S M · n is positive definite and Ω ~ i denote the i-th column of Ω ~ . Note that if P P = I , μ ~ i , j = | Ω ~ i , Ω ~ j | Ω ~ i 2 Ω ~ j 2 = | Ω ̄ i , Ω ̄ j | Ω ̄ i 2 Ω ̄ j 2 μ ̄ i , j . Therefore, our goal is finding P ( = P P ) S + such that
min μ ( Ω ~ ) = min P max j < k μ ~ j , k = min P Γ
(23)
where Γ R ( n · N ) C 2 can be defined as follows:
Γ tr P Ω ̄ 1 Ω ̄ 2 P P Ω ̄ 1 Ω ̄ 2 P F tr P Ω ̄ 1 Ω ̄ N · n P P Ω ̄ 1 Ω ̄ N · n P F × tr P Ω ̄ 2 Ω ̄ 3 P P Ω ̄ 2 Ω ̄ 3 P F tr P Ω ̄ N · n 1 Ω ̄ N · n P P Ω ̄ N · n 1 Ω ̄ N · n P F
In practice, we ignore the denominator and solve the following problem:
min P = P 0 vec Ω ̄ 1 Ω ̄ 2 vec Ω ̄ 1 Ω ̄ N · n vec Ω ̄ N · n 1 Ω ̄ N · n vec ( P )
(24)

We can also combine ‖·‖ and ‖·‖1 to reduce the coherence. Note that for ‖·‖ , we minimize the maxim coherence and for ‖·‖1, we minimize the sum of the all possible combinations of the columns of Ω ̄ , i.e., μ ~ ij . Thus, if certain bases are highly correlated, P or P makes the components of the sensing matrix spread out enough to differentiate the influences from those bases. Since we ignore the denominator, the optimal solution of Equation (24) may be suboptimal. Thus, we can also combine heuristic way and Equation (24) iteratively to reduce coherence of the sensing matrix in practice.

Two-step refinements

Sparse large corruption with no measurement noise

Recall Equation (13) where w ̄ = 0 :
Z = Ω ̄ q 1 q 2 q n + diag Ψ 1 s , Ψ 2 s , , Ψ M s v ( 1 ) v ( 2 ) v ( M ) = Ω ̄ q 1 q 2 q n + Ψ ̄ v
(25)

where Ψ ̄ diag Ψ 1 s , Ψ 2 s , , Ψ M s and v(j) represents sparse large corruption at the j-th time point, that could result from the existence of hidden nodes. We assume that the influence from hidden nodes is sparse and unknown (i.e., v(j) is assumed to be sparse). In other words, hidden nodes can affect only a few nodes’ dynamics (intuitively, if hidden nodes affect all nodes, there is no way to reconstruct the graph structure). Then, we consider the reconstruction of graph structure q from the corrupted signal Z.

Since we assume the number of rows of the sensing matrix Ω ̄ is greater than the number of columns (M·n>N·n), we consider Q which annihilates the sensing matrix Ω ̄ on the left ( Q Ω ̄ = 0 ) where Q is any (M·nN·nM·n matrix whose kernel is the range of Ω ̄ in R M · n ( rank ( Q ) + nullity ( Q ) = M · n ) :

Then, the following two-step optimization problem enables us to compute q:

Filter unmodelled dynamics out from the measurement:
v ̂ = arg min v l 1 s.t. Z ̄ = Q Ψ ̄ v
(26)
Reconstruct q:
min q l 1 s.t. Z Ψ ̄ v ̂ = Ω ̄ q or Z ~ = P ( Z Ψ ̄ v ̂ ) = P Ω ̄ q = Ω ~ q ( P in Equation (23) )
(27)

If we could somehow get an accurate estimate v ̂ from Equation (26), Equation (27) represents the problem of reconstructing the graph structure q. The intuition is that Z ( = Ω ̄ q + Ψ ̄ v ) can be decomposed as the superposition of modelled dynamics and anomalies caused by hidden node or unmodelled dynamics.

The two step convex optimization problems in Equation (26) and Equation (27) are l 1-norm optimization problems in CS. Thus, if the sensing matrix Q Ψ ̄ and Ω ̄ (or P Ω ̄ ) satisfy the incoherence condition, signals v and q can be recovered exactly [17],[34]. Here, there are many possible choices of Q but we have to choose Q to satisfy the incoherence condition for the exact recovery of v [17]. To choose such a Q , we observe that Ω ̄ can be denoted as follows using Singular Value Decomposition (SVD):
Ω ̄ = U 1 U 2 Σ 0 0 0 V 1 V 2 = U 1 Σ V 1
(28)
where U 1 R M · n × r , U 2 R M · n × ( M · n r ) , Σ R r × r , V 1 R N · n × r , V 2 R N · n × ( N · n r ) and r is the rank of Ω ̄ . Suppose we choose Q such that Q = ΞU 2 . Then:
Q Z = Q Ω ̄ q + Q Ψ ̄ v Z ̄ = ΞU 2 U 1 Σ V 1 q + ΞU 2 Ψ ̄ v Z ̄ = ΞU 2 Ψ ̄ v

where Ξ can be used as a tuning matrix for satisfying the incoherence condition. A geometric view in Results and discussion: Section ‘Geometric view’. enables us to understand how we could reveal deficiencies in our model.

Sparse large corruption with measurement noise

While considering influence from hidden nodes is interesting, it still may not be realistic to assume that except for hidden nodes, one is able to measure the node dynamics with infinite precision. A better model would assume that there is measurement noise. Consider the problem of recovering the graph structure q from the vector Z which is corrupted by measurement noise w ̄ in Equation (13):
Z = Ω ̄ q 1 q 2 q n + Ψ ̄ v ( 1 ) + w ( 1 ) v ( 2 ) + w ( 2 ) v ( M ) + w ( M ) = Ω ̄ q 1 q 2 q n + Ψ ̄ e ( 1 ) e ( 2 ) e ( M )
(29)
where e(j)=v(j)+w(j), w is Gaussian noise N ( 0 , σ ) assumed to be bounded w l 2 ε . In general, we can consider any corruption decomposed into sparse large error v and small magnitude error w [31]. Then, modified two-step refinements can be applied as follows:
Z ̄ = Q Ω ̄ q + Q Ψ ̄ e = Q Ψ ̄ ( v + w ) v ̂ = a rg min v l 1 s.t Z ̄ Q Ψ ̄ v l 2 ε 1 q = a rg min q l 1 s.t. Z Ω ̄ q e ̂ l 2 ε 2 , e ̂ = Ψ ̄ v ̂
(30)

where the parameters ε 1,ε 2 above depend on the magnitude of the small errors ε, which can be determined as in [31].

Endnotes

a If A is an m×n matrix and B is a p×q matrix, then the Kronecker product product AB is the m p×n q block matrix:
A B = a 11 B a 1 n B A m 1 B a mn B
b We compare the performance of CS with the performance of the L1 and L2 optimization as follows:
min q i q i 1 subject to y i = Φ q i ( i = 1 , , n ) ( L 1 ) min q q 2 subject to Y = Ω q ( L 2 )

Additional file

Declarations

Acknowledgements

This research was supported by the NIH NCI under the ICBP and PS-OC programs (5U54CA112970-08), and by the Stand Up To Cancer-American Association for Cancer Research Dream Team Translational Cancer Research Grant SU2C-AACR-DT0409 to JWG.

Authors’ Affiliations

(1)
Department of Electrical Engineering and Computer Sciences, University of California
(2)
Department of Biomedical Engineering and the Center for Spatial Systems Biomedicine, Oregon Health and Science University
(3)
Faculty Scientist, Life Sciences Division, Lawrence Berkeley National Laboratory

References

  1. Ideker T, Ozier O, Schwikowski B, Siegel AF: Discovering regulatory and signaling circuits in molecular interaction networks . Bioinfomatics. 2002, 18: 233-240. 10.1093/bioinformatics/18.suppl_1.S233.View ArticleGoogle Scholar
  2. Wiley HS, Shvartsman SY, Lauffenburger DA: Computational modeling of the egf-receptor system: a paradigm for systems biology . Trends Cell Biol. 2003, 13 (1): 43-50. 10.1016/S0962-8924(02)00009-0.View ArticlePubMedGoogle Scholar
  3. Ma L, Iglesiass PA: Quantifying robustness of biochemical network models . BMC Bioinformatics. 2002, 3 (38): 1-13.Google Scholar
  4. Sontag ED: Network reconstruction based on steady-state data . Essays Biochem. 2008, 45: 161-176. 10.1042/BSE0450161.View ArticlePubMedGoogle Scholar
  5. Zechnera C, Ruessa J, Krenn P, Pelet S, Peter M, Lygeros J, Koeppl H: Moment-based inference predicts bimodality in transient gene expression . Proc Natl Acad Sci U S A. 2012, 109 (21): 8340-8345. 10.1073/pnas.1200161109.View ArticleGoogle Scholar
  6. Zavlanos MM, Julius AA, Boyd SP, Pappas GJ: Identification of stable genetic networks using convex programming . Proceedings of the American Control Conference (ACC) . 2008, IEEE, Seattle, WA, 2755-2760.Google Scholar
  7. Cooper NG, Belta CA, Julius AA: Genetic regulatory network identification using multivariate monotone functions . Proceedings of the IEEE conference on Decision and Control and European Control Conference (CDC-ECC) . 2011, IEEE, Orlando, FL, 2208-2213.View ArticleGoogle Scholar
  8. Porreca R, Drulhe S, de Jong H, Ferrari-Trecate G: Structural identification of piecewise-linear models of genetic regulatory networks . J Comput Biol. 2008, 15 (10): 1365-1380. 10.1089/cmb.2008.0109.View ArticlePubMedGoogle Scholar
  9. Bernardo DD, Gardner TS, Collins JJ: Robust identification of large genetic networks . Pac Symp Biocomput. 2004, 9: 486-497.Google Scholar
  10. Richard G, Julius A. A, Belta C: Optimizing regulation functions in gene network identification . IEEE Conference on Decision and Control (CDC) . 2013, IEEE, Firenze, Italy, 745-750.Google Scholar
  11. Gonçalves JM, Warnick SC: Necessary and sufficient conditions for dynamical structure reconstruction of lti networks . IEEE Trans Automatic Control. 2008, 53 (7): 1670-1674. 10.1109/TAC.2008.928114.View ArticleGoogle Scholar
  12. Sontag E, Kiyatkin A, Kholodenko BN: Inferring dynamics architecture of cellular network using time series of gene expression, protein and metabolite data . Bioinformatics. 2004, 20 (12): 1877-1886. 10.1093/bioinformatics/bth173.View ArticlePubMedGoogle Scholar
  13. Han S, Yoon Y, Cho K-H: Inferring biomolecular interaction networks based on convex optimization . Comput Biol Chem. 2007, 31 (5-6): 347-354. 10.1016/j.compbiolchem.2007.08.003.View ArticlePubMedGoogle Scholar
  14. Napoletani D, Sauer TD: Reconstructing the topology of sparsely connected dynamical networks . Phys Rev E. 2008, 77 (2): 026103-10.1103/PhysRevE.77.026103.View ArticleGoogle Scholar
  15. Napoletani D, Sauer T, Struppa DC, Petricoin E, Liotta L: Augmented sparse reconstruction of protein signaling networks . J Theor Biol. 2008, 255 (1): 40-52. 10.1016/j.jtbi.2008.07.026.View ArticlePubMedGoogle Scholar
  16. Yuan Y, Stan G-B, Warnick S, Gonçalves J: Robust dynamical network structure reconstruction . Automatica. 2011, 47: 1230-1235. 10.1016/j.automatica.2011.03.008.View ArticleGoogle Scholar
  17. Chang YH, Tomlin CJ: Data-driven graph reconstruction using compressive sensing . IEEE Conference on Decision and Control (CDC) . 2012, IEEE, Maui, HI, 1035-1040.Google Scholar
  18. Steinke F, Seeger M, Tsuda K: Experimental design for efficient identification of gene regulatory networks using sparse bayesian models . BMC Syst Biol. 2007, 1: 51-10.1186/1752-0509-1-51. doi: 10.1186/1752-0509-1-51,View ArticlePubMed CentralPubMedGoogle Scholar
  19. Gardner TS, di Bernardo D, Lorenz D, Collins JJ: Inferring genetic networks and identifying compound mode of action via expression profiling . Science. 2003, 301 (5629): 102-105. 10.1126/science.1081900.View ArticlePubMedGoogle Scholar
  20. Tegner J, Yeung MKS, Hasty J, Collins JJ: Reverse engineering gene networks: integrating genetic perturbations with dynamical modeling . Proc Natl Acad Sci. 2003, 100: 5944-5949. 10.1073/pnas.0933416100.View ArticlePubMed CentralPubMedGoogle Scholar
  21. Jeong H, Mason SP, Barabasi A-L, Oltvai ZN: Lethality and centrality in protein networks . Nature. 2001, 411 (6833): 41-42. 10.1038/35075138.View ArticlePubMedGoogle Scholar
  22. Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, Alon U: Network motifs: Simple building blocks of complex networks . Science. 2002, 298 (5594): 824-827. 10.1126/science.298.5594.824.View ArticlePubMedGoogle Scholar
  23. Burda Z, Krzywicki A, Martin OC, Zagorski M: Motifs emerge from function in model gene regulatory networks . Proc Natl Acad Sci. 2011, 108 (42): 17263-17268. 10.1073/pnas.1109435108.View ArticlePubMed CentralPubMedGoogle Scholar
  24. Pan W, Yuan Y, Gonçalves J, Stan G-B: Reconstruction of arbitrary biochemical reaction networks: A compressive sensing approach . IEEE Conference on Decision and Control (CDC) . 2012, IEEE, Maui, HI, 2334-2339.Google Scholar
  25. Noor A, Serpedin E, Nounou M, Nounou H: Reverse engineering sparse gene regulatory networks using cubature kalman filter and compressed sensing . Adv Bioinformatics. 2013, 2013: 11-doi:10.1155/2013/205763,Google Scholar
  26. Emad A, Milenkovic O: CaSPIAN: A causal compressive sensing algorithm for discovering directed interactions in gene networks . PLoS ONE. 2014, 9 (3): e90781-10.1371/journal.pone.0090781. doi:10.1371/journal.pone.0090781,View ArticlePubMed CentralPubMedGoogle Scholar
  27. Wang W-X, Lai Y-C, Grebogi C, Ye J: Network reconstruction based on evolutionary-game data via compressive sensing . Phys Rev X. 2011, 1: 021021-Google Scholar
  28. Candès E, Romberg J: Sparsity and incoherence in compressive sampling . Inverse Probl. 2006, 23: 969-985. 10.1088/0266-5611/23/3/008.View ArticleGoogle Scholar
  29. Chang YH, Gray J, Tomlin CJ: Optimization-based inference for temporally evolving boolean networks with applications in biology . Proceedings of the American Control Conference (ACC) . 2011, IEEE, San Francisco, CA, 4129-4134.Google Scholar
  30. Chang YH, Gray J, Tomlin CJ: Optimization-based inference for temporally evolving networks with applications in biology . J Comput Biol. 2012, 19 (12): 1307-1323. 10.1089/cmb.2012.0190.View ArticlePubMed CentralPubMedGoogle Scholar
  31. Candès EJ, Randall PA: Highly robust error correction by convex programming . IEEE Trans Inf Theory. 2008, 54 (7): 2829-2840. 10.1109/TIT.2008.924688.View ArticleGoogle Scholar
  32. Hennessy BT, Lu Y, Gonzalez-Angulo AM, Carey MS, Myhre S, Ju Z, Davies MA, Liu W, Coombes K, Meric-Bernstam F, Bedrosian I, McGahren M, Agarwal R, Zhang F, Overgaard J, Alsner J, Neve RM, Kuo W-L, Gray JW, Borresen-Dale A-L, Mills GB: A technical assessment of the utility of reverse phase protein arrays for the study of the functional proteome in non-microdissected human breast cancers. Clinical Proteomics, 6(4):129–151.Google Scholar
  33. Tao T: Compressed sensing (Or: the equation Ax = b, revisited), Mahler Lecture Series . 2009, University of California, Los AngelesGoogle Scholar
  34. Candès E, Wakin M: An introduction to compressive sampling . IEEE Signal Process Mag. 2008, 25 (2): 21-30. 10.1109/MSP.2007.914731.View ArticleGoogle Scholar
  35. Candès EJ, Romberg JK, Tao T: Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information . IEEE Trans Inf Theory. 2004, 52: 489-509. 10.1109/TIT.2005.862083.View ArticleGoogle Scholar
  36. Yeung MKS, Tegner J, Collins JJ: Reverse engineering gene networks using singular value decomposition and robust regression . Proc Natl Acad Sci. 2002, 99 (9): 6163-6168. 10.1073/pnas.092576199.View ArticlePubMed CentralPubMedGoogle Scholar
  37. Ideker T, Thorsson V, Siegel AF, Hood LE: Testing for differentially-expressed genes by maximum-likelihood analysis of microarray data . J Comput Biol. 2000, 7 (6): 805-817. 10.1089/10665270050514945.View ArticlePubMedGoogle Scholar
  38. Moasser M: Understanding the network topology underlying addiction to HER. Integrative Cancer Biology Program (ICBP) Retreat (Days of Science, the Sequel) . 2012, Waverly Club, SoutheastWaverly Drive, PortlandGoogle Scholar
  39. Brognard J, Newton AC: Phlipping the switch on akt and protein kinase c signaling . Trends Endocrinol Metab. 2008, 19 (6): 223-30. 10.1016/j.tem.2008.04.001.View ArticlePubMed CentralPubMedGoogle Scholar
  40. Gao T, Furnari F, Newton AC: Phlpp: a phosphatase that directly dephosphorylates akt, promotes apoptosis, and suppresses tumor growth . Mol Cell. 2005, 18 (1): 13-24. 10.1016/j.molcel.2005.03.008.View ArticlePubMedGoogle Scholar

Copyright

© Chang et al.; licensee BioMed Central. 2014

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.