 Research article
 Open Access
Detecting temporal protein complexes from dynamic protein-protein interaction networks
BMC Bioinformatics volume 15, Article number: 335 (2014)
Abstract
Background
Proteins dynamically interact with each other to perform their biological functions. The dynamic operations of protein-protein interaction (PPI) networks are also reflected in the dynamic formation of protein complexes. Existing protein complex detection algorithms usually overlook the inherent temporal nature of protein interactions within PPI networks. Systematically analyzing temporal protein complexes can not only improve the accuracy of protein complex detection, but also strengthen our biological knowledge of the dynamic protein assembly processes underlying cellular organization.
Results
In this study, we propose a novel computational method to predict temporal protein complexes. We first construct a series of dynamic PPI networks by jointly analyzing time-course gene expression data and protein interaction data. We then propose a Time Smooth Overlapping Complex Detection model (TSOCD) to detect temporal protein complexes from these dynamic PPI networks. TSOCD naturally captures the smoothness of networks between consecutive time points and detects overlapping protein complexes at each time point. Finally, a nonnegative matrix factorization based algorithm is introduced to merge very similar temporal complexes across different time points.
Conclusions
Extensive experimental results demonstrate that the proposed method detects temporal protein complexes more effectively than state-of-the-art complex detection techniques.
Background
With technological advances in high-throughput screening, large-scale protein-protein interaction (PPI) data have been generated and catalogued for many species [1–3]. Proteins seldom act alone; they often bind together to form complexes that carry out their biological functions [4–6]. Comprehensive investigation of protein complexes can help to reveal the structure of PPI networks, predict protein functions and elucidate the cellular mechanisms underlying various diseases [7]. Computational detection of complexes has thus attracted tremendous attention during the past decade [6, 8–14].
According to their lifetime, PPIs can be classified as stable or transient [15, 16]. Stable PPIs, which are important in maintaining cell fitness and stability, are usually permanent and irreversible. Transient PPIs, in contrast, associate and dissociate temporarily, and thus provide a mechanism for the cell to respond quickly to extracellular stimuli. Because the physical interactions determined by popular high-throughput technologies, e.g. yeast two-hybrid (Y2H) and tandem affinity purification with mass spectrometry (TAP-MS), lack temporal information, the majority of existing complex detection methods treat the PPI network as a static network, which cannot be used to detect temporal protein complexes. In reality, however, cellular systems are highly dynamic and responsive to environmental cues [17]. The real PPI network in a cell keeps changing over different stages of the cell cycle [18], leading to multiple dynamic protein interaction networks. It is therefore desirable to design computational methods that take the inherent dynamic characteristics of PPI networks into account to better detect temporal protein complexes.
Fortunately, the advent of DNA microarray technologies has enabled the differential expression of thousands of genes under various experimental conditions to be monitored simultaneously and quantitatively [19, 20], which provides useful temporal information at the gene level to complement static protein interaction data. There have been some attempts to investigate the temporal properties of individual proteins and protein interactions by integrating PPI data with time-course gene expression data [21–29]. For example, in [22], the authors proposed a three-sigma principle to identify active time points for individual proteins. They further investigated temporal protein associations and protein state transitions at the identified active time points.
Temporal protein complexes are typically formed by the dynamic assembly and disassembly of proteins to perform various biological functions. Tracking temporal protein complexes can reveal important insights into dynamic modular mechanisms and improve our understanding of disease pathways [23, 30]. To detect temporal protein complexes, we need to leverage the temporal information in gene expression data to construct time-evolving dynamic protein interaction networks. In [31], the authors incorporated the "time" factor for proteins, in the form of cell-cycle phases, into the analysis of complexes and studied the temporal phenomena of complex assembly and disassembly across various cell cycles. Wang et al. identified temporal protein complexes from dynamic PPI networks by applying static complex detection methods (e.g., MCL) at each time point [22]. In [28], the authors proposed the DHAC (Dynamical Hierarchical Agglomerative Clustering) complex mining method to detect temporal complexes from individual dynamic PPI networks.
We observe that the few methods above for predicting temporal protein complexes suffer from two major limitations. Firstly, they focus on individual dynamic PPI networks and entirely ignore the correlations between networks at consecutive time points. Although different temporal complexes occur at different time points, many proteins still form stable macromolecular complexes to perform their important biological functions [21]. As many stable interactions that play fundamental roles in the cell are conserved across time points, the corresponding complexes also occur in multiple consecutive dynamic PPI networks and should thus change smoothly across time [24, 26], both to maintain cell fitness and stability and to avoid disrupting the basic operations of the cell. These existing methods, however, overlook the smoothness of temporal complexes across time points and simply apply static complex detection methods to each individual dynamic PPI network. Secondly, as multifunctional proteins are often involved in different complexes, it is highly desirable to discover overlapping complexes to better decipher the inherent overlapping modular structure of PPI networks. However, the existing methods, namely DHAC and MCL, do not generate overlapping protein complexes and are thus less accurate.
To address these two issues, we propose a novel technique to detect temporal protein complexes from dynamic PPI networks. We first construct a series of dynamic PPI networks by identifying stable and transient interactions through the integration of protein interaction data and gene expression data. In particular, the stable interactions are preserved across all time points and serve as the backbone of the protein interaction networks, while the presence of a transient interaction at a given time point depends on the specific activities and functions required of the two associated proteins. Then, based on the concept of overlapping temporal communities [32], we propose a novel Time Smooth Overlapping Complex Detection model (TSOCD) to detect overlapping temporal protein complexes from the constructed dynamic PPI networks, which allows individual complexes to grow and shrink across time points. Finally, a Nonnegative Matrix Factorization (NMF) based method is introduced to merge very similar temporal complexes across time and track their evolution. We have performed extensive experiments to evaluate the performance of our TSOCD model. Experimental results show that TSOCD achieves significantly better results than state-of-the-art algorithms for detecting protein complexes. Moreover, our algorithm is available as a tool, which can be downloaded from http://mail.sysu.edu.cn/home/stsddq@mail.sysu.edu.cn/dai/others/TSOCD.zip.
Methods
In this section, we first present how to construct dynamic PPI networks, and subsequently introduce how to detect overlapping temporal protein complexes from the constructed dynamic PPI networks.
Constructing dynamic PPI networks
The dynamic protein-protein interaction networks (DPPI networks) are constructed by integrating time-course gene expression data with static PPI networks. A static PPI network is often modelled as an undirected graph $G = (V, E)$, where $V$ consists of $|V| = N$ proteins and $E$ consists of $|E|$ edges (protein interactions, observed under different conditions, between pairs of proteins in $V$). The time-course gene expression data of these $N$ proteins across $T$ time points are represented by an $N \times T$ matrix $GE$, where $GE_{it}$ is the expression level of the gene encoding protein $i$ at time point $t$.
We now infer a DPPI network for each time point from $GE$ and $G$. Existing methods construct DPPI networks solely by determining the peak expression time points of each protein [22], and the connections among the networks at different time points are ignored. To address this problem, we first extract stable protein interactions from $G$, which are assumed to appear at all time points because they are encoded by globally co-expressed gene pairs [27]. In particular, for each protein interaction in $G$, we calculate the Pearson Correlation Coefficient (PCC) between the gene expression profiles of its two proteins across all time points in $GE$. Protein interactions with PCC values greater than a cutoff $\delta$ are then defined as stable interactions, owing to their globally co-expressed genes (we discuss how to determine the value of $\delta$ in the next section). These stable interactions represent the static part of the DPPI networks and are likely to be preserved across all time points. An $N \times N$ symmetric matrix $S$ is introduced to indicate the stable interactions in the given PPI network $G = (V, E)$, where $S_{ij} = 1$ if proteins $i$ and $j$ have a stable interaction, i.e. $e_{ij} \in E$ and $\mathrm{PCC}(e_{ij}) > \delta$, and $S_{ij} = 0$ otherwise.
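The stable-interaction rule above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function names, the dictionary layout of the expression data and the toy profiles are all assumptions made for the example, and a constant expression profile is arbitrarily treated as uncorrelated.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: a constant profile counts as uncorrelated
    return cov / (sx * sy)


def stable_interactions(edges, ge, delta=0.3):
    """Return the edges (i, j) of E whose expression profiles have PCC > delta."""
    return {(i, j) for i, j in edges if pearson(ge[i], ge[j]) > delta}


# Hypothetical toy data: 3 proteins measured at 4 time points.
ge = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.1, 5.9, 8.0],  # rises together with A
    "C": [4.0, 3.0, 2.0, 1.0],  # falls as A rises
}
edges = [("A", "B"), ("A", "C")]
stable = stable_interactions(edges, ge)  # only ("A", "B") passes the cutoff
```

The returned set plays the role of the nonzero entries of $S$; only edges already present in $E$ are ever tested, so co-expressed but non-interacting pairs are never marked as stable.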
The dynamic part of the DPPI network at each time point $t$ ($1 \le t \le T$) is inferred from $GE$ and $G$, as a transient interaction is present only at the time points when both of the associated proteins are in their active forms. In particular, at time point $t$, protein $i$ is considered to be in its active form if its expression value is greater than or equal to its active threshold, denoted $AT(i)$, as discussed in [22]. The active threshold for each protein is determined as follows:
$$AT(i) = u(i) + 3\sigma(i)\left(1 - F(i)\right) \qquad (1)$$
where $u(i) = \frac{1}{T}\sum_{t=1}^{T} GE_{it}$ and $\sigma(i)$ are the arithmetic mean and standard deviation of the expression values of protein $i$ over time points 1 to $T$, respectively, and $F(i) = 1/(1 + \sigma^{2}(i))$ is a weight function that reflects the fluctuation of the expression values of protein $i$. For more details, please refer to [22]. Each edge in the static PPI network (i.e., $e_{ij} \in E$) is present at time point $t$ if proteins $i$ and $j$ are both in their active states (i.e., $GE_{it} \ge AT(i)$ and $GE_{jt} \ge AT(j)$). The dynamic PPI networks can be represented by a set of graphs $G^{(t)} = (V, E^{(t)})$, $t = 1, \dots, T$, where $V$ denotes the original set of proteins and $E^{(t)}$ represents the set of edges present at time point $t$. In particular, $e_{ij}^{(t)} \in E^{(t)}$ if $S_{ij} = 1$ (i.e. stable interaction), or if $e_{ij} \in E$, $GE_{it} \ge AT(i)$ and $GE_{jt} \ge AT(j)$ (i.e. transient interaction). For each dynamic PPI network $G^{(t)}$, $A^{(t)} = \left[A_{ij}^{(t)}\right] \in \{0,1\}^{N \times N}$ is introduced to represent its adjacency matrix, where $A_{ij}^{(t)} = 1$ if $e_{ij}^{(t)} \in E^{(t)}$ and $A_{ij}^{(t)} = 0$ otherwise.
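As a rough sketch of this construction, the snippet below computes an active threshold per protein and then the edge set of one dynamic network. It assumes the three-sigma form $AT(i) = u(i) + 3\sigma(i)(1 - F(i))$ from [22] and, as a further assumption, uses the population standard deviation; the function names and toy expression values are hypothetical.

```python
from statistics import mean, pstdev


def active_threshold(profile):
    """AT(i) = u + 3*sigma*(1 - F) with F = 1 / (1 + sigma^2) (assumed form)."""
    u = mean(profile)
    sigma = pstdev(profile)  # assumption: population standard deviation
    f = 1.0 / (1.0 + sigma ** 2)
    return u + 3.0 * sigma * (1.0 - f)


def dynamic_edges(edges, stable, ge, t):
    """Edges present at time index t: stable edges, plus edges whose two
    endpoints are both at or above their active thresholds at t."""
    at = {p: active_threshold(profile) for p, profile in ge.items()}
    active = {p for p, profile in ge.items() if profile[t] >= at[p]}
    return {e for e in edges if e in stable or (e[0] in active and e[1] in active)}


# Hypothetical toy data: 3 proteins, 4 time points (0-based time indices).
ge = {
    "A": [0.1, 0.1, 0.1, 0.5],  # active only at t = 3
    "B": [0.1, 0.1, 0.5, 0.5],  # active at t = 2 and t = 3
    "C": [0.5, 0.1, 0.1, 0.1],  # active only at t = 0
}
edges = [("A", "B"), ("B", "C")]

edges_t3 = dynamic_edges(edges, set(), ge, 3)          # A and B both active
edges_t0 = dynamic_edges(edges, set(), ge, 0)          # only C active
edges_t0_stable = dynamic_edges(edges, {("B", "C")}, ge, 0)
```

Marking an edge as stable keeps it in every $E^{(t)}$ regardless of expression, which is what makes the stable interactions a shared backbone across the $T$ networks.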
Detecting overlapping temporal protein complexes
Our objective is to infer $D^{(t)}$ ($1 \le t \le T$), a sequence of time-evolving protein complexes, from the dynamic networks $G^{(t)}$ ($1 \le t \le T$). Let $D^{(t)} = \left\{D_k^{(t)}, k = 1, \dots, r_t\right\}$ contain the $r_t$ predicted complexes at time point $t$. We define an $N \times r_t$ protein-complex assignment matrix $H^{(t)}$ to indicate the membership of proteins in complexes, where $H_{ik}^{(t)} = 1$ if protein $i$ belongs to complex $D_k^{(t)}$ (i.e. $i \in D_k^{(t)}$), and $H_{ik}^{(t)} = 0$ otherwise. We allow a protein to occur in multiple complexes simultaneously, i.e. $H_{ik}^{(t)} = 1$ and $H_{iz}^{(t)} = 1$ with $k \ne z$. Clearly, once $H^{(t)}$ is computed, $D^{(t)}$ follows directly.
We further introduce another $N \times N$ matrix $U^{(t)}$, where each element $U_{ij}^{(t)}$ is the number of predicted complexes in $D^{(t)}$ that contain both proteins $i$ and $j$, i.e., $U_{ij}^{(t)} = \left|\left\{D_k^{(t)} \in D^{(t)} : i \in D_k^{(t)}, j \in D_k^{(t)}, 1 \le k \le r_t\right\}\right|$. Clearly, $U^{(t)}$ represents the co-complex membership among proteins at time point $t$, which allows a protein to belong to more than one complex. Moreover, we have $U_{ij}^{(t)} = \sum_{k=1}^{r_t} H_{ik}^{(t)} H_{jk}^{(t)}$.
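A small worked example may help to make the relation between the assignment matrix and the co-complex matrix concrete. The matrix below is a hypothetical toy case with four proteins and two complexes, in which protein 1 belongs to both complexes.

```python
def cocomplex_counts(h):
    """U[i][j] = sum_k H[i][k] * H[j][k]: complexes shared by proteins i and j."""
    return [[sum(a * b for a, b in zip(row_i, row_j)) for row_j in h] for row_i in h]


# Rows are proteins 0..3, columns are complexes 0..1 (hypothetical).
H = [
    [1, 0],
    [1, 1],  # overlapping protein: member of both complexes
    [0, 1],
    [0, 1],
]
U = cocomplex_counts(H)
# U == [[1, 1, 0, 0],
#       [1, 2, 1, 1],
#       [0, 1, 1, 1],
#       [0, 1, 1, 1]]
```

The diagonal entry `U[1][1] == 2` records that protein 1 sits in two complexes, which is how overlap shows up in $U^{(t)}$.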
Model formulation
In order to predict $D^{(t)}$, we first infer $U^{(t)}$ from the dynamic networks $G^{(t)}$, $1 \le t \le T$. In particular, we study the following three factors that are relevant for estimating $U^{(t)}$.
Firstly, based on the assumption that proteins belonging to the same complex tend to interact with each other, $H_{ik}^{(t)} H_{jk}^{(t)}$ represents the expected number of interactions in complex $k$ that lie between proteins $i$ and $j$ at time point $t$. Considering all complexes at time point $t$, $U_{ij}^{(t)}$ represents the expected total number of interactions between proteins $i$ and $j$ over all $r_t$ complexes. Similarly to [13, 33], we assume that the observed interaction between proteins $i$ and $j$ at time point $t$ is independently generated by a Poisson distribution with mean $U_{ij}^{(t)}$. Given this generative model, we can estimate $U^{(t)}$ from $A^{(t)}$ by maximizing the following likelihood function:
$$P\left(A^{(t)} \mid U^{(t)}\right) = \prod_{i,j} \frac{\left(U_{ij}^{(t)}\right)^{A_{ij}^{(t)}} \exp\left(-U_{ij}^{(t)}\right)}{A_{ij}^{(t)}!} \qquad (2)$$
Taking the negative logarithm and dropping constants, maximizing the above likelihood function is equivalent to minimizing the following loss function:
$$\sum_{i,j} \left( U_{ij}^{(t)} - A_{ij}^{(t)} \log U_{ij}^{(t)} \right) \qquad (3)$$
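To illustrate how the Poisson loss behaves, the sketch below evaluates $\sum_{i,j} (U_{ij} - A_{ij} \log U_{ij})$, the negative Poisson log-likelihood with the $\log A_{ij}!$ constants dropped, for two candidate matrices. The small `eps` inside the logarithm is an assumption added only to avoid `log(0)`; the matrices are hypothetical.

```python
import math


def poisson_loss(a, u, eps=1e-12):
    """Negative Poisson log-likelihood of adjacency a under means u, up to constants."""
    total = 0.0
    for a_row, u_row in zip(a, u):
        for a_ij, u_ij in zip(a_row, u_row):
            total += u_ij - a_ij * math.log(u_ij + eps)  # eps guards log(0)
    return total


# Hypothetical 2-protein network with a single interaction between them.
A = [[0, 1],
     [1, 0]]
U_match = [[0.01, 1.0],
           [1.0, 0.01]]     # high expected count where the edge is observed
U_mismatch = [[1.0, 0.01],
              [0.01, 1.0]]  # high expected count where no edge is observed

loss_match = poisson_loss(A, U_match)
loss_mismatch = poisson_loss(A, U_mismatch)
```

The loss is lower for `U_match`, since a large $U_{ij}$ is penalised linearly when no edge is observed, while a small $U_{ij}$ is penalised logarithmically when an edge is observed.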
Secondly, stable interactions are preserved across all the dynamic PPI networks, whereas transient interactions are present only at specific time points and absent at others. Therefore, we introduce a smoothness regularization term $R$ to enforce that the stable interactions (with $S_{ij} = 1$) and their corresponding complex memberships $U_{ij}^{(t)}$ in $U^{(t)}$ change smoothly over time, rather than dramatically between two consecutive time points. Here, the smoothness regularization term $R_t = \sum_{i,j} S_{ij} \left(U_{ij}^{(t+1)} - U_{ij}^{(t)}\right)^2$ captures the temporal smoothness between $U_{ij}^{(t)}$ and $U_{ij}^{(t+1)}$. Correspondingly, $R = \sum_{t=1}^{T-1} R_t$ measures the overall smoothness across all time points.
Finally, as $U_{ij}^{(t)} = \sum_{k=1}^{r_t} H_{ik}^{(t)} H_{jk}^{(t)}$, the rank of the matrix $U^{(t)}$ cannot be larger than the number of complexes $r_t$. As we have no prior knowledge of $r_t$, a low rank restriction on each $U^{(t)}$ is needed when estimating $U^{(t)}$. In this paper, we use the trace norm $\|U^{(t)}\|_{\ast}$ as a relaxation of the low rank constraint [32], which prevents our model from producing too many complexes and controls the overlaps among complexes. In particular, $\|U^{(t)}\|_{\ast}$ is the sum of the singular values of $U^{(t)}$. From this definition it follows that $\|U^{(t)}\|_{\ast} = \|H^{(t)}\|_{F}^{2}$, where $\|\cdot\|_{F}$ denotes the Frobenius norm.
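The identity between the trace norm of $U^{(t)}$ and the squared Frobenius norm of $H^{(t)}$ can be verified in three short steps. The derivation below is a sketch that writes $U^{(t)} = H^{(t)} {H^{(t)}}^{\top}$ in matrix form; the symbols $s_m$ (singular values) and $\nu_m$ (eigenvalues) are introduced here purely for illustration.

```latex
\begin{aligned}
U^{(t)} &= H^{(t)} {H^{(t)}}^{\top} \succeq 0
  && \text{(symmetric positive semidefinite)} \\
\left\| U^{(t)} \right\|_{\ast}
  &= \sum_{m} s_m\!\left(U^{(t)}\right)
   = \sum_{m} \nu_m\!\left(U^{(t)}\right)
   = \operatorname{tr}\!\left(U^{(t)}\right)
  && \text{(for such matrices, singular values equal eigenvalues)} \\
\operatorname{tr}\!\left(H^{(t)} {H^{(t)}}^{\top}\right)
  &= \sum_{i=1}^{N} \sum_{k=1}^{r_t} \left(H^{(t)}_{ik}\right)^{2}
   = \left\| H^{(t)} \right\|_{F}^{2}
\end{aligned}
```

In practice this means that the trace norm penalty never requires a singular value decomposition: it reduces to a sum of squared entries of $H^{(t)}$.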
Temporal protein complex detection
Taking into account the above three factors and dropping constants, our objective function, which combines the loss function, the smoothness regularization term and the low rank constraint, is defined as follows:
$$\min_{U^{(1)}, \dots, U^{(T)}} \; \sum_{t=1}^{T} \sum_{i,j} \left( U_{ij}^{(t)} - A_{ij}^{(t)} \log U_{ij}^{(t)} \right) + \lambda \sum_{t=1}^{T-1} \sum_{i,j} S_{ij} \left( U_{ij}^{(t+1)} - U_{ij}^{(t)} \right)^{2} + \beta \sum_{t=1}^{T} \left\| U^{(t)} \right\|_{\ast} \qquad (4)$$
where $\lambda \ge 0$ and $\beta \ge 0$ are the tradeoff parameters that control the balance among the three factors. The optimization problem (4) is combinatorial, as $U^{(t)}$ specifies all the possible co-complex memberships among proteins at time point $t$. Exhaustive search is therefore impractical, since there are exponentially many possible combinations. To address this problem, we relax the constraints on $U^{(t)}$ and $H^{(t)}$ from integers to real numbers, with $U^{(t)} \ge 0$ and $H^{(t)} \ge 0$.
Ideally, we could first compute the optimal solution $\hat{U}^{(t)}$ and then easily extract a set of predicted complexes from it. However, because of the real number relaxation, while $\hat{U}^{(t)}$ could approximate the underlying complex structure of $A^{(t)}$, it may not have a clear block structure that indicates protein complexes, i.e. groups in which all protein pairs have high $\hat{U}^{(t)}$ values. We would therefore still need to extract clusters from $\hat{U}^{(t)}$ via clustering methods such as spectral clustering. In this paper, instead of taking two steps to infer $H^{(t)}$, we present a novel Time Smooth Overlapping Complex Detection (TSOCD) model, whose objective function is obtained by substituting $U_{ij}^{(t)} = \sum_{k=1}^{r_t} H_{ik}^{(t)} H_{jk}^{(t)}$ into (4):
$$\min_{H^{(t)} \ge 0,\; 1 \le t \le T} \; \sum_{t=1}^{T} \sum_{i,j} \left[ \left(H^{(t)} {H^{(t)}}^{\top}\right)_{ij} - A_{ij}^{(t)} \log \left(H^{(t)} {H^{(t)}}^{\top}\right)_{ij} \right] + \lambda \sum_{t=1}^{T-1} \sum_{i,j} S_{ij} \left[ \left(H^{(t+1)} {H^{(t+1)}}^{\top}\right)_{ij} - \left(H^{(t)} {H^{(t)}}^{\top}\right)_{ij} \right]^{2} + \beta \sum_{t=1}^{T} \left\| H^{(t)} \right\|_{F}^{2} \qquad (5)$$
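The way the three terms combine can be sketched as a plain function of the assignment matrices. The snippet below assumes the objective is the Poisson loss, plus $\lambda$ times the smoothness term on stable pairs, plus $\beta$ times the squared Frobenius norms, as described in the text; it only evaluates the objective and does not perform the optimization. Function names, the `eps` guard and the toy inputs are assumptions made for this example.

```python
import math


def gram(h):
    """Return H H^T for a matrix stored as a list of rows."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in h] for ri in h]


def tsocd_objective(a_seq, h_seq, s, lam, beta, eps=1e-12):
    """Poisson loss + lam * smoothness on stable pairs + beta * ||H||_F^2 terms."""
    u_seq = [gram(h) for h in h_seq]
    n = len(s)
    loss = sum(
        u[i][j] - a[i][j] * math.log(u[i][j] + eps)  # eps guards log(0)
        for a, u in zip(a_seq, u_seq) for i in range(n) for j in range(n)
    )
    smooth = sum(
        s[i][j] * (u_next[i][j] - u_cur[i][j]) ** 2
        for u_cur, u_next in zip(u_seq, u_seq[1:])
        for i in range(n) for j in range(n)
    )
    frob = sum(x * x for h in h_seq for row in h for x in row)
    return loss + lam * smooth + beta * frob


# Hypothetical toy case: 2 proteins joined by a stable edge, 2 time points.
A = [[0, 1], [1, 0]]
S = [[0, 1], [1, 0]]
steady = [[[1.0], [1.0]], [[1.0], [1.0]]]           # same complex at both times
jumpy = [[[1.0], [1.0]], [[1.0, 0.0], [0.0, 1.0]]]  # complex splits at time 2

obj_steady = tsocd_objective([A, A], steady, S, lam=2 ** -4, beta=2 ** 4)
obj_jumpy = tsocd_objective([A, A], jumpy, S, lam=2 ** -4, beta=2 ** 4)
```

Splitting the complex at the second time point is penalised twice: the Poisson loss rises because the observed edge is no longer explained, and the smoothness term rises because the co-complex value of a stable pair changes.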
With objective (5), we can extract clusters directly from the optimal solution $\hat{H}^{(t)}$. To solve the objective function (5), we adopt the multiplicative update rules [34], which are special cases of the gradient descent method with automatic step size selection and naturally preserve the nonnegativity of $H^{(t)}$. Please refer to Additional file 1 for more details. Note that each element $\hat{H}_{ik}^{(t)}$ of $\hat{H}^{(t)}$ is a continuous value describing the propensity of protein $i$ to belong to predicted complex $k$. We discretize $\hat{H}^{(t)}$ into the final protein-complex assignment matrix $H^{(t)\star}$ with the rule in Equation (6). In particular, we assign protein $i$ to predicted complex $k$ if the value of $\hat{H}_{ik}^{(t)}$ exceeds a threshold $\tau$:
$$H_{ik}^{(t)\star} = \begin{cases} 1, & \text{if } \hat{H}_{ik}^{(t)} > \tau \\ 0, & \text{otherwise} \end{cases} \qquad (6)$$
Here, $H_{ik}^{(t)\star} = 1$ indicates that protein $i$ is in predicted complex $k$ at time point $t$, while $H_{ik}^{(t)\star} = 0$ indicates that it is not. In this study, the value of $\tau$ is set to 0.3, the same as in [14] (in the next section, we discuss how changing this parameter affects the final results). In addition, we only consider predicted complexes with at least three proteins [12]. The detailed TSOCD algorithm for identifying temporal protein complexes is illustrated in Additional file 1: Figure S4.
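The discretization and size filter can be sketched as follows. This is an illustrative reading of the rule, assuming a strict inequality for "exceeds" and using hypothetical propensity values; the function name is not from the original work.

```python
def discretize(h_hat, tau=0.3, min_size=3):
    """Turn continuous propensities into complexes (sets of protein indices).

    Protein i joins complex k when h_hat[i][k] > tau (strict inequality is an
    assumption); complexes with fewer than min_size proteins are discarded.
    """
    complexes = []
    for k in range(len(h_hat[0])):
        members = {i for i, row in enumerate(h_hat) if row[k] > tau}
        if len(members) >= min_size:
            complexes.append(members)
    return complexes


# Hypothetical propensities for 5 proteins and 2 candidate complexes.
h_hat = [
    [0.9, 0.0],
    [0.8, 0.4],
    [0.5, 0.7],
    [0.1, 0.6],
    [0.0, 0.2],
]
complexes = discretize(h_hat)            # [{0, 1, 2}, {1, 2, 3}]
strict = discretize(h_hat, tau=0.5)      # [] since both candidates shrink below 3
```

Proteins 1 and 2 end up in both complexes, which is the overlap the model is designed to allow; raising $\tau$ shrinks every complex and can remove small ones altogether.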
Merging temporal protein complexes
Since the dynamic PPI networks $G^{(t)}$ ($1 \le t \le T$) contain a considerable fraction of stable interactions, some complexes detected at different time points will be quite similar. We therefore need to merge those similar complexes to generate a final set of predicted complexes. Note that we only match and merge very similar complexes, and retain the time-specific complexes that occur only in certain dynamic PPI networks.
In this paper, we use a Nonnegative Matrix Factorization (NMF) model to merge similar temporal protein complexes. NMF provides a low rank approximation of a nonnegative matrix and has been widely used as a clustering method [35, 36]. After we compute a series of protein-complex assignment matrices $\mathcal{H} = \left\{H^{(1)\star}, \dots, H^{(T)\star}\right\}$, a combined protein-complex assignment matrix $Y$ is defined as $Y = [H^{(1)\star}, \dots, H^{(T)\star}]$. According to this definition, the matrix $Y = [Y_{il}] \in \{0,1\}^{N \times L}$ contains $N$ rows and $L = r_1 + \dots + r_T$ columns, each of which represents a complex detected at the corresponding time point, where $Y_{il} = 1$ if protein $i$ belongs to complex $l$ and $Y_{il} = 0$ otherwise. Our objective is to detect similar complexes from $Y$.
Assuming that there are $K$ final complexes inherent in $Y$, we formulate the nonnegative matrix factorization of $Y$ as:
$$Y \approx WB \qquad (7)$$
where $W \in \mathcal{R}_{+}^{N \times K}$, $B \in \mathcal{R}_{+}^{K \times L}$ and $\mathcal{R}_{+}$ denotes the set of nonnegative real numbers. The model is solved by DTU:Toolbox [37] via the multiplicative update method [34]. After calculating the solutions $\hat{W}$ and $\hat{B}$, we need to infer the group membership of each complex $l$ from $\hat{B}$. Here, complex $l$ is assigned to group $z$ if $z = \arg\max_{z} \hat{B}_{zl}$. Finally, we merge the complexes within each group and obtain the final set of predicted complexes. The flowchart of our proposed algorithm, including its two key steps, namely constructing dynamic PPI networks and detecting temporal protein complexes, is shown in Figure 1.
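The merging step can be sketched with a small, self-contained NMF. The snippet below is not the DTU:Toolbox implementation: it assumes a least-squares NMF objective with the standard Lee and Seung multiplicative updates, a fixed random seed, and merging by set union within each group. All function names and the toy complexes are hypothetical.

```python
import random


def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def nmf(y, k, iters=300, seed=0, eps=1e-9):
    """Least-squares NMF, Y ~ W B, via multiplicative updates (assumed variant)."""
    rng = random.Random(seed)
    n, l = len(y), len(y[0])
    w = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n)]
    b = [[rng.uniform(0.1, 1.0) for _ in range(l)] for _ in range(k)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, y), matmul(matmul(wt, w), b)
        b = [[b[r][c] * num[r][c] / (den[r][c] + eps) for c in range(l)] for r in range(k)]
        bt = transpose(b)
        num, den = matmul(y, bt), matmul(w, matmul(b, bt))
        w = [[w[r][c] * num[r][c] / (den[r][c] + eps) for c in range(k)] for r in range(n)]
    return w, b


def merge_complexes(complexes, n_proteins, k):
    """Group complexes by argmax over the rows of B, then take unions per group."""
    y = [[1.0 if i in c else 0.0 for c in complexes] for i in range(n_proteins)]
    _, b = nmf(y, k)
    groups = {}
    for l, members in enumerate(complexes):
        z = max(range(k), key=lambda r: b[r][l])  # z = argmax_z B[z][l]
        groups.setdefault(z, set()).update(members)
    return list(groups.values())


# Hypothetical complexes collected from several time points (7 proteins).
complexes = [{0, 1, 2}, {0, 1, 2, 3}, {4, 5, 6}, {4, 5, 6}]
merged = merge_complexes(complexes, n_proteins=7, k=2)
```

Because each column of $Y$ is assigned to exactly one latent group, the number of merged complexes can never exceed $K$; groups that attract no column simply disappear, which is why a generous $K$ is harmless in this scheme.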
Results and discussion
In this section, we will first introduce the data, evaluation metrics and parameter settings. Then, we will present detailed experimental results.
Data, evaluation metrics and parameter settings
Protein interaction networks and time course gene expression data
Two yeast PPI networks have been employed for evaluating the performance of various complex detection methods: 1) the DIP PPI network [38], and 2) the BioGrid PPI network (version 3.1.77) [39]. DIP contains 21592 interactions among 4850 proteins, while BioGrid contains 59748 interactions among 5640 proteins.
Note that both DIP and BioGrid are aggregates of protein interactions obtained under different conditions or at different time points. In order to extract dynamic PPI networks from these datasets, we have used yeast metabolic cycle (YMC) gene expression microarrays [40] to infer stable and transient interactions. YMC reports the expression values of 3552 significant periodic genes [40] at 12 time points in each of three successive cycles (i.e. T = 12 in our experiments, with about 25 minutes per time interval). The raw data are available from the Gene Expression Omnibus (GEO) [41] under accession number GSE3431. Similar to [22], in our experiments the average expression value of each gene at the same time point across the three cycles is used as its expression value at that time point. Among the 3552 genes, 2389 occur in DIP and 3057 occur in BioGrid. We thus retain these genes and their corresponding interactions in DIP and BioGrid respectively.
Gold standard protein complexes
To measure whether the predicted complexes match with known experimentally determined protein complexes, we have chosen two benchmark complex sets as our gold standard. They are derived from CYC2008 [42] and MIPS [43] respectively. For both gold standard sets, to avoid selection bias, we filter out the proteins that are not involved in the two PPI networks. Moreover, we only consider complexes with at least 3 proteins.
Metrics
We utilize two independent quality criteria, namely the PR metric [44] and the f-measure [6], to evaluate the performance of various complex detection methods. The PR metric judges how well the predicted complexes match the known complexes, mainly by considering the percentage of their overlapping proteins. The f-measure is the harmonic mean of recall and precision, where recall measures how many known gold standard complexes are matched by the predicted complexes, while precision measures how many predicted complexes are matched with known complexes. The two metrics have complementary strengths and can thus evaluate the prediction performance from different perspectives. In addition, both give a value in the range of 0 to 1, where higher values indicate better performance.
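The f-measure computation can be sketched as below. The matching criterion is not specified in this section (it is deferred to Additional file 1), so the overlap score $|a \cap b|^2 / (|a| \, |b|)$ with a threshold of 0.25 is purely an assumption for illustration, as are the function names and the toy complexes. Both input lists are assumed to be non-empty.

```python
def overlap_score(a, b):
    """Assumed matching score between two complexes: |a & b|^2 / (|a| * |b|)."""
    inter = len(a & b)
    return inter * inter / (len(a) * len(b))


def f_measure(predicted, gold, threshold=0.25):
    """Return (f, precision, recall) for non-empty lists of protein sets."""
    matched_pred = sum(
        any(overlap_score(p, g) >= threshold for g in gold) for p in predicted
    )
    matched_gold = sum(
        any(overlap_score(g, p) >= threshold for p in predicted) for g in gold
    )
    precision = matched_pred / len(predicted)
    recall = matched_gold / len(gold)
    if precision + recall == 0:
        return 0.0, precision, recall
    return 2 * precision * recall / (precision + recall), precision, recall


# Hypothetical predicted and gold standard complexes (protein identifiers).
predicted = [{1, 2, 3}, {4, 5, 6}, {7, 8, 9}]
gold = [{1, 2, 3, 4}, {7, 8, 10}, {11, 12, 13}, {14, 15, 16}]
f, precision, recall = f_measure(predicted, gold)
# precision = 2/3, recall = 1/2, f = 4/7
```

Because the harmonic mean is dominated by the smaller of its two inputs, a method cannot score well by predicting very many complexes (high recall, low precision) or very few (the reverse).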
Please refer to the Additional file 1 for more detailed description about the two PPI networks, two gold standard complex sets, as well as two evaluation metrics.
Parameter setting
When extracting dynamic PPI networks from given static PPI networks, we distinguish stable interactions from transient interactions by calculating the PCCs of their associated gene pairs’ expression values across all time points (i.e., PCC(e_{ ij })). Physical interactions with PCC values greater than a certain cutoff δ are defined as stable interactions. To determine the cutoff threshold, we use the PCC values of all the physical interactions and fit the PCC distribution with two parametric distributions, assuming one from the stable interactions and the other from the transient interactions.
As shown in Figure 2(a), the frequency distribution histogram of the PCC values of all physical interactions in BioGrid shows that they can be sorted into two well separated classes (interactions in DIP have similar properties, as shown in Figure 2(b)). Therefore, we assume that the data consist of two distributional components: a proportion $\eta$ of Gaussian distributed stable interactions and a proportion $(1 - \eta)$ of Gaussian distributed transient interactions, which is consistent with the observed data. The proposed Gaussian mixture model (GMM) has the following form:
$$p(x) = \eta \, \mathcal{N}\left(\mu_1, \sigma_1^2; x\right) + (1 - \eta) \, \mathcal{N}\left(\mu_2, \sigma_2^2; x\right) \qquad (8)$$
where $\eta$ is the mixing proportion, with a value between 0 and 1, and $\mathcal{N}\left(\mu_1, \sigma_1^2; x\right)$ is the Gaussian density with mean $\mu_1$ and variance $\sigma_1^2$ evaluated at $x$ (and likewise for the second component).
We use the Expectation Maximization (EM) algorithm to estimate the parameters of the above two Gaussian distributions (i.e., $\mu_1$, $\sigma_1^2$, $\mu_2$ and $\sigma_2^2$) for each dataset. The probability density functions learned from the BioGrid and DIP data are shown in Figure 3(a) and (b) respectively. As shown in Figure 3, the two estimated distributions for each dataset are well separated. As stable interactions tend to be encoded by globally co-expressed gene pairs [27], the curve on the left may correspond to the estimated distribution of PCC values of transient interactions, while the curve on the right may correspond to the estimated distribution of PCC values of stable interactions. From Figure 3, we find that for both BioGrid and DIP, $\delta \in (0.2, 0.4)$ results in a relatively low rate of misclassification errors. Thus, we consistently use $\delta = 0.3$ in our experiments.
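The fitting procedure can be illustrated with a compact EM loop for a two-component, one-dimensional Gaussian mixture. The sketch below runs on synthetic values rather than the real PCC data; the initialisation at the data extremes, the fixed iteration count, and the bisection search for the point where the two weighted densities cross are all choices made for this example.

```python
import math
import random


def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_two_gaussians(xs, iters=100):
    """EM for a 2-component 1-D mixture; returns (eta, (mu_hi, var_hi), (mu_lo, var_lo)),
    where eta is the weight of the high-mean component."""
    mu_lo, mu_hi = min(xs), max(xs)  # simple initialisation (assumption)
    mean = sum(xs) / len(xs)
    var_lo = var_hi = sum((x - mean) ** 2 for x in xs) / len(xs)
    eta = 0.5
    for _ in range(iters):
        # E-step: responsibility of the high-mean component for each point.
        resp = []
        for x in xs:
            p_hi = eta * normal_pdf(x, mu_hi, var_hi)
            p_lo = (1 - eta) * normal_pdf(x, mu_lo, var_lo)
            resp.append(p_hi / (p_hi + p_lo))
        # M-step: re-estimate weights, means and variances.
        n_hi = sum(resp)
        n_lo = len(xs) - n_hi
        mu_hi = sum(r * x for r, x in zip(resp, xs)) / n_hi
        mu_lo = sum((1 - r) * x for r, x in zip(resp, xs)) / n_lo
        var_hi = sum(r * (x - mu_hi) ** 2 for r, x in zip(resp, xs)) / n_hi
        var_lo = sum((1 - r) * (x - mu_lo) ** 2 for r, x in zip(resp, xs)) / n_lo
        eta = n_hi / len(xs)
    return eta, (mu_hi, var_hi), (mu_lo, var_lo)


def crossing_point(eta, hi, lo, steps=60):
    """Bisection between the two means for the point where the weighted densities meet."""
    a, b = lo[0], hi[0]
    for _ in range(steps):
        m = (a + b) / 2
        if eta * normal_pdf(m, *hi) > (1 - eta) * normal_pdf(m, *lo):
            b = m
        else:
            a = m
    return (a + b) / 2


# Synthetic stand-in for PCC values: two well separated groups (hypothetical).
rng = random.Random(0)
xs = [rng.gauss(0.0, 0.1) for _ in range(500)] + [rng.gauss(0.7, 0.1) for _ in range(500)]
eta, hi, lo = fit_two_gaussians(xs)
cutoff = crossing_point(eta, hi, lo)
```

The crossing point is only one possible way to turn the fitted mixture into a cutoff; the text instead reads a suitable range off the fitted curves and fixes a single value inside it.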
Both TSOCD and NMF require the number of complexes to be specified, i.e. $\{r_1, \dots, r_T\}$ and $K$. With our low rank constraint on each $U^{(t)}$, we can give TSOCD relatively large values of $r_t$, since the model adaptively controls the number of generated complexes. When merging similar temporal protein complexes via nonnegative matrix factorization, similar complexes are likely to be associated with the same latent index, while irrelevant latent indexes always obtain lower associations. As such, the value of $K$ can also be relatively large, since irrelevant dimensions will be filtered out. In this study, the values of $r_t$ ($t = 1, \dots, T$) and $K$ are set to 1000, since our algorithm is not sensitive to their values.
Recall that TSOCD has three parameters, $\tau$, $\lambda$ and $\beta$, where $\tau$ is the threshold parameter, and $\lambda$ and $\beta$ control the effects of the smoothness regularization term $R$ and the low rank constraint respectively. To understand how these three parameters affect the performance of TSOCD, we perform sensitivity studies. In particular, we first keep $\tau = 0.3$, run TSOCD with different combinations of $\lambda \in \{2^{-7}, 2^{-6}, \dots, 2^{-1}\}$ and $\beta \in \{2^{0}, 2^{1}, \dots, 2^{6}\}$, and assess how well the predicted complexes match the gold standard sets. Then we fix $\lambda$ and $\beta$ at the values that result in the best performance, and study the effect of $\tau$ on the performance of TSOCD by setting $\tau = 0.1, 0.2, \dots, 0.6$ in turn. Moreover, in order to verify the generalization of TSOCD, we select the best parameter values by testing the performance of TSOCD on DIP and BioGrid in terms of f-measure with respect to the reference set MIPS. The performance of TSOCD on DIP and BioGrid with respect to the other reference set, CYC2008, can therefore validate the general performance of TSOCD.
From Figure 4 we observe that for a fixed value of $\lambda$, as the value of $\beta$ increases, the f-measure increases initially and decreases after reaching its maximum. Similarly, for a fixed value of $\beta$, as the value of $\lambda$ increases, the f-measure increases initially and decreases after reaching its maximum. Thus both $\beta$ and $\lambda$ contribute to improving the performance of TSOCD. Overall, we find that for DIP and BioGrid, $\lambda \in [2^{-4}, 2^{-3}]$ and $\beta \in [2^{4}, 2^{5}]$ give competitive results. On the other hand, Figure 5 shows that TSOCD is sensitive to $\tau$. Overall, TSOCD achieves its best performance when $\tau = 0.3$. In order to avoid evaluation bias and overestimation of the performance, we do not tune the parameters for a particular dataset and fix $\tau = 0.3$, $\lambda = 2^{-4}$ and $\beta = 2^{4}$ in the following experiments. Nevertheless, it is worth mentioning that better performance may be achieved if the parameters are tuned for a particular PPI dataset or a particular complex reference set.
Comparison with static complex detection methods
In order to demonstrate the benefits of using our constructed dynamic PPI networks, we compare our proposed TSOCD method with five state-of-the-art algorithms, namely ClusterONE [12], MCL [8], MINE [45], COACH [46] and SPICi [47], which were originally designed for detecting protein complexes from static PPI networks. We apply these five algorithms to the available static PPI networks (the full PPI networks, which comprise both stable and transient interactions) and apply TSOCD to our constructed dynamic PPI networks, and evaluate the predicted complexes in terms of the two metrics with respect to the two gold standards. Note that optimal parameters are set for MCL, MINE, COACH and SPICi to generate their best results (in terms of f-measure with respect to MIPS and CYC2008), while ClusterONE uses the default parameters set by its authors. For the detailed parameter settings of these five algorithms, please refer to Additional file 1.
We also apply TSOCD to static PPI networks, i.e., we discard the smoothness regularization term in the objective function (5) and take the static PPI networks as input (we denote this variant OCD). For fair comparison, optimal parameters are also set for OCD to generate its best results. In addition, for all 7 methods we discard predicted complexes with fewer than three proteins. Figure 6 shows the comparative performance of the 7 algorithms on the two PPI networks with respect to the benchmark complex set CYC2008. Moreover, Table 1 shows the size distribution of the complexes detected by the various algorithms, and the values of recall and precision for each algorithm.
As shown in Figure 6, for both DIP and BioGrid, TSOCD outperforms the 5 existing methods in terms of both metrics on the benchmark CYC2008 (similar results with respect to the MIPS benchmark are given in Additional file 1: Figure S2). For instance, on DIP data, TSOCD achieves the highest f-measure, 0.472, which is 8.5% higher than the second best f-measure, 0.435, achieved by SPICi. On BioGrid data, TSOCD also achieves the highest f-measure, 0.487, which is 15.4% higher than the second best f-measure, 0.422, achieved by ClusterONE. Table 1 shows that the good performance of TSOCD is due to both its high recall and its high precision. Additional file 1: Table S1 shows similar results with respect to the MIPS benchmark. Interestingly, we also observe that OCD achieves better performance than the 5 existing algorithms on both DIP and BioGrid data. Thus, even without time-course gene expression information, our method can be used to better detect complexes from static PPI networks. On the other hand, by taking the temporal gene expression data into account to construct dynamic PPI networks, our method is able to capture time-evolving protein complexes and thus detect complexes much more accurately.
Comparison with dynamic complex detection methods
Recently, Park et al. [28] proposed the Dynamical Hierarchical Agglomerative Clustering (DHAC) method to detect protein complexes from dynamic PPI networks, with two different versions, i.e., DHAC-const and DHAC-local. Existing methods such as ClusterONE, SPICi, MCL, COACH and MINE can also be adapted to handle each of the dynamic PPI networks across different time points. For a fair comparison, we also apply nonnegative matrix factorization (NMF) to merge the clusters predicted by each method into that method's final set of predicted complexes.
Figure 7 illustrates the comparison among all the above algorithms with respect to CYC2008 (the detailed comparative results of the various algorithms are listed in Additional file 1: Table S2). We observe that TSOCD consistently achieves better performance than the existing algorithms in terms of the two measures across DIP and BioGrid data (similar results are obtained with respect to the MIPS benchmark in Additional file 1: Figure S3). Moreover, some existing algorithms combined with our NMF model obtain notable gains in prediction accuracy on the dynamic PPI networks. For example, ClusterONE achieves an f-measure of 0.360 on the static DIP data, which increases to 0.427 on the dynamic DIP data. Similarly, SPICi achieves an f-measure of 0.355 on the static BioGrid data, which increases to 0.402 on the dynamic BioGrid data. Therefore, the information contained in dynamic PPI networks is indeed useful and complements the static PPI data for better complex detection.
Besides NMF, other algorithms can be used to merge similar complexes. A widely used alternative is based on the overlap between different complexes. To study the effectiveness of NMF for this merging step, we also apply the reduction strategy proposed by Wang et al. [22]. Since their method is based on the overlap between different complexes, the choice of the similarity threshold is important. In this study, the similarity threshold is set to 0.65 as recommended by the authors. For more details about this reduction strategy, please refer to [22]. The results of using it are shown in Additional file 1: Table S3, from which we can see that TSOCD still achieves the best performance. Furthermore, NMF is more accurate in merging similar complexes, since better precision and recall are obtained when NMF is used as the reduction strategy.
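To make the idea of overlap-based merging concrete, here is a minimal sketch of a greedy merge with a similarity threshold of 0.65. It is not the reduction strategy of Wang et al. [22] itself: the similarity function (Jaccard index) and the greedy merge order are assumptions chosen for illustration, and only the threshold value comes from the text.

```python
from typing import Iterable, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index |a & b| / |a | b|; an ASSUMED similarity, see lead-in."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def merge_similar(complexes: Iterable[Iterable[str]], threshold: float = 0.65) -> List[Set[str]]:
    """Greedily merge complexes whose similarity is at least `threshold`."""
    merged: List[Set[str]] = []
    for c in complexes:
        current = set(c)
        changed = True
        # Keep absorbing already-accepted complexes until nothing is similar enough.
        while changed:
            changed = False
            for i, m in enumerate(merged):
                if jaccard(current, m) >= threshold:
                    current |= merged.pop(i)
                    changed = True
                    break
        merged.append(current)
    return merged


# Toy example with hypothetical protein names.
toy_complexes = [{"A", "B", "C", "D"}, {"A", "B", "C"}, {"X", "Y", "Z"}]
toy_merged = merge_similar(toy_complexes)
```

In the toy example the first two complexes have a Jaccard index of 0.75 and are merged, while the third is kept separate.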
Detecting multifunctional proteins
Protein complexes predicted by various methods can be used for protein function prediction [48]: an unknown protein can be assigned the functions of the complex it belongs to. However, multifunctional proteins carry out different functions by interacting with different partners at different time points [11]. It is thus challenging for traditional complex detection methods to predict multifunctional proteins based on the static view of PPI networks, which cannot reflect the dynamic nature of real PPI networks. Our proposed TSOCD method, on the other hand, can handle this task well, as it is specially designed to detect time-evolving overlapping protein complexes by integrating PPI data with temporal gene expression data. Next, we present a case study to show how the complexes predicted by our method help to detect and analyze multifunctional proteins.
YOR210W is a multifunctional protein shared by three complexes, namely DNA-directed RNA polymerase I, DNA-directed RNA polymerase II, and DNA-directed RNA polymerase III [39, 42]. We run SPICi [47] (designed for non-overlapping complex detection) and ClusterONE [12] (designed for overlapping complex detection) on the static BioGrid data. Only one complex detected by SPICi includes YOR210W, as shown in Figure 8(a), so SPICi can assign only one function to it, i.e., DNA-directed RNA polymerase II. As shown in Figure 8(b), ClusterONE does better than SPICi and can assign two functions to the protein, namely DNA-directed RNA polymerase I and DNA-directed RNA polymerase II (for more examples, please refer to Additional file 1). Finally, our proposed TSOCD detects all 3 overlapping complexes, as shown in Figure 8(c), and is thus able to predict the multiple functions of proteins more accurately.
Moreover, when SPICi is run on the dynamic BioGrid PPI networks, it predicts two different complexes containing YOR210W, i.e., {YOR210W, YOR224C, YGR005C, YOR341W, YOR340C, YDR156W, YJL148W} and {YJL164C, YER125W, YHL024W, YOR151C, YOL005C, YGR005C, YOR210W, YOR224C}. These two complexes match the RNA polymerase I and II complexes. Recall that SPICi generates only one cluster involving YOR210W on the static BioGrid data. Hence, dynamic networks indeed provide more insight into the temporal activities of proteins during dynamic complex formation.
In addition, as shown in Figure 8(c), TSOCD predicts a novel protein, YIR010W, for both the DNA-directed RNA polymerase I and III complexes. As YIR010W interacts with most members of RNA polymerase I and all members of RNA polymerase III, we infer that YIR010W is likely to be multifunctional and highly related to RNA polymerase. From the literature, we find that YIR010W is a component of the MIND kinetochore complex, which is required for correct chromosome alignment and is related to the assembly of the RNA polymerase complex.
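The case study above amounts to asking which proteins appear in more than one predicted complex. A minimal sketch of that lookup is given below, using the two SPICi clusters from the dynamic BioGrid networks listed in the text; the cluster labels and function names are hypothetical and serve only as an illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set


def complex_membership(complexes: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Map each protein to the labels of the complexes that contain it."""
    membership: Dict[str, Set[str]] = defaultdict(set)
    for label, members in complexes.items():
        for protein in members:
            membership[protein].add(label)
    return membership


def multifunctional_candidates(
    complexes: Dict[str, Set[str]], min_complexes: int = 2
) -> Dict[str, List[str]]:
    """Proteins found in at least `min_complexes` complexes, with their complex labels."""
    membership = complex_membership(complexes)
    return {
        protein: sorted(labels)
        for protein, labels in membership.items()
        if len(labels) >= min_complexes
    }


# The two clusters containing YOR210W reported in the text; labels are hypothetical.
spici_dynamic = {
    "cluster_1": {"YOR210W", "YOR224C", "YGR005C", "YOR341W", "YOR340C", "YDR156W", "YJL148W"},
    "cluster_2": {"YJL164C", "YER125W", "YHL024W", "YOR151C", "YOL005C", "YGR005C", "YOR210W", "YOR224C"},
}
shared = multifunctional_candidates(spici_dynamic)
```

For these two clusters the shared proteins are YOR210W, YOR224C and YGR005C. Membership in several predicted complexes is only a candidate signal; whether a protein is truly multifunctional still has to be checked against annotations.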
Conclusion
In real biological environments, protein interaction networks are not static; they change dynamically across different time points [29]. Many existing protein complex mining methods, however, detect protein complexes from an overly simplified static PPI network model, which cannot capture the inherent dynamic nature of protein interactions or the modular temporal protein complexes.
Temporal protein complexes are typically formed by the dynamic assembly or disassembly of proteins to perform various biological functions [49]. As they better reflect the real-world dynamic molecular mechanisms inside cellular systems, it is crucial to detect them by systematically analyzing dynamic PPI networks. Although a few methods have been proposed to identify temporal protein complexes by applying static complex detection methods at each individual time point, they ignore the correlations between consecutive dynamic protein networks and thus cannot work well. In addition, these methods cannot generate overlapping protein complexes, so they do not reflect the biological observation that proteins are frequently involved in multiple protein complexes [6] to perform diverse biological functions.
To address these problems, in this study we introduce a novel Time Smooth Overlapping Complex Detection model (TSOCD) to detect overlapping temporal protein complexes from dynamic PPI networks. In particular, we construct a series of dynamic PPI networks by detecting stable interactions and transient interactions through the integration of protein interaction data and gene expression data. Our proposed TSOCD allows individual complexes to be assembled and disassembled across different time points. Furthermore, with the smoothness regularization term, our model can detect conserved protein complexes that play fundamental roles in cellular systems. The analysis on real biological data shows that TSOCD outperforms existing state-of-the-art temporal complex detection methods. Furthermore, with the constructed dynamic PPI networks, our method can detect multifunctional proteins more accurately. All the experimental results, including the predicted stable complexes and temporal complexes, are shown in Additional files 2 and 3. We also investigate the benefits of the smoothness regularization term by comparing against our model without this term. Our experimental results show that with the smoothness constraint, our method detects temporal protein complexes more accurately, as it better accounts for the conserved protein interactions between consecutive networks. The detailed comparisons are shown in Additional file 1.
In summary, compared with existing methods, our model has the following advantages:

We distinguish two different types of protein interactions when constructing dynamic PPI networks. In particular, the stable interactions are preserved across different time points to serve as the backbone of the protein interaction networks, while transient interactions are present only under certain conditions and thus occur in the dynamic part of the PPI networks.

It models the dynamic assembly process, i.e., it allows individual complexes to be assembled and disassembled across different time points. In addition, with smoothness regularization, it prevents the assigned co-complex similarity of proteins with stable interactions from changing too dramatically.

It generates overlapping temporal protein complexes, which reflect the biological reality of proteins’ multifunctional roles.

Finally, our proposed method is unsupervised and is thus generic enough to be applied to dynamic complex detection in other species.
The computational complexity of updating H^{(t)} is O(N^{2} r_{t}), where N is the number of proteins and r_{t} is the number of complexes at time t. Thus the overall time cost of TSOCD is O(N^{2} (r_{1} + … + r_{T}) I), where T is the number of time points and I is the number of iterations. In practice the time cost will be much smaller, since H^{(t)} is sparse and the number of proteins at each time point is less than N.
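To give a feel for the bound above, the following sketch evaluates N^2 (r_1 + … + r_T) I for a set of hypothetical sizes. The numbers are invented for illustration and are not taken from the experiments; the result is a dense upper bound on the operation count, not a measured running time.

```python
from typing import Sequence


def tsocd_cost_upper_bound(
    n_proteins: int, complexes_per_time: Sequence[int], n_iterations: int
) -> int:
    """Dense upper bound N^2 * (r_1 + ... + r_T) * I on the number of basic operations."""
    return n_proteins ** 2 * sum(complexes_per_time) * n_iterations


# Hypothetical sizes: 1000 proteins, three time points, 100 iterations.
example_bound = tsocd_cost_upper_bound(1000, [50, 60, 40], 100)
```

Because H^{(t)} is sparse and fewer than N proteins are present at each time point, the actual cost is expected to be well below this bound, as noted above.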
Applying our proposed TSOCD method to dynamic PPI networks can effectively track the underlying dynamic modular organization and provide new biological knowledge and insights about molecular systems. In this study, we use time-course gene expression data to help construct dynamic PPI networks, since it is one of the most abundant types of data that carry temporal information about proteins at the gene level. However, as it contains noise, the performance of our proposed algorithm could be limited by its quality. Moreover, there are several other related information sources, including genomics, functional genomics and genetics studies and their corresponding result datasets, biological pathway databases, cellular compartment information and biomedical ontologies. As such, in our future work, we will study how to reduce the noise in the gene expression data and how to incorporate other biological evidence to construct more accurate dynamic PPI networks, which could lead to further performance improvements in detecting temporal protein complexes.
References
 1.
Yu H, Braun P, Yıldırım MA, Lemmens I, Venkatesan K, Sahalie J, Hirozane-Kishikawa T, Gebreab F, Li N, Simonis N, Hao T, Rual JF, Dricot A, Vazquez A, Murray RR, Simon C, Tardivo L, Tam S, Svrzikapa N, Fan C, de Smet AS, Motyl A, Hudson ME, Park J, Xin X, Cusick ME, Moore T, Boone C, Snyder M, Roth FP, et al: High-quality binary protein interaction map of the yeast interactome network. Science. 2008, 322 (5898): 104-110. 10.1126/science.1158684.
 2.
Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, Rau C, Jensen LJ, Bastuck S, Dumpelfeld B, Edelmann A, Heurtier MA, Hoffman V, Hoefert C, Klein K, Hudak M, Michon AM, Schelder M, Schirle M, Remor M, Rudi T, Hooper S, Bauer A, Bouwmeester T, Casari G, Drewes G, Neubauer G, Rick JM, Kuster B, Bork P, et al: Proteome survey reveals modularity of the yeast cell machinery. Nature. 2006, 440 (7084): 631-636. 10.1038/nature04532.
 3.
Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, Li J, Pu S, Datta N, Tikuisis AP, Punna T, Peregrín-Alvarez JM, Shales M, Zhang X, Davey M, Robinson MD, Paccanaro A, Bray JE, Sheung A, Beattie B, Richards DP, Canadien V, Lalev A, Mena F, Wong P, Starostine A, Mand Vlasblom CJM, Wu S, Orsi C, Collins SR, et al: Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006, 440 (7084): 637-643. 10.1038/nature04670.
 4.
Gavin AC, Bosche M, Krause R, Grandi P, Marzioch M, Bauer A, Schultz J, Rick JM, Michon AM, Cruciat CM, Remor M, Hofert C, Schelder M, Brajenovic M, Ruffner H, Merino A, Klein K, Hudak M, Dickson D, Rudi T, Gnau V, Bauch A, Bastuck S, Huhse B, Leutwein C, Heurtier MA, Copley RR, Edelmann A, Querfurth E, Rybin V, et al: Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature. 2002, 415 (6868): 141-147. 10.1038/415141a.
 5.
Spirin V, Mirny LA: Protein complexes and functional modules in molecular networks. Proc Nat Acad Sci USA. 2003, 100 (21): 12123-12128. 10.1073/pnas.2032324100.
 6.
Li X, Wu M, Kwoh CK, Ng SK: Computational approaches for detecting protein complexes from protein interaction networks: a survey. BMC Genomics. 2010, 11 (Suppl 1): S3. 10.1186/1471-2164-11-S1-S3.
 7.
Lage K, Karlberg EO, Størling ZM, Olason PI, Pedersen AG, Rigina O, Hinsby AM, Tümer Z, Pociot F, Tommerup N, Moreau Y, Brunak S: A human phenome-interactome network of protein complexes implicated in genetic disorders. Nat Biotechnol. 2007, 25 (3): 309-316. 10.1038/nbt1295.
 8.
Enright AJ, Van Dongen S, Ouzounis CA: An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res. 2002, 30 (7): 1575-1584. 10.1093/nar/30.7.1575.
 9.
Bader GD, Hogue CW: An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics. 2003, 4: 2. 10.1186/1471-2105-4-2.
 10.
Cho YR, Hwang W, Ramanathan M, Zhang A: Semantic integration to identify overlapping functional modules in protein interaction networks. BMC Bioinformatics. 2007, 8: 265. 10.1186/1471-2105-8-265.
 11.
Becker E, Robisson B, Chapple CE, Guénoche A, Brun C: Multifunctional proteins revealed by overlapping clustering in protein interaction network. Bioinformatics. 2012, 28: 84-90. 10.1093/bioinformatics/btr621.
 12.
Nepusz T, Yu H, Paccanaro A: Detecting overlapping protein complexes in protein-protein interaction networks. Nat Methods. 2012, 9 (5): 471-472. 10.1038/nmeth.1938.
 13.
Zhang XF, Dai DQ, Ou-Yang L, Wu MY: Exploring overlapping functional units with various structure in protein interaction networks. PloS ONE. 2012, 7 (8): e43092. 10.1371/journal.pone.0043092.
 14.
Ou-Yang L, Dai DQ, Zhang XF: Protein complex detection via weighted ensemble clustering based on Bayesian nonnegative matrix factorization. PloS ONE. 2013, 8 (5): e62158. 10.1371/journal.pone.0062158.
 15.
Wang J, Peng X, Peng W, Wu FX: Dynamic protein interaction network construction and applications. Proteomics. 2014, 14 (4–5): 338-352.
 16.
Nooren I, Thornton JM: Diversity of protein–protein interactions. EMBO J. 2003, 22 (14): 3486-3492. 10.1093/emboj/cdg359.
 17.
Xiao Q, Wang J, Peng X, Wu FX: Detecting protein complexes from active protein interaction networks constructed with dynamic gene expression profiles. Proteome Sci. 2013, 11 (Suppl 1): S20. 10.1186/1477-5956-11-S1-S20.
 18.
Przytycka TM, Singh M, Slonim DK: Toward the dynamic interactome: it’s about time. Brief Bioinformatics. 2010, 11: 15-29. 10.1093/bib/bbp057.
 19.
Lo K, Raftery A, Dombek K, Zhu J, Schadt E, Bumgarner R, Yeung K: Integrating external biological knowledge in the construction of regulatory networks from time-series expression data. BMC Syst Biol. 2012, 6: 101. 10.1186/1752-0509-6-101.
 20.
Li XL, Tan YC, Ng SK: Systematic gene function prediction from gene expression data by using a fuzzy nearest-cluster method. BMC Bioinformatics. 2006, 7 (Suppl 4): S23. 10.1186/1471-2105-7-S4-S23.
 21.
Han JDJ, Bertin N, Hao T, Goldberg DS, Berriz GF, Zhang LV, Dupuy D, Walhout AJ, Cusick ME, Roth FP, Vidal M: Evidence for dynamically organized modularity in the yeast protein–protein interaction network. Nature. 2004, 430 (6995): 88-93. 10.1038/nature02555.
 22.
Wang J, Peng X, Li M, Pan Y: Construction and application of dynamic protein interaction network based on time course gene expression data. Proteomics. 2013, 13 (2): 301-312. 10.1002/pmic.201200277.
 23.
Yu H, Lin CC, Li YY, Zhao Z: Dynamic protein interaction modules in human hepatocellular carcinoma progression. BMC Syst Biol. 2013, 7 (5): 1-13.
 24.
Song L, Kolar M, Xing EP: KELLER: estimating time-varying interactions between genes. Bioinformatics. 2009, 25 (12): i128-i136.
 25.
Ahmed A, Xing EP: Recovering time-varying networks of dependencies in social and biological studies. Proc Nat Acad Sci. 2009, 106 (29): 11878-11883. 10.1073/pnas.0901910106.
 26.
Du N, Zhang Y, Li K, Gao J, Mahajan SD, Nair BB, Schwartz SA, Zhang A: Evolutionary analysis of functional modules in dynamic PPI networks. Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine, 7–10 October, 2012; Orlando, Florida. 2012, New York: ACM, 250-257.
 27.
Das J, Mohammed J, Yu H: Genome-scale analysis of interaction dynamics reveals organization of biological networks. Bioinformatics. 2012, 28 (14): 1873-1878. 10.1093/bioinformatics/bts283.
 28.
Park Y, Bader JS: How networks change with time. Bioinformatics. 2012, 28 (12): i40-i48. 10.1093/bioinformatics/bts211.
 29.
Kim Y, Han S, Choi S, Hwang D: Inference of dynamic networks using time-course data. Brief Bioinformatics. 2014, 15 (2): 212-228. 10.1093/bib/bbt028.
 30.
Vinayagam A, Hu Y, Kulkarni M, Roesel C, Sopko R, Mohr SE, Perrimon N: Protein complex-based analysis framework for high-throughput data sets. Sci Signal. 2013, 6 (264): rs5.
 31.
Srihari S, Leong HW: Temporal dynamics of protein complexes in PPI networks: a case study using yeast cell cycle dynamics. BMC Bioinformatics. 2012, 13 (Suppl 17): S16
 32.
Chen Y, Kawadia V, Urgaonkar R: Detecting overlapping temporal community structure in time-evolving networks. arXiv preprint arXiv:1303.7226. 2013.
 33.
Ball B, Karrer B, Newman M: Efficient and principled method for detecting communities in networks. Phys Rev E. 2011, 84 (3): 036103
 34.
Lee DD, Seung HS: Algorithms for non-negative matrix factorization. Adv Neural Inf Process Syst. 2001, Cambridge: The MIT Press, 556-562.
 35.
Lee D, Seung H: Learning the parts of objects by non-negative matrix factorization. Nature. 1999, 401 (6755): 788-791. 10.1038/44565.
 36.
Ding C, He X, Simon H: On the equivalence of nonnegative matrix factorization and spectral clustering. In Proceedings of the SIAM International Conference on Data Mining (SDM’05). 2005, Philadelphia: Society for Industrial and Applied Mathematics, 606-610.
 37.
Schmidt MN, Laurberg H: Nonnegative matrix factorization with Gaussian process priors. Comput Intell Neurosci. 2008, 2008: 3
 38.
Salwinski L, Miller CS, Smith AJ, Pettit FK, Bowie JU, Eisenberg D: The database of interacting proteins: 2004 update. Nucleic Acids Res. 2004, 32 (suppl 1): D449-D451.
 39.
Chatr-aryamontri A, Breitkreutz BJ, Heinicke S, Boucher L, Winter A, Stark C, Nixon J, Ramage L, Kolas N, O’Donnell L, Reguly T, Breitkreutz A, Sellam A, Chen D, Chang C, Rust J, Livstone M, Oughtred R, Dolinski K, Tyers M: The BioGRID interaction database: 2013 update. Nucleic Acids Res. 2013, 41 (D1): D816-D823. 10.1093/nar/gks1158.
 40.
Tu BP, Kudlicki A, Rowicka M, McKnight SL: Logic of the yeast metabolic cycle: temporal compartmentalization of cellular processes. Science. 2005, 310 (5751): 1152-1158. 10.1126/science.1120499.
 41.
Edgar R, Domrachev M, Lash AE: Gene expression omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 2002, 30: 207-210. 10.1093/nar/30.1.207.
 42.
Pu S, Wong J, Turner B, Cho E, Wodak SJ: Up-to-date catalogues of yeast protein complexes. Nucleic Acids Res. 2009, 37 (3): 825-831. 10.1093/nar/gkn1005.
 43.
Mewes HW, Amid C, Arnold R, Frishman D, Güldener U, Mannhaupt G, Münsterkötter M, Pagel P, Strack N, Stümpflen V, Warfsmann J, Ruepp A: MIPS: analysis and annotation of proteins from whole genomes. Nucleic Acids Res. 2004, 32 (suppl 1): D41-D44.
 44.
Song J, Singh M: How and when should interactome-derived clusters be used to predict functional modules and protein function?. Bioinformatics. 2009, 25 (23): 3143-3150. 10.1093/bioinformatics/btp551.
 45.
Rhrissorrakrai K, Gunsalus KC: MINE: module identification in networks. BMC Bioinformatics. 2011, 12: 192. 10.1186/1471-2105-12-192.
 46.
Wu M, Li X, Kwoh CK, Ng SK: A core-attachment based method to detect protein complexes in PPI networks. BMC Bioinformatics. 2009, 10: 169. 10.1186/1471-2105-10-169.
 47.
Jiang P, Singh M: SPICi: a fast clustering algorithm for large biological networks. Bioinformatics. 2010, 26 (8): 1105-1111. 10.1093/bioinformatics/btq078.
 48.
Altaf-Ul-Amin M, Shinbo Y, Mihara K, Kurokawa K, Kanaya S: Development and implementation of an algorithm for detection of protein complexes in large interaction networks. BMC Bioinformatics. 2006, 7: 207. 10.1186/1471-2105-7-207.
 49.
Chen B, Fan W, Liu J, Wu FX: Identifying protein complexes and functional modules: from static PPI networks to dynamic PPI networks. Brief Bioinformatics. 2013, 15 (2): 212-228.
Acknowledgements
This work is supported by the National Science Foundation of China [11171354, 61375033 and 61402190 to LOY, DQD, XFZ], the Ministry of Education of China [20120171110016 to LOY, DQD, XFZ], the Natural Science Foundation of Guangdong Province [S2013020012796 to LOY, DQD, XFZ], and the International Program Fund of 985 Project, Sun Yat-sen University.
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
LOY, DQD, XLL, MW, XFZ and PY conceived and designed the experiments. LOY and MW performed the experiments. LOY, MW and XFZ analyzed the data. LOY, MW and XFZ drafted the manuscript under the guidance and supervision of DQD and XLL. All coauthors have seen a draft copy of the manuscript and agree with its publication. All authors read and approved the final manuscript.
Electronic supplementary material
12859_2014_6646_MOESM1_ESM.pdf
Additional file 1: Supplementary figures and text. This file provides the supplementary figures referred to in the main text, together with text describing the detailed derivation of the solution to the Time Smooth Overlapping Complex Detection model, the data sets and evaluation methods used, the effects of different parts of the model, the effect of random starts, convergence analysis, and a brief description and detailed parameter settings of the compared clustering algorithms. (PDF 1 MB)
12859_2014_6646_MOESM2_ESM.xlsx
Additional file 2: Table S1. Complete lists of the predicted protein complexes. (XLSX 116 KB)
12859_2014_6646_MOESM3_ESM.xlsx
Additional file 3: Table S2. Complete lists of the predicted stable complexes and temporal complexes. (XLSX 46 KB)
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Ou-Yang, L., Dai, DQ., Li, XL. et al. Detecting temporal protein complexes from dynamic protein-protein interaction networks. BMC Bioinformatics 15, 335 (2014). https://doi.org/10.1186/1471-2105-15-335
Keywords
 Dynamic protein-protein interaction
 Gene expression
 Stable interaction
 Transient interaction
 Protein complex