Volume 14 Supplement 2
Selected articles from the Eleventh Asia Pacific Bioinformatics Conference (APBC 2013): Bioinformatics
A novel subgradient-based optimization algorithm for blockmodel functional module identification
- Yijie Wang^{1} and
- Xiaoning Qian^{1}
DOI: 10.1186/1471-2105-14-S2-S23
© Wang and Qian; licensee BioMed Central Ltd. 2013
Published: 21 January 2013
Abstract
Functional module identification in biological networks may provide new insights into the complex interactions among biomolecules for a better understanding of cellular functional organization. Most existing functional module identification methods are based on the optimization of network modularity and cluster networks into groups of nodes within which there is a higher-than-expected number of edges. However, module identification based solely on this topological criterion may not discover certain kinds of biologically meaningful modules within which nodes are sparsely connected but have similar interaction patterns with the rest of the network. In order to unearth more biologically meaningful functional modules, we propose a novel efficient convex programming algorithm based on the subgradient method with heuristic path generation to solve the problem in a recently proposed framework of blockmodel module identification. We have implemented our algorithm for large-scale protein-protein interaction (PPI) networks, including Saccharomyces cerevisiae and Homo sapiens PPI networks collected from the Database of Interacting Proteins (DIP) and the Human Protein Reference Database (HPRD). Our experimental results show that our algorithm achieves network clustering performance comparable to the more time-consuming simulated annealing (SA) optimization. Furthermore, preliminary results for identifying fine-grained functional modules in both biological networks and the comparison with the commonly adopted Markov Clustering (MCL) algorithm demonstrate the potential of our algorithm to discover new types of modules, within which proteins are sparsely connected but have significantly enriched biological functionalities.
Introduction
Biomolecules interact with each other in a complex modular manner to maintain normal cellular functionalities [1, 2]. Identifying recurrent functional modules may help better understand the functional organization of cells [3]. Many existing network clustering algorithms for functional module identification focus on identifying "modules" within which nodes are densely connected [4–6]. However, identifying modules using these existing computational approaches may artificially introduce biases from their module definitions and corresponding optimization methods [2]. Topologically defined modules may simply originate from the evolution process but do not necessarily correspond to functional units in cells [7]. In addition to densely self-connected modules, there are other topological structures in biological networks which capture important functional relationships among biomolecules. For example, transmembrane proteins, including receptors in many signal transduction pathways, have a special structure in which they rarely interact with each other but have similar interaction patterns with the rest of the network [2]. To identify functional modules with richer topological structures, blockmodel network clustering has recently been proposed for functional module identification in biological networks [2, 4, 8, 9].
The blockmodel module identification problem has been investigated for years [2, 7, 10] and has recently been used for blockmodeling functional modules within biological networks [2]. However, the resulting optimization problem is NP-hard, and its objective function is highly nonlinear and non-convex with many local optima. This makes it computationally prohibitive to obtain the optimal modules, especially for large-scale biological networks. The simulated annealing (SA) algorithm has been used to solve the optimization problem [2]. However, a slow cooling procedure is required to guarantee the solution quality. Furthermore, its computation time escalates quadratically with the number of modules to identify. Therefore, more efficient algorithms are needed for discovering fine-grained functional modules in genome-scale biological networks.
In this paper, an efficient optimization method, subgradient with path generation (SGPG), is proposed to solve this difficult non-convex combinatorial optimization problem. In order to achieve results close to global optima, SGPG combines the convex programming method, which uses the subgradient (SG) to efficiently obtain local optima, and a heuristic path generation (PG) strategy, which makes use of the obtained local optima to search for better solutions. We have applied SGPG as well as SA to functional module identification in two large-scale protein-protein interaction (PPI) networks: the Saccharomyces cerevisiae (Sce) PPI network from the Database of Interacting Proteins (DIP) [11] and the Homo sapiens (Hsa) PPI network collected from the Human Protein Reference Database (HPRD version 9) [12]. The results demonstrate that our new SGPG method achieves competitive performance numerically and biologically compared to SA but with significantly reduced computation time. Furthermore, we have applied SGPG and the Markov Clustering (MCL) algorithm [13] to find fine-grained modules of these two PPI networks. The results reveal that SGPG can identify additional biologically meaningful modules that MCL may miss, which may provide a better understanding of the functional organization of these PPI networks.
Blockmodel module identification
We first review the blockmodel module identification framework proposed in [2, 7, 10]. Any given network can be represented as a graph $\mathcal{G}=\left\{\mathcal{V},\mathcal{E}\right\}$, where $\mathcal{V}=\left\{{v}_{1},{v}_{2},\dots ,{v}_{N}\right\}$ denotes the set of N network nodes in $\mathcal{G}$, and $\mathcal{E}$ is the set of edges. The topology of the network $\mathcal{G}$ can be represented by an N × N adjacency matrix A, where each entry A_{ ij } represents the interaction between nodes v_{ i } and v_{ j }. The blockmodel framework introduces the image graph $\mathcal{M}=\left\{\mathcal{U},\mathcal{I}\right\}$ to abstract the functional roles of the nodes in the original network and to outline the primary interactions among functional modules. In the image graph $\mathcal{M}$, $\mathcal{U}=\left\{{u}_{1},\dots ,{u}_{q}\right\}$ represents the set of virtual module nodes in the module space and $\mathcal{I}$ preserves the interactions within $\mathcal{U}$. The topology of $\mathcal{M}$ can also be represented by a q × q adjacency matrix B, where the entry B_{ rs } denotes the interaction between modules u_{ r } and u_{ s }. For module identification, the mapping τ assigns N nodes in the original network $\mathcal{G}$ to q different modules in the image graph $\mathcal{M}$.
in which w_{ ij } denotes the weight of the corresponding edge in $\mathcal{G}$ (in this paper w_{ ij } = A_{ ij }); $M=\sum_{i\ne j}^{N}{w}_{ij}$ is used to restrict E(τ, B) between 0 and 1; and p_{ ij } denotes the penalty of mismatching for the corresponding absent edges, which can be determined by ${p}_{ij}=\frac{\sum_{k\ne i}{w}_{ik}\sum_{l\ne j}{w}_{lj}}{\sum_{k\ne l}{w}_{kl}}$ [10].
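As a concrete reading of the penalty formula, the following minimal Python sketch computes the matrix of p_{ ij } values from a weight matrix. The function name `penalty_matrix` and the toy path graph are illustrative assumptions, not part of the original method description.

```python
import numpy as np

def penalty_matrix(W):
    """p_ij = (sum_{k != i} w_ik) * (sum_{l != j} w_lj) / sum_{k != l} w_kl."""
    W = np.asarray(W, dtype=float).copy()
    np.fill_diagonal(W, 0.0)          # the sums in the formula skip the diagonal
    row_strength = W.sum(axis=1)      # sum over k != i of w_ik
    col_strength = W.sum(axis=0)      # sum over l != j of w_lj
    total = W.sum()                   # sum over k != l of w_kl
    return np.outer(row_strength, col_strength) / total

# Hypothetical toy network: the undirected path 0 - 1 - 2, with w_ij = A_ij.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
P = penalty_matrix(A)
print(P)
```

With w_{ ij } = A_{ ij } on an unweighted graph, the row and column sums are node degrees, so p_{ ij } is proportional to the product of the degrees of nodes i and j.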
The optimization problem (2) is NP-hard [2, 14]. In [10], SA was proposed to solve the optimization problem, with a time complexity that escalates with increasing q. In our biological application, the search space for annealing parameters also grows with q. SA is therefore very time-consuming when a large number q of functional modules must be found in large-scale networks.
Subgradient with path generation (SGPG)
We propose to speed up blockmodel module identification by combining convex programming with a heuristic path generation method. The basic idea is to first use the fast subgradient (SG) convex programming method to obtain local optima, and then use path generation (PG) to search for better solutions that approach the global optimum. We note that PG is newly proposed in this paper as a heuristic that can be combined with subgradient algorithms to efficiently solve this hard combinatorial optimization problem. The combination of SG (time complexity O(qN^{2})) and PG (time complexity O(q^{2}N^{2})) can dramatically reduce the computation time while achieving performance competitive with the SA method.
Subgradient convex programming (SG)
Blockmodel module identification in matrix form
Note that we have converted our maximization problem into a minimization problem for the convenience of introducing subgradient methods in convex programming [15]. We denote Q = S^{ T } (W - P) S with its entries ${Q}_{rs}={S}_{r}^{T}\left(W-P\right){S}_{s}$, where S_{ r } is the r th column of S. Again, with the optimal assignment matrix S, we can derive the topology of the image graph B: B_{ rs } = 1 if Q_{ rs } > 0, and 0 otherwise.
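The matrix form can be illustrated with a small Python sketch. It assumes that S is the N × q binary assignment matrix with S_{ ir } = 1 when node i belongs to module r (consistent with S_{ r } being a column of S), and it zeroes the diagonal of W - P to mirror the i ≠ j sums; both choices, as well as the helper names and the toy graph, are assumptions made for illustration.

```python
import numpy as np

def one_hot_assignment(labels, q):
    """Build the N x q assignment matrix S from a module label per node."""
    labels = np.asarray(labels)
    S = np.zeros((labels.size, q))
    S[np.arange(labels.size), labels] = 1.0
    return S

def image_graph(W, P, S):
    """Q = S^T (W - P) S and B_rs = 1 if Q_rs > 0, else 0."""
    Q = S.T @ (W - P) @ S
    B = (Q > 0).astype(int)
    return Q, B

# Hypothetical toy network: nodes {0, 1} never interact with each other,
# nodes {2, 3} never interact with each other, but every node in the first
# pair interacts with every node in the second pair.
W = np.array([[0, 0, 1, 1],
              [0, 0, 1, 1],
              [1, 1, 0, 0],
              [1, 1, 0, 0]], dtype=float)
k = W.sum(axis=1)
P = np.outer(k, k) / W.sum()
np.fill_diagonal(P, 0.0)            # assumption: only i != j pairs are scored

S = one_hot_assignment([0, 0, 1, 1], q=2)
Q, B = image_graph(W, P, S)
l1_norm = np.abs(Q).sum()           # unnormalized entrywise L1 norm of Q
print(Q, B, l1_norm)
```

In this toy case B has zeros on the diagonal and ones off the diagonal: the two modules are internally unconnected but linked to each other, which is the kind of sparsely connected module that a density-based criterion would not report.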
Subgradient
where S^{ t } is the current solution, <, > is the inner product operator, and the new objective function is from the first-order Taylor expansion. The problem (5) at each iteration is a linear programming problem to search for the local extreme point along the gradient ∇F(S^{ t }) as in steepest descent. However, as previously stated, F(S^{ t }) takes the matrix L_{1} norm, which is non-smooth, and therefore non-differentiable. To address this last complexity, we apply subgradient methods [15] to replace ∇F(S^{ t }) by a subgradient ∂F(S^{ t }) instead [16]:
Definition (Subgradient): A matrix $\partial F\in {\mathcal{R}}^{N\times q}$ is a subgradient of a function $F:{\mathcal{R}}^{N\times q}\to \mathcal{R}$ at the matrix $X\in {\mathcal{R}}^{N\times q}$ if $F\left(Z\right)\ge F\left(X\right)+\langle \partial F,\left(Z-X\right)\rangle ,\forall Z\in {\mathcal{R}}^{N\times q}$.
where 0 is an N × q matrix of all zeros. For our module identification problem, we have the following proposition derived from (6):
where α is a number in [-1, 1].
Proof: From (6), there always exists a $\bar{Q}$ satisfying ${\Vert \bar{Q}\Vert }_{{L}_{\infty }}\le 1$ and ${\Vert Q\Vert }_{{L}_{1}}=\langle \bar{Q},Q\rangle$. As $\partial {\Vert Q\Vert }_{{L}_{1}}=\partial \langle \bar{Q},Q\rangle$ and the subgradient of a differentiable function is equal to its gradient [16], we have $\partial F\left({S}^{t}\right)=-\partial \left[{\Vert Q\Vert }_{{L}_{1}}\right]=-\partial \langle \bar{Q},Q\rangle =-\partial \,\mathrm{tr}\left({\bar{Q}}^{T}{\left({S}^{t}\right)}^{T}\left(W-P\right){S}^{t}\right)=2\left(P-W\right){S}^{t}\bar{Q}$ when S^{ t } is close to the local minima. QED. □
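The closed form in the proof can be checked numerically. The sketch below assumes F(S) = -||S^{T}(W - P)S||_{L1} with the entrywise, unnormalized L1 norm, takes $\bar{Q}$ as the entrywise sign of Q (so that a zero entry of Q gets the value 0, one admissible choice in [-1, 1]), and uses a random symmetric matrix D in place of W - P. These are illustrative assumptions; the comparison against central finite differences is only meaningful at points where no entry of Q is zero.

```python
import numpy as np

def objective(S, D):
    """F(S) = -||S^T D S||_{L1}, with D standing in for W - P."""
    return -np.abs(S.T @ D @ S).sum()

def subgradient(S, D):
    """2 (P - W) S Q_bar = -2 D S Q_bar, assuming D (and hence Q) is symmetric."""
    Q = S.T @ D @ S
    Q_bar = np.sign(Q)               # entries in {-1, 0, 1}, so max-norm <= 1
    return -2.0 * D @ S @ Q_bar

rng = np.random.default_rng(0)
N, q = 6, 3
D = rng.normal(size=(N, N))
D = (D + D.T) / 2                    # symmetrize, as for an undirected network
S = rng.random((N, q))

G = subgradient(S, D)

# Central finite differences of F, one entry of S at a time.
eps = 1e-6
G_fd = np.zeros_like(S)
for i in range(N):
    for r in range(q):
        E = np.zeros_like(S)
        E[i, r] = eps
        G_fd[i, r] = (objective(S + E, D) - objective(S - E, D)) / (2 * eps)

print(np.max(np.abs(G - G_fd)))
```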
Convex programming algorithm
Using the Frank-Wolfe algorithm with the derived subgradient, we now have a conditional subgradient method [16] to iteratively solve the relaxed optimization problem, as shown in the following pseudo-code:
Algorithm: Conditional Subgradient
Input: initial value S^{ t }, t = 0.
Do:
(i) Compute the subgradient ∂F (S^{ t }).
(ii) Solve the minimization problem:
S* = arg min_{ S } <∂F (S^{ t }), S > s.t. S ∈ γ
(iii) Line search for the step size in the direction S* - S^{ t } found in (ii), update S^{ t }, t = t + 1.
Until: |ΔF| + ||ΔS^{ t }|| < ζ
Output: S^{ t }.
In this algorithm, step (ii) at each iteration can be solved using a generic linear programming solver in O((qN)^{3.5}). However, due to the special structure of the optimization problem, we instead solve it as a semi-linear assignment problem. As the assignment matrix [∂F(S^{ t })]_{ N × q } is not a square matrix, the optimization in step (ii) can be efficiently solved by assigning node i to module r, which is the index of the largest entry in row i of the subgradient ∂F(S^{ t }), with time complexity O(qN).
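The row-wise solution of step (ii) can be sketched as follows, assuming that the feasible set γ consists of nonnegative N × q matrices whose rows sum to one. For the minimization as written in step (ii), the linear objective decouples over rows and each row puts all its mass on its smallest entry; under the opposite sign convention (using the negated subgradient) the same rule reads as picking the largest entry. The function name `linear_step` and the example matrix are hypothetical.

```python
import numpy as np
from itertools import product

def linear_step(G):
    """argmin_S <G, S> over row-stochastic S: one-hot at each row's minimum."""
    N, q = G.shape
    S_star = np.zeros((N, q))
    S_star[np.arange(N), G.argmin(axis=1)] = 1.0   # O(qN) scan of the matrix
    return S_star

# Hypothetical 3 x 3 subgradient matrix.
G = np.array([[ 0.3, -1.2,  0.5],
              [ 2.0,  0.1,  0.4],
              [-0.7, -0.2, -0.9]])
S_star = linear_step(G)
value = float((G * S_star).sum())

# Brute force over all q^N hard assignments, for comparison on this tiny case.
brute = min(sum(G[i, r] for i, r in enumerate(labels))
            for labels in product(range(3), repeat=3))
print(S_star.argmax(axis=1), value, brute)
```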
To derive the solution to the original problem (4) from the results of the relaxed problem obtained by the conditional subgradient algorithm, we recover the closest feasible solution from the relaxed solution by a simple rounding strategy. Finally, we note that, because the objective function (3) is non-convex, the presented conditional subgradient algorithm only converges to a local stationary point of the combinatorial optimization problem (4), with worst-case complexity O(qN^{2}) [15]. Hence, good initialization is critical to get high quality results. In our current implementation, we initialize S^{ t } by a modified Expectation-Maximization (EM) algorithm presented in [8].
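Putting the three steps together, a minimal Python sketch of the conditional subgradient loop might look as follows. It is not the authors' implementation: it assumes F(S) = -||S^{T}(W - P)S||_{L1}, replaces the EM-based initialization with a random row-stochastic start, uses a coarse grid for the line search, and rounds by taking the largest entry in each row. All names and parameter values are hypothetical.

```python
import numpy as np

def F(S, D):
    """Assumed objective: negative entrywise L1 norm of Q = S^T D S."""
    return -np.abs(S.T @ D @ S).sum()

def conditional_subgradient(D, S0, max_iter=100, zeta=1e-9):
    S = S0.copy()
    steps = np.linspace(0.0, 1.0, 21)               # grid line search (assumption)
    history = [F(S, D)]
    for _ in range(max_iter):
        G = -2.0 * D @ S @ np.sign(S.T @ D @ S)     # (i) subgradient
        S_star = np.zeros_like(S)                   # (ii) row-wise linear step
        S_star[np.arange(S.shape[0]), G.argmin(axis=1)] = 1.0
        direction = S_star - S
        values = [F(S + a * direction, D) for a in steps]   # (iii) line search
        best = int(np.argmin(values))
        S_new = S + steps[best] * direction
        change = abs(values[best] - history[-1]) + np.linalg.norm(S_new - S)
        S = S_new
        history.append(values[best])
        if change < zeta:                           # |dF| + ||dS|| < zeta
            break
    labels = S.argmax(axis=1)                       # rounding to a hard assignment
    return labels, history

# Hypothetical toy network: two triangles joined by a single edge.
W = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    W[i, j] = W[j, i] = 1.0
k = W.sum(axis=1)
D = W - np.outer(k, k) / W.sum()
np.fill_diagonal(D, 0.0)

rng = np.random.default_rng(1)
S0 = rng.random((6, 2))
S0 /= S0.sum(axis=1, keepdims=True)                 # rows sum to one
labels, history = conditional_subgradient(D, S0)
print(labels, history[0], history[-1])
```

Because the step grid includes zero, the objective value never increases from one iteration to the next, but the result still depends on the starting point, which is why the text stresses initialization.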
Path generation (PG)
In order to make use of the local optima found by the above fast subgradient method, we propose a novel path generation method for our combinatorial optimization problem. The path generation method aims to conserve the overlap between two local optima and to improve on the part of the overlap that contributes significantly to the objective function value. Our new path generation is inspired by the path relinking method, which connects two combinatorial local optima and tries to find better results along the linking path [14]. However, our method does not relink two local optimal results but creates new paths by extracting potentially useful overlap between them.
The path generation based on (9) proceeds in the following manner. First, the most promising overlap Over(r_{ A }, s_{ B }) between modules r_{ A } and s_{ B } of the initiating solution x_{ A } and the guiding solution x_{ B } is identified by (9); then r_{ A } is locally adjusted to become Over(r_{ A }, s_{ B }) by removing nodes. After the adjustment, a new solution x_{1} is generated and C_{ A } = {r_{ A }} and C_{ B } = {s_{ B }}, where C_{ A } and C_{ B } denote the sets of used modules in the two solutions, respectively. Local search is then applied to find the improved ${x}_{1}^{*}$. We then preserve ${x}_{1}^{*}$ and let ${x}_{A}={x}_{1}^{*}$. The above procedure is repeated until no overlap exists or another relaxed termination condition is reached; for example, setting N_{ stop } = 5 stops the procedure once the overlap of the modules from the two solutions contains no more than five nodes. Finally, we obtain the best solution along the generated paths. The whole procedure is illustrated in the following pseudo-code:
Algorithm: Path Generation Method
Input: x_{ A }, x_{ B }, x, x_{ best }, N_{ stop }, C_{ A } = Ø, C_{ B } = Ø, Over = +∞, Q_{ best } = −∞
While( Over > N _{ stop } )
(1) (r_{ A }, s_{ B }) = argmax{S(r, s): r, s ∈ {1, ..., q} } and find Over(r_{ A }, s_{ B } );
(2) modify nodes from r_{ A } in x to make N_{ r } (x) = Over(r_{ A }, s_{ B }) and C_{ A } = {r_{ A }}, C_{ B } = {s_{ B }};
(3) $\left({Q}_{x}^{*},{x}^{*}\right)$ = LocalSearch(x);
(4) If $\left({Q}_{x}^{*}>{Q}_{best}\right)$
(5) Q_{ best } = ${Q}_{x}^{*}$ and x_{ best } = x*;
(6) EndIf
(7) x_{ A } = x* and find the next Over set using (9);
EndWhile
Output: x_{ best } and Q_{ best }.
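The overlap-selection step of the algorithm can be sketched in Python as follows. The score S(r, s) of (9) is not reproduced here, so this sketch simply ranks module pairs by the number of shared nodes, which is a stand-in assumption and not the authors' score; the function names and the two example solutions are also hypothetical.

```python
from collections import defaultdict

def module_members(labels):
    """Map each module index to the set of nodes assigned to it."""
    members = defaultdict(set)
    for node, module in enumerate(labels):
        members[module].add(node)
    return dict(members)

def best_overlap(labels_a, labels_b, used_a=(), used_b=()):
    """Return (r_A, s_B, shared nodes) for the unused pair with the largest overlap."""
    mods_a, mods_b = module_members(labels_a), module_members(labels_b)
    best = (None, None, set())
    for r, nodes_r in mods_a.items():
        if r in used_a:
            continue
        for s, nodes_s in mods_b.items():
            if s in used_b:
                continue
            shared = nodes_r & nodes_s
            if len(shared) > len(best[2]):      # stand-in for the score in (9)
                best = (r, s, shared)
    return best

# Two hypothetical local optima on eight nodes (module label per node).
x_a = [0, 0, 0, 1, 1, 2, 2, 2]
x_b = [1, 1, 0, 0, 0, 2, 2, 2]

r_a, s_b, over = best_overlap(x_a, x_b)
print(r_a, s_b, sorted(over))

# Next round: the modules already used are excluded, as with C_A and C_B.
r_next, s_next, over_next = best_overlap(x_a, x_b, used_a={r_a}, used_b={s_b})
print(r_next, s_next, sorted(over_next))
```

In the full procedure, module r_{ A } would then be shrunk to the shared nodes, local search applied, and the loop stopped once the best remaining overlap has no more than N_{ stop } nodes.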
Experimental results
We have applied our SGPG method to identify functional modules in two biological networks: the Saccharomyces cerevisiae PPI network from the Database of Interacting Proteins (DIP) [11] and the Homo sapiens PPI network from the Human Protein Reference Database (HPRD) [12]. We first show the efficiency of SGPG compared to the previous SA-based algorithms for functional module identification in the two networks with q = 10, 50 and 100. We further evaluate the potential of SGPG to identify biologically meaningful modules by contrasting its fine-grained modules (q = 500 for the Homo sapiens PPI network and q = 300 for the Saccharomyces cerevisiae PPI network) with those detected by the MCL algorithm [13]. We show that SGPG can unearth certain kinds of biologically meaningful modules that may not be detected by MCL.
Performance comparison between SA and SGPG
We first compare SA and SGPG for module identification in the two PPI networks with relatively small q = 10, 50, and 100, as SA requires very slow cooling procedures to guarantee the solution quality when q > 100. The Homo sapiens PPI network has a largest component of 9,270 nodes and 36,917 edges. The upper bound of the objective function value in (2) is ${Q}_{max}^{*}=0.98$ when we consider the original network itself as the image graph with q = 9,270. We have also applied our algorithm to the Saccharomyces cerevisiae PPI network, which has a largest component of 4,990 nodes and 21,911 edges, with the upper bound ${Q}_{max}^{*}=0.97$ when q = 4,990.
Parameter settings in SA and SGPG
Para. | C_{ β } | T_{ start } | T_{ end } | T_{ sweep } | T_{ switch } | N_{ set } | N_{ stop } |
---|---|---|---|---|---|---|---|
SA | 0.99 | 40 | 0.001 | 100 | 20 | - | - |
SGPG | - | - | - | - | - | 10 | 5 |
Comparison of SA and SGPG on Homo sapiens and Saccharomyces cerevisiae PPI networks
PPI | Method | Q*(q=10) | Time(h) | Q*(q=50) | Time(h) | Q*(q=100) | Time(h) |
---|---|---|---|---|---|---|---|
Homo sapiens | SA | 0.5393 | 1.73 | 0.6530 | 45.07 | 0.7180 | 180.26 |
Homo sapiens | SGPG | 0.5346 | 0.5 | 0.6452 | 1.95 | 0.6898 | 6.35 |
Saccharomyces cerevisiae | SA | 0.5692 | 1.35 | 0.6834 | 25.02 | 0.7544 | 102.65 |
Saccharomyces cerevisiae | SGPG | 0.5690 | 0.3 | 0.6752 | 1.15 | 0.7292 | 3.34 |
Comparison between SGPG and MCL
In order to verify the biological significance of the modules identified by blockmodel identification, we have applied both SGPG and MCL to detect fine-grained modules for both the Saccharomyces cerevisiae and Homo sapiens PPI networks. Because SA would take months to obtain results with q > 200, we have applied only SGPG in this section. By analyzing the modules detected by the two methods, we have found that SGPG discovers a number of GO-enriched modules comparable to MCL. More importantly, SGPG discovers additional biologically meaningful modules in which proteins are sparsely connected but have the same interaction patterns with the rest of the network.
Saccharomyces cerevisiae PPI network
We have identified fine-grained modules for the Saccharomyces cerevisiae PPI network using SGPG and MCL. We set q = 300 for SGPG and the inflation parameter I = 1.5 for MCL; MCL identified 370 modules in total. Among the identified modules, 296 modules from SGPG and 307 modules from MCL have more than two nodes. Of these, 150 and 153 modules, respectively, have significantly enriched GO terms according to GoTermFinder, with p-values below 1% after Bonferroni correction. SGPG thus performs competitively with MCL. More importantly, we find that SGPG can detect sparsely connected modules with certain interaction patterns that MCL fails to detect.
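For readers unfamiliar with the enrichment criterion, the following Python sketch shows one common way such a p-value is obtained: a hypergeometric tail probability followed by a Bonferroni correction. It is only an illustration under that assumption and does not reproduce GoTermFinder's exact procedure; the function names and the gene counts are hypothetical.

```python
from math import comb

def hypergeom_tail(k, n, K, N):
    """P(X >= k): n genes drawn from a population of N, K of which carry the term."""
    upper = min(n, K)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)

def bonferroni(p, n_tests):
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p * n_tests)

# Hypothetical numbers: a module of 3 genes, all 3 annotated with a term
# carried by 3 of the 10 genes in the background, with 5 terms tested.
p_raw = hypergeom_tail(k=3, n=3, K=3, N=10)
p_adj = bonferroni(p_raw, n_tests=5)
is_enriched = p_adj < 0.01            # the 1% threshold used in the text
print(p_raw, p_adj, is_enriched)
```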
Topological analysis of different KOG categories in Saccharomyces cerevisiae PPI network
KOG ID | Method | proteins | sparse modules/modules | Avg. density | Avg. clustering coef. |
---|---|---|---|---|---|
U | SGPG | 353 | 15/26 | 2.98% | 0.0814 |
U | MCL | 256 | 0/21 | 27.38% | 0.2402 |
K | SGPG | 359 | 6/24 | 6.68% | 0.1352 |
K | MCL | 361 | 0/19 | 26.35% | 0.1834 |
J | SGPG | 579 | 9/24 | 9.16% | 0.0678 |
J | MCL | 358 | 0/25 | 37.90% | 0.1429 |
T | SGPG | 169 | 13/21 | 3.47% | 0.0755 |
T | MCL | 94 | 0/12 | 31.31% | 0.0912 |
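The two topological columns of the table can be made concrete with a short Python sketch. The definitions used here are assumptions, since the exact ones are not given in this text: density is the fraction of possible node pairs inside a module that are connected, and the clustering coefficient of a node is computed in the whole network, with nodes of degree below two counted as zero (the treatment of such nodes is itself a known source of variation [19]). The toy network is hypothetical.

```python
def build_adjacency(edges):
    """Undirected adjacency as a dict mapping each node to a set of neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def module_density(adj, module):
    """Edges inside the module divided by the number of possible node pairs."""
    nodes = set(module)
    n = len(nodes)
    if n < 2:
        return 0.0
    internal = sum(len(adj[u] & nodes) for u in nodes) / 2   # each edge seen twice
    return internal / (n * (n - 1) / 2)

def local_clustering(adj, u):
    """Fraction of neighbor pairs of u that are themselves connected."""
    nbrs = adj[u] - {u}
    k = len(nbrs)
    if k < 2:
        return 0.0                       # assumption: leaves count as zero
    links = sum(len(adj[v] & nbrs) for v in nbrs) / 2
    return links / (k * (k - 1) / 2)

def mean_clustering(adj, module):
    return sum(local_clustering(adj, u) for u in module) / len(module)

# Hypothetical toy network: a triangle {0, 1, 2}, and three proteins {4, 5, 6}
# that never interact with each other but all bind the same partner, node 3.
adj = build_adjacency([(0, 1), (0, 2), (1, 2), (0, 3), (3, 4), (3, 5), (3, 6)])
dense, sparse = [0, 1, 2], [4, 5, 6]
print(module_density(adj, dense), mean_clustering(adj, dense))
print(module_density(adj, sparse), mean_clustering(adj, sparse))
```

The second module has zero density and zero clustering, yet its members share an identical interaction pattern, which is the situation the sparse-module counts in the table are meant to capture.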
Sparse modules in U and T KOG categories for Saccharomyces cerevisiae PPI network
KOG ID | Sparse module example | Enriched genes | Enriched GO Term | GO Level | p-value |
---|---|---|---|---|---|
U | YDR179C, YNL287W, YDL216C, YCR099C, YIL004C, YAL026C, YLR268W, YLR093C, YPR163C, YPR148C, YOL064C, YOL117W, YGL084C, YLR031W, YIL076W, YPL179W, YKL191W, YPL010W | YOL117W, YDR179C, YDL216C | protein deneddylation | [+8, 0] | 2.01e-5 |
T | YJL092W, YDR490C, YOR231W, YJL005W, YPL074W, YPL083C, YNL323W, YOL100W | YDR490C, YOL100W, YNL323W, YJL005W, YOR231W | signal transduction | [+3, -1] | 6.09e-5 |
T | YDR076W, YDL059C, YJL173C, YPL164C, YER171W, YPL026C, YCR079W, YPL150W, YHR169W, YJR062C | YDL059C, YPL026C, YER171W, YPL164C, YJL173C, YDR076W | response to endogenous stimulus | [+2, -1] | 4.77e-5 |
Homo sapiens PPI network
For the Homo sapiens PPI network, we set SGPG to identify q = 500 modules with the same settings as in Table 1. For MCL, we set its inflation parameter I = 1.5 and found 450 modules. We have performed GO enrichment analysis for the identified modules with more than two nodes (478 from SGPG and 380 from MCL). Based on GoTermFinder, 269 modules from SGPG and 265 modules from MCL are significantly enriched with p-values below 1% after Bonferroni correction. SGPG has thus discovered a number of GO-enriched modules comparable to MCL. We also note that the modules identified by SGPG are smaller on average than those from MCL; these modules have more specific enriched functionalities and may provide more detailed information for future cataloging of functional modules. More importantly, SGPG detects several modules with interesting functionalities that MCL has missed.
Following the same analysis method used in the previous section, we first annotate all the identified modules with KOG categories to scrutinize the differences between modules detected by SGPG and MCL. Figure 3B shows the percentages of the modules annotated to different KOG categories by both methods. SGPG detects more modules annotated in KOG T and K categories, within which functional modules tend to have sparsely connected structures. In contrast, MCL discovers more modules annotated in KOG U, within which functional modules tend to have a densely connected structure in the Homo sapiens PPI network.
Topological analysis of different KOG categories in Homo sapiens PPI network
KOG ID | Method | proteins | sparse modules/modules | Avg. density | Avg. clustering coef. |
---|---|---|---|---|---|
T | SGPG | 1970 | 59/126 | 4.91% | 0.0822 |
T | MCL | 2481 | 0/66 | 26.32% | 0.1696 |
K | SGPG | 878 | 27/59 | 3.15% | 0.0779 |
K | MCL | 916 | 0/37 | 30.41% | 0.1928 |
U | SGPG | 592 | 3/24 | 4.95% | 0.0448 |
U | MCL | 517 | 0/33 | 31.42% | 0.1359 |
Sparse modules in T and K KOG categories for Homo sapiens PPI network
KOG ID | Sparse module example | Enriched genes | Enriched GO Term | GO Level | p-value |
---|---|---|---|---|---|
T | NTRK1, NTRK3, NTRK2, VAV1, VAV3 | NTRK1, NTRK2, NTRK3 | neurotrophin receptor activity | [+3, -1] | 2.95e-9 |
T | PIK3R3, PIK3R2, PIK3R1 | PIK3R3, PIK3R2, PIK3R1 | phosphatidylinositol 3-kinase complex | [+5, -1] | 4.77e-9 |
K | JUN, JUNB, JUND, SPIB | JUN, JUNB, JUND | cellular response to calcium ion | [+6, -1] | 4.04e-7 |
Discussion
At present, most module identification methods for biological networks aim to find densely connected modules but ignore sparsely connected modules, which can arise in biological systems due to their special functionalities. Here, in order to find more biologically meaningful modules with both types of modular structures, we adopt a blockmodel framework which detects densely connected modules and sparsely connected modules simultaneously, as it identifies modules by their interaction patterns. Our results indicate that real-world PPI networks, such as the Saccharomyces cerevisiae and Homo sapiens PPI networks, do contain sparsely connected modules, which may not be detected by modularity-based methods such as MCL.
We have proposed a novel efficient method, SGPG, that combines SG and PG to solve the blockmodel functional module identification problem. Our experimental results show that SGPG can achieve competitive performance numerically and biologically but with significantly reduced computation time compared to the original SA method in [2]. We have demonstrated that SGPG can identify biologically meaningful modules, specifically the ones with sparse interactions within them but with the same interaction patterns with the rest of the network, which carry out important cellular functionalities. Our future research will focus on designing more efficient algorithms to detect functional modules in large-scale biological networks. Our method can be further improved. For example, the number of modules q needs to be given in our current algorithm. In [21], the authors have introduced a Bayesian strategy based on a stochastic block model to identify the module assignments as well as the optimal number of modules. However, this Bayesian approach only guarantees that the final solution converges to a local optimum. We may be able to combine the strengths of our SGPG method and the Bayesian approach to efficiently determine the optimal q in SGPG. Also, there are other promising efficient heuristics for global optimization, such as differential evolution [22] and genetic algorithms [23], which may be coupled with our PG strategy to further increase the efficiency of these algorithms.
Declarations
The publication costs for this article were funded by the corresponding author's institution.
This article has been published as part of BMC Bioinformatics Volume 14 Supplement 2, 2013: Selected articles from the Eleventh Asia Pacific Bioinformatics Conference (APBC 2013): Bioinformatics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/14/S2.
Acknowledgements
XQ was supported in part by Award R21DK092845 from the National Institute Of Diabetes And Digestive And Kidney Diseases, National Institutes of Health; and by the University of South Florida Internal Awards Program under Grant No. 78068.
Authors’ Affiliations
References
- Hartwell L, Hopfield J, Leibler S, Murray A: From molecular to modular cell biology. Nature. 1999, 402: 47-52. 10.1038/46972.
- Pinkert S, Schultz J, Reichardt J: Protein interaction networks: More than mere modules. PLoS Comput Biol. 2010, 6: e1000659. 10.1371/journal.pcbi.1000659.
- Zinman G, Zhong S, Bar-Joseph Z: Biological interaction networks are conserved at the module level. BMC Systems Biology. 2011, 5: 134. 10.1186/1752-0509-5-134.
- Newman M, Girvan M: Finding and evaluating community structure in networks. Phys Rev E Stat Nonlin Soft Matter Phys. 2004, 69: 026113.
- Ziv E, Middendorf M, Wiggins C: Information-theoretic approach to network modularity. Phys Rev E Stat Nonlin Soft Matter Phys. 2005, 71: 046117.
- Sharan R, Ulitsky I, Shamir R: Network-based prediction of protein function. Mol Syst Biol. 2007, 3: 88.
- Reichardt J, White D: Role modules for complex networks. Eur Phys J B. 2007, 60: 217-224. 10.1140/epjb/e2007-00340-y.
- Newman M, Leicht E: Mixture models and exploratory data analysis in networks. Proc Natl Acad Sci USA. 2007, 104: 9564-9569. 10.1073/pnas.0610537104.
- Wang Y, Qian X: Functional module identification by block modeling using simulated annealing with path relinking. ACM Conference on Bioinformatics, Computational Biology and Biomedicine (ACM-BCB 12). 2012.
- Reichardt J: Structure in Networks. 2008, Springer-Verlag Berlin Heidelberg.
- Salwinski L, Miller C, Smith A, Pettit F, Bowie J, Eisenberg D: The Database of Interacting Proteins: 2004 update. Nucleic Acids Research. 2004, 32: D449-D451. 10.1093/nar/gkh086.
- Prasad T, Goel R, Kandasamy K: Human Protein Reference Database--2009 update. Nucleic Acids Research. 2009, 37: D767-D772. 10.1093/nar/gkn892.
- Dongen S: A cluster algorithm for graphs. Technical Report INS-R0010. 2000.
- Mateus G, Resende M, Silva R: GRASP with path-relinking for the generalized quadratic assignment problem. Journal of Heuristics. 2010, 1-39.
- Bertsekas D: Nonlinear Programming. 1999, Athena Scientific, 2nd edition.
- Bach F, Jenatton R, Mairal J, Obozinski G: Optimization with sparsity-inducing penalties. Technical report, HAL 00613125. 2011.
- Boyle E, Elizabeth I, Weng S: GO::TermFinder--open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes. Bioinformatics. 2004, 20: 3710-3715. 10.1093/bioinformatics/bth456.
- Lander E, Linton L, Birren B: Initial sequencing and analysis of the human genome. Nature. 2001, 409: 860-921. 10.1038/35057062.
- Kaiser M: Mean clustering coefficients: the role of isolated nodes and leafs on clustering measures for small-world networks. New J Phys. 2008, 10: 083042.
- Petersen M, Pardali E: Smad2 and Smad3 have opposing roles in breast cancer bone metastasis by differentially affecting tumor angiogenesis. Oncogene. 2010, 29 (9): 1351-1361. 10.1038/onc.2009.426.
- Hofman J, Wiggins C: A Bayesian Approach to Network Modularity. Phys Rev Lett. 2008, 100: 258701.
- Storn R, Price K: Differential evolution - a simple and efficient heuristic for global optimization over continuous spaces. Journal of Global Optimization. 1997, 11: 341-359. 10.1023/A:1008202821328.
- Akbari Z: A multilevel evolutionary algorithm for optimizing numerical functions. International Journal of Industrial Engineering Computations. 2010, 2: 419-430.
Copyright
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.