Skip to main content
  • Research Article
  • Open access
  • Published:

Comparison of module detection algorithms in protein networks and investigation of the biological meaning of predicted modules



It is generally acknowledged that a functional understanding of a biological system can only be obtained by an understanding of the collective of molecular interactions in form of biological networks. Protein networks are one particular network type of special importance, because proteins form the functional base units of every biological cell. On a mesoscopic level of protein networks, modules are of significant importance because these building blocks may be the next elementary functional level above individual proteins allowing to gain insight into fundamental organizational principles of biological cells.


In this paper, we provide a comparative analysis of five popular and four novel module detection algorithms. We study these module prediction methods for simulated benchmark networks as well as 10 biological protein interaction networks (PINs). A particular focus of our analysis is placed on the biological meaning of the predicted modules by utilizing the Gene Ontology (GO) database as gold standard for the definition of biological processes. Furthermore, we investigate the robustness of the results by perturbing the PINs simulating in this way our incomplete knowledge of protein networks.


Overall, our study reveals that there is a large heterogeneity among the different module prediction algorithms if one zooms-in the biological level of biological processes in the form of GO terms and all methods are severely affected by a slight perturbation of the networks. However, we also find pathways that are enriched in multiple modules, which could provide important information about the hierarchical organization of the system.


The biological function on the molecular level emerges from the complex interaction of biological entities of a cell [1, 2]. Specifically, different types of molecules, e.g., proteins, metabolites, miRNA or tiRNA, can interact in many various ways with each other in dependence on the tissue type and the environmental condition of an organism. The interactions among biological molecules can be broadly categorized into three types of networks: metabolic networks, transcriptional regulatory networks and protein interaction networks [36]. These networks need to be inferred from experimental observations generated by different high-throughput platforms, including Next-Generation Sequencing (NGS), proteomics and microarrays.

Nowadays, it is generally accepted that biological networks are not randomly connected but follow certain structural patterns that give rise to (I) a scale-free topology, (II) a hierarchical organization and (III) a modular structure [712]. Especially modularity is one of the most important features of biological networks, because it suggests that nodes, which are tightly connected with each other as a community, are most likely to be a part of the same biological function or pathway. This may also be reflected in the evolution of the organisms [8, 1315]. As a complicating factor, in reality, these pathways are not discrete, but each gene may take part in multiple biological functions, and therefore can be a part of multiple communities. Hence, a biological network with a modular structure can contain multiple overlapping communities, which might also contribute to the fact that biological networks are robust [16, 17].

For protein interaction networks (PINs) it is known that there are two types of modular structure that are of significant importance. These modules can be either formed by protein complexes or dynamic functional units [18]. Also the modules in PINs of different species have been explained as the efficient functioning of a cell and the basis of evolution in order to adapt the changes to the environment quickly [19, 20]. In [21] the existence of two further types of structural components of modules in protein networks has been revealed, which have been termed core components and ring components. The core components are more conserved and perform key biological functions, while the ring components performs certain specialized functions under particular circumstances potentially triggered by environmental changes. Furthermore, several methods have been developed to identify and integrate protein networks along with gene expression or other datasets such as disease-gene association to identify the functional activity of modules in different disease conditions [2225]. Finally, in [26] the algorithm ClusterONE has been developed to identify overlapping nodes in modules in protein networks. These examples demonstrate that any systems-based analysis on the genomic level is incomplete without a network understanding of interactions on the molecular level.

Our study has four major objectives. The first objective of our study is to compare community detection algorithms for benchmark networks as well as 10 protein interaction networks. Second, we provide an in depth analysis of the biological meaning of the predicted networks across a variety of different biological aspects. Third, due to the fact that all PINs are inferred from experimental data they carry a certain uncertainty with respect to the correctness of the inferred interactions. For this reason, we are performing a robustness analysis of the predicted modules by perturbing the PINs by edge deletions. Finally, we investigate overlapping pathways that may form functional bridges between more specialized modules.

For the community detection analysis, we are using the 5 most popular module detection algorithms, fast-greedy [27], walktrap [28], label propagation [29], spinglass [30] and multi-level community [31], that have been developed for application to large networks and propose in addition 4 correlation-based module prediction methods. Briefly, for our approaches, we assign weights to each pair of nodes depending on the distance between them in the network and utilize this for the module prediction. This provides competitive modularity measures for artificial and biological networks in comparison to other community detection algorithms. The details about all measure will be given in the Methods section.

Typically, for large real networks there is only limited information available about the true module structure within these networks because of our lack of understanding of the underlying phenomena. However, for protein networks we can make use of the Gene Ontology (GO) database [32], which provides a comprehensive overview of thousands of biological processes in a variety of different organisms. Utilizing this information allows a biologically meaningfully assessment of the predicted modules. Specifically, in our analysis, we use protein networks of 10 different species to investigate the modularity predicted by the different community detection algorithms.

This paper is organized as follows. In the next section, we describe all methods, measures and data sets used for our analysis, including a description of the protein interaction networks. In the Results section, we present our numerical findings and this paper finishes with the Conclusions section summarizing and discussing our results.



The module detection algorithms studied in this paper, optimize the modularity in a network. The measure for the modularity has been introduced in [27, 33] and is defined as follows.

$$ Q = \Sigma_{i} \left(e_{ii} - (a_{i})^{2}\right) $$

where e ij is a fraction of edges between communities i and j,

$$e_{ij} = \frac{1}{2m} \Sigma_{v \in i, w \in j} A_{vw} $$

A vw is the adjacency matrix element between v and w and a i is the fraction of edges which is connected to the nodes in community i, i.e.,

$$a_{i} = \frac{1}{2m}\Sigma_{v \in i} k_{v} $$

Here k v is a degree of node vi.

Fast-greedy algorithm

This method was proposed in [27]. The algorithm starts with the assumption that each individual node is an independent community and assigns modularity score, ΔQ ij , to each pair of nodes, and a i for each community. The ΔQ ij and a i are defined as follows:

$$\begin{array}{*{20}l} \Delta Q_{ij} &= \left\{ \begin{array}{ll} \frac{1}{2m}- \frac{k_{i} k_{j}}{(2m)^{2}} &\quad \text{if i, j are connected;}\\ &\quad\text{m is the total number of edges}\\ 0 &\quad \text{otherwise}. \end{array}\right.\\ a_{i}& = \frac{k_{i}}{2m} \end{array} $$

The algorithm starts by calculating ΔQ ij . Then it merges the two communities for which ΔQ ij is largest. After that, it updates ΔQ and a i for each community and repeats all steps until all communities are merged into one community. When two communities, i and j are merged the ΔQ is updated as follows:

$$ \Delta Q = e_{ij} +e_{ji} - 2a_{i}a_{j} $$

Walktrap algorithm

This method was proposed in [28]. The algorithm starts with the assumption that if two vertices, i and j, are in same community, then the random walk of length t from i and j to the nodes of other communities would be similar, \(P_{ik}^{t} \sim P_{jk}^{t}\). The random walk starting at vertex i to j through a path of length t is described as follows:

$$ \forall i, \, lim \, t \rightarrow +\infty \, P_{ij}^{t} = \frac{d(i)}{\Sigma_{k} d(k)} $$

where d(i) is the degree of vertex i.

In the first step of the algorithm, all nodes are considered as individual communities. In the second step, the two closest communities are merged based on the distance between them, and the community structure is updated. Then the second step is repeated until all communities are merged into one community.

The distance between communities is calculated as follows. Suppose there are C=C 1,C 2C k communities in the network.

$$ \sigma_{k} =\frac{1}{n} \Sigma_{{C_{k}}\in{C}}\Sigma_{{i \in {C_{k}}}} r^{2}_{i,C_{k}} $$

σ k is a mean square distance between two communities. The r is defined as follows:

$$r_{{C_{i}}{C_{j}}} = \sqrt{\Sigma_{k=1}^{n} \frac{\left(P^{t}_{C_{i}k}-P^{t}_{C_{j}k}\right)^{2}}{d(k)}} $$

Label propagation algorithm

This method was proposed in [29]. In this approach, a node x chooses to community to which the maximum numbers of its neighbours belong to. There are following steps to identify communities in the network.

  1. 1.

    Assign a unique label to each node.

  2. 2.

    Order nodes randomly.

  3. 3.

    label the selected node with the same label which is in maximum number in its neighbourhood.

  4. 4.

    If all the nodes have the same label, which is in maximum number in their neighbourhood, then stop the algorithm, otherwise repeat step 3.

Spinglass community algorithm

This method was proposed in [30]. In this approach the community detection is mapped to finding the ground state of an infinite ranged Potts spin glass model, by combining the information from both present and missing links, where the clusters are represented as the number of occupied spin states. In the Spinglass algorithm, existing edges within a community and non-existing edges between communities are rewarded while the edges which are not present in the community and edges between communities are penalized.

Multi-level community algorithm

This method was proposed in [31]. This algorithm is divided into two phases. In the first phase, all nodes are considered as independent communities. Then communities are merged into a larger community if the modularity of the network increase. The first phase is stopped if there is no further increase in the modularity. In the second phase each community is represented in the form of a node and edges between and within communities are replaced by weighted-edges. The number of edges between two nodes (communities) are replaced by a single weighted edge and all the edges in a community are replaced by a self-connecting weighted edge. After the construction of a new weighted network, first phase is repeated to obtain an improvement in modularity. These two phases are iterated until there is no further improvement in the modularity of the network.

Correlation based hierarchical clustering

In this approach, we start with the assumption that if two nodes are the part of same community then their shortest path distance to all other nodes are positively correlated. We first calculate the shortest path distance, S(G) for a graph G, between all pairs of nodes an calculate correlation between each pair of nodes. Here we provide some shortest path based measures to calculate correlation between pairs of nodes. Let S(G) is the shortest path distance matrix, and the correlation matrix is ρ(S(G)), then the distance between each pair of node is described as follows:

$$ D_{sp} = 1 - \rho(S(G)) $$

The second measure for correlation is described as follows: Let A and S(G) are adjacency and shortest distance matrix for a graph G, then the weight matrix of pairs of nodes.

$$ M = A \times S(G) $$

M is an asymmetric weight matrix where each row represent nodes and columns represent weights between each other. If the nodes are from same community then their weights w.r.t other nodes are strongly correlated. The distance matrix is defined as follows:

$$ D_{M} = 1 - \rho(M) $$

We use these two different distance measures for hierarchical clustering (ward algorithm). To get an optimal number of cluster we use modularity measure by newman [27] described in the “ Modularity ” section.


In the Results section, we first analyze the performance of the community detection algorithms with artificially generated benchmark networks, and then we study protein interaction networks of different species. A description of these networks is provided in the following subsections.

Benchmark networks

The benchmark networks are generated by an algorithm proposed by [34]. It has been introduced with the purpose to generate benchmark networks for testing module detection algorithms. The generation of the network proceeds along the following steps.

  1. (1)

    The degree, d, of each node is randomly assigned from the power law distribution with exponent γ, in our case it is 1. The degree distribution is assigned depending on the maximum degree d max ={20,40} and the average degree, d avg =10, selected as an input.

  2. (2)

    Nodes are assigned a fraction of edges, μ, that are shared with nodes of other communities and the remaining fraction, 1−μ, is shared within the community.

  3. (3)

    A community-size k min and k max is assigned in a following way, where k min >d min and k max >d max so that each node can be assigned to a community. The community size is decided based on the power law distribution so that the sum of the nodes in all communities is equal to the number on nodes in the network.

  4. (4)

    First, nodes are not assigned to any community and than nodes are assigned randomly to a community if the community-size exceeds the number of neighbours of the node in the community. This step is repeated until all nodes are assigned to a community.

  5. (5)

    In order to ensure that each node has a right approximation of μ and 1−μ for external and internal edges several rewiring steps are iterated.

For our analysis, we generated networks of vertex-size, |V|=1000, by varying different parameters for non-overlapping communities which are average degree, maximum degree, minimum cluster size, maximum cluster size and mixing parameter μ= {0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.40, 0.50}.

Protein interaction networks

The protein interaction networks (PINs) we use for our analysis are obtained from Biogrid database [35]. In total, we use 10 PINs from 10 different species. The details are described in Table 1. These networks are pre-processed using the R package igraph [36] by extracting the giant connected components (GCC) of the networks.

Table 1 A list of protein networks used for detecting communities by different community detection algorithms

As one can see in Table 1 these biological networks show a large variety in the network parameters such as number of nodes and number of edges.

Normalized mutual information (NMI)

In order to assess the predicted modules of the algorithms qualitatively, we use the normalized mutual information (NMI) [3739].

The normalized mutual information is defined as follows. Suppose we have two community detection algorithms, U and V and they predict |R| and |C| communities in a network. The overlap between the two predicted communities is shown in the contingency Table 2, i.e., community U 2 and V 1 share n 21 nodes. Then the NMI [3739] is calculated as follows.

$$NMI_{max} = \frac{I(U, V)}{H(U) + H(V)} $$


$$\begin{aligned} H(U)&= -\Sigma_{i=1}^{R} \frac{a_{i}}{N} \left(log\frac{a_{i}}{N}\right)\\ H(V)&= -\Sigma_{i=1}^{C} \frac{b_{i}}{N} \left(log\frac{b_{i}}{N}\right)\\ I(U,V)&=\Sigma_{i=1}^{R} \Sigma_{j=1}^{C}\frac{n_{ij}}{N} \left(log\frac{n_{ij}/N}{a_{i} b_{j}/N^{2}}\right) \end{aligned} $$
Table 2 A contingency table which defines overlap between two communities, U and V


Benchmark networks

We start our analysis investigating the performance of community detection algorithms by application to benchmark networks. The benchmark networks are generated by an algorithm [34], as described in the Methods section, that result in networks with a predefined modularity structure. Hence, it is know that the networks have a module structure and can be used as a reference to quantify the performance of the community detection algorithms in an objective manner.

In the following, we study various parameters of the benchmark algorithm to generate benchmark networks. Specifically, we set the network size to |V|=1000 nodes, for the average degree of the vertices we use \(d_{i}^{avg} = 10 \) and for the maximum degree, \(d_{i}^{max} = 20\). The minimum community-size parameter, we vary for k min ={10,20,50,70,100,150} and the maximum community-size parameter for k max ={20,50,70,100,150,200}. For the mixing parameter, we study values in the set μ={0.05,0.10,0.15,0.20,0.25,0.30,0.40,0.50}. For each parameter combination, we generate 50 networks, resulting in a population of benchmark networks with the same characteristics but random variations. This allows an assessment of the robustness of the results due to stochastically ocuring structural changes in the networks.

As performance measure for assessing the predicted modules of the community detection algorithms we are using the normalized-mutual information (NMI); see Methods section. The NMI evaluates the comparison of the true communities and the predicted communities, as identified by the different algorithms. The distribution of NMI values for different community detection algorithms is shown in Fig. 1. The parameters studied are: (a) Mixing parameter μ=0.05, average number of modules is 33 (b) Mixing parameter μ=0.1, average number of modules is 20 (c) Mixing parameter μ=0.15, average number of modules is20 (d) Mixing parameter μ=0.2, average number of modules is 20 (e) Mixing parameter μ=0.25, average number of modules is 21 (f) Mixing parameter μ=0.3, average number of modules is 33.

Fig. 1
figure 1

Normalized mutual information of different module detection algorithms for the benchmark networks

Overall, the figure shows that as the mixing parameter, μ, increases the performance of all module detection algorithms deteriorates. Compared to all algorithms, the Label propagation algorithm underperforms throughout all values of μ and the Spinglass community algorithm performs better than all other algorithms, except for low values of the mixing parameter. This indicates that the method has an optimal working point for intermediately connected modules, which is a counterintuitive behavior. Furthermore, our distance measure-based approaches, notably A*SP Pearson and A*SP Spearman, are showing in general a good performances, and compared to Fast greedy and Walktrap they show even a favourable performance.

Performance of module detection algorithms by adding random edges

In this analysis, we test the robustness of different module detection algorithms against noise by adding a certain percentage of edges randomly to the network. Specifically, in the first step we generate synthetic networks, G=(V,E), with N modules as described in section Benchmark networks . Then we add a certain fraction of random edges resulting in G =(V,E ), where, E =EE ′′ with E ′′ is a randomly chosen set of edges between vertices in V of the benchmark network G. We then compare the modularity of the modules predicted by different module detection algorithms in G with the modules in G. The main objective of this analysis is to test the robustness of the module detection algorithms with respect to the addition of random edges to the network. The results of the performance of different module detection algorithms are shown in Fig. 2. In this figure we generated plots between modularity and mixing parameter (μ). From this analysis we find that the modularity of the modules predicted by different algorithms decrease as the percentage of added edges increases. The decrease in modularity is larger when the mixing parameter is higher. However, a small fraction of added edges do not effect the modularity, which can be seen in Fig. 2 a and b. From this analysis we find that the fast greedy and label- propagation algorithms are the worse performing algorithms, for higher values of the mixing parameter (μ) the label propagation performs worse and for lower mixing parameter (μ) the fast greedy performs worse compare to other algorithms. The spinglass algorithm performs best in all cases, the multi-level algorithm also performs better but for higher mixing parameter (μ) the walktrap community and the clustering algorithms show a slightly better performance than the multi-level algorithm.

Fig. 2
figure 2

A comparison of modularity of different module detection algorithms by showing plots between modularity and mixing parameter (μ) in synthetic networks. The synthetic networks are modelled by adding certain percentage of random edges in the networks. a 5 % (b) 10 % (c) 20 % (d) 30 % (e) 40 % (f) 50 % additional edges of total edges are randomly added in synthetic networks

Biological networks

Next, we extend our investigation to biological networks. Specifically, we use 10 PPI networks from different species. Details of these networks can be found in Table 1.

Modularity in PPI networks

First, we estimate the modularity, Q, and the number of modules in these PPI networks for the 9 community detection algorithms. The results of this analysis are shown in Tables 3 and 4 respectively.

Table 3 Modularity, Q, of PPI networks detected by different module detection algorithms
Table 4 Number of modules of PPI networks detected by different module detection algorithms

The first observation we make is that the best performing algorithms are the Multilevel and the Spinglass community algorithms. Interestingly, for some organisms, e.g., Schizosaccharomyces pombe and Homo sapiens, the Label propagation algorithm almost fails entirely to detect communities. In contrast, Fast-greedy and Walktrap are also finding acceptable modularity values for the networks for which the Label propagation algorithm has problems. Among the distance-based measures, \(D_{M_{pearson}}\phantom {\dot {i}\!}\), is the best performing method.

For the predicted number of modules, the Walktrap algorithm results in many more modules than any other method, whereas the remaining methods predict a comparable number of modules. For instance, for the PPI network of Homo Sapiens (9606), Walktrap predicts 38 times more modules than Fast-greedy and 163 times more modules than the Spinglas method. This is interesting because this is not beneficially reflected in the modularity values Q, see Table 3, in a way that this would lead to superior modularity values.

Aside from the number of predicted modules, it is important to know the size distribution of these, i.e., how many proteins belong to the corresponding modules. The distributions of the sizes of the modules for the studied organisms are shown in Fig. 3. Here one can see that there is a considerable variation among the methods. For instance, the variation of module sizes predicted by Walktrap are generally smaller for all organisms. This is understandable because the predicted number of modules is for this method by far the largest, which leads in general to rather small modules. In contrast, the variations for the correlation-based methods depend crucially on the organism. Overall, the largest variability is observed for the Label propagation algorithm.

Fig. 3
figure 3

Distribution of the size of modules detected in PPI networks by different module detection algorithms

Considering the agreement among different methods, the module structure of Candida albicans is least different and, hence, shows the highest level of consensus. For this organism, even Walktrap results in a moderate number of predicted modules, which is comparable to all other methods.

In Fig. 4 we combine the results from Tables 3 and 4 as a scatter plot between the number of modules and the modularity. For reasons of clarity, we show only results for four out of the nine methods because the other algorithms add nothing for the following discussion. The interesting observation is that Fast greedy displays a curious behavior because for an increasing number of predicted modules in the networks, the modularity decreases. In order to quantitatively confirm this observation we fit a polynomial regression of second order by the Least Squares method minimizing the residual sum of squares (RSS). For the linear and the quadratic term we obtain p-values of 0.0194 and 0.0211, which are significant for α=0.05. This confirms our observation statistically. In contrast, Multilevel and Spinglass can be approximated by a linear regression model, with p-values of 10−5 and 0.004.

Fig. 4
figure 4

Scatter plot between the number of modules and the modularity. Each method is color coded by a different color. The shown curves correspond to Least Squares regression models. For A*SP Pearson, no statistically significant model could be fit that would be different from a horizontal line

Interestingly, the A*SP Pearson algorithm is somehow located between these models in the sense that the best linear fit would only use an intercept but no slope and the quadratic regression is barely not significant with p-values of 0.08 for both the linear and quadratic term but higher values of adjusted R 2 values. For this reason, we do not include results form the regression in Fig. 4.

Comparison of algorithms

In order to investigate the similarity of the identified modules for different algorithms in detail, we use again the NMI measure. However, this time we use the NMI to compare the predicted community structure of one method with the predicted community structure of another method. In this way, the similarity of the predicted communities is assessed. In other words, this analysis will provide us with information about the consistency of results among different methods but does not allow to gain insights into the absolute quality of the predicted module structures, because the ground truth does not enter this analysis.

The results of this analysis are shown in the form of level plots of the NMI values between different community detection algorithms in Fig. 5. The color code of the NMI values goes from violet (low values) to blue (high values), see Fig. 5 for the different scales for the different organisms. In general, there is a good agreement among different methods, however, on a moderate level. For instance, for Drosophila melanogaster the NMI values are around 0.4. Similarly, for House mouse and Homo sapiens. In contrast, for Norway rat the NMI values for A*SP Spearman are succinctly lower than from all other algorithms. Also Label propagation stands out in a similar way for Plasmodium falciparum and yeast.

Fig. 5
figure 5

Similarity of the predicted module structures in PPI networks assessed by the NMI. The values of the NMI are color coded, as indicated by the color bar in each figure, showing the range of assumed values

By looking at the scale of the NMI values, one can see that for Candida albicans the lower values of the scale assumes higher values than for all other organisms, ranging from 0.86 to 1.00. This indicates that the similarity among all community detection algorithms is for this PPI networks highest, confirming our observation in Fig. 3, where we have seen that the variation of the size of modules is for all methods similar and quite small. Finally, we want to note that, in general, the distance-based measures are showing a higher similarity among each other than to the other community detection algorithms.

Robustness of module detection regarding perturbations

Our next analysis investigates the robustness of the predicted modules for perturbed PPI networks. Specifically, we test how a module detection algorithm changes its performance if some interactions in a PPI network are randomly deleted. The rationale of our analysis is based on the assumption that biological networks, and the interactions they are made of, are not known with absolute certainty. Instead, some interactions present in our PPI networks may be false positives due to measurement errors. Since all PPI networks we are using are inferred from experimental data, we think this assumption is very reasonable.

In order to study the effect of false positive interactions, we generate 20 perturbed networks for each PPI network, \(G^{sub}_{1}, G^{sub}_{2} \dots G^{sub}_{20}\), by deleting randomly 5 % of the edges in a PPI network. In order to make sure the the resulting networks are still connected, we remove only edges from nodes having a degree of D(v i ,G)≥2 and prevent removal of the last remaining edge. Then, we apply the community detection algorithms to the networks, \(G^{sub}_{1}, G^{sub}_{2} \dots G^{sub}_{20}\), and compare the predicted modules with the results from the unperturbed PPI network by using the NMI.

The results of this analysis are shown in Fig. 6. The first observation we make is that in all but two cases the NMI values are considerably smaller than 1.00, indicating a large change in the predicted communities. One exception is the Label propagation algorithm for Saccharomyces cerevisiae and the other is for all algorithms but Label propagation and Spinglass for Candida albicans. For all other algorithms and the remaining organisms, the obtained NMI values are much smaller, with the lowest value observed for Label propagation for Plasmodium falciparum. In general, compared to other methods and across the organisms, the most robust method appears to be Walktrap.

Fig. 6
figure 6

Robustness of module detection regarding perturbation of the PPI networks. Distribution of NMI values comparing communities obtained from the unperturbed and perturbed PPI networks generated by randomly deleting 5 % of the edges

Overall, the results show that even a moderate change in a PPI network leads, usually, in quite large changes of the predicted module structure, regardless of the algorithm or the organism.

Biological meaning of predicted modules

As far, we focused on more technical aspects of predicted modules. Now we switch gears by investigating the biological meaning of these modules. We do this by using external information, not included in the network structure itself, for assessing the predicted modules. As source for this external information we are using the Gene Ontology (GO) database [32] that provides comprehensive information about the involvement of genes across many organisms in diverse biological processes.

Specifically, we performed an enrichment analysis of biological pathways obtained from the GO database for the modules detected by the community detection algorithms. In order to test the statistical significance of biological pathways, corresponding to an over-representation of genes from a particular biological process, we use Fisher’s exact test. Since we are conducting 1000s of hypothesis tests, we need to apply a multiple testing correction. For this reason, we apply a conservative Bonferroni correction for a significance level of 0.001. The results of this analysis are shown in Table 5.

Table 5 Number of statistically significant pathways as identified by a Fisher’s exact test that are enriched in the predicted modules in the PPI networks

In the last column of this table, the total number of tested biological processes is shown as a reference. Overall, the Multilevel and Spinglass community detection algorithms have the largest number of enrichment biological pathways. But in general, these numbers are not too far apart from the remaining methods, with some exceptions. It is interesting to note that for Plasmodium falciparum (36,329) none of the algorithms predicts modules that contain at least one enriched pathway. The reason for this may be in the very small number of total pathways (51) tested for this organism.

In Table 6, we show the the percentage of enriched pathways. The highest percentage is observed for Arabidopsis Thaliana (3702), Saccharomyces cerevisiae (559,292) and drosophila milanogaster (7227) for different module detection algorithms. In contrast, Norway rat (10,116) leads to the least percentage 6 %.

Table 6 Percentage of statistically significant pathways (%) as identified by a Fisher’s exact test that are enriched in the identified modules in the PPI networks

The results in Tables 5 and 6 provide us with an overview of the enriched pathways, but they do not tell us if a pathway is enriched in just one predicted module or in several. This information is shown in Fig. 7. In this figure, we color-coded the number of pathways showing enrichment for multiple modules, ranging from 1 to 11 modules. The maximum number of modules is also shown as a number in the barplots, for each algorithm. From the shown results, we see that most pathways are only enriched in one module (red) indicating a biological specification of these modules. In general, the number of enriched pathways decreases with an increasing number of modules for all methods and across all organisms. These observations support the hypothesis that modules are used as functional units to carry out specific biological functions. In general, the modules predicted by the Walktrap community algorithm have a larger number of enriched pathways to multiple modules. Furthermore, the pathways of House mouse and Arabidopsis thaliana have a higher maximum number of pathways that are enriched for the maximum number of modules. The Label propagation algorithm predicts the lowest number of pathways enriched to multiple modules, except for Arabidopsis thaliana, which is a potential indicator of a poor predictability of modules in PPI networks. Another interesting aspect to remark, is that the algorithms Multilevel and Spinglass, which predicted modules with the highest modularity, are having in general the largest number of enriched pathways to the maximum number of modules.

Fig. 7
figure 7

Bar plots of the number of pathways that are enriched in multiple modules. The numbers inside each bar correspond to the maximum number of modules to which pathways are enriched

Next, we study the significant pathways that are common across different organisms. Specifically, in Fig. 8, we plot a distribution of common pathways. The Multilevel and Spinglasss have three and ten pathways respectively in common among 6 organisms; see Table 7. These processes are mostly involved in metabolic processes and cell communication. Other algorithms, except Label propogation, predict pathways common in four to five organisms, while Label propagation, has pathways that are common in only three organisms. Overall, the Walktrap community algorithm predicts the largest number of 287, pathways that are common in at least two modules.

Fig. 8
figure 8

Bar plots of pathways which are enriched in two or more organisms. The numbers in each figure are showing the total number of pathways that are enriched

Table 7 GO pathways which are enriched to more than one modules predicted by spinglass and multilevel community detectin algorithms that are common among 6 organisms (see Fig. 8)

Subnetwork analysis of Homo sapiens obtained from different experimental methods

We extend our investigation to the subnetworks of Homo sapiens. Specifically, we use the 4 largest connected PPI sub-networks from different experimental methods. Details of these networks can be found in Table 8. We estimate the modularity, Q, and the number of modules in these PPI networks for the 9 community detection algorithms. The results of our analysis are shown in Tables 9 and 10 respectively. The modularity of the subnetwork obtained from Affinity chromatography technology showing a slightly higher modularity for fastgreedy, multilevel and spinglass algorithms. However, for other subnetworks the modularity is considerably higher compared to the complete PPI network of Homo Sapiens. The modularities of of subnetworks highlight the fact that different subnetowrks obtained from different experimental methods provide a mixture of different structural properties of the complete PPI network. The analysis also highlights the fact that multilevel and spinglass algorithms are consistently performing better than other algorithms and walktrap community predicts more number of modules compare to other algorithms. Also the clustering based algotrithms and label propogation algorithms which perform better in synthetic networks are showing lowest modularity. In the next step of the analysis we performed enrichment analysis of pathways obtained form the CORUM complex database [40]. The results of this analysis are shown in Tables 11 and 12. The percentage of enriched pathways of CORUM complex database are higher compare to the GO pathways for individual organisms except the subnetwork obtained from two hybrid experimental data. In the next step we predicted that if a pathway is enriched in just one predicted module or in several. This information is shown in Fig. 9. In this figure, the color-coded barplots show the number of pathways showing enrichment for multiple modules, ranging from 1 to 4 modules. In this analysis a large fraction of pathways are enriched to just one modules and a few pathways are enriched to two or three modules predicted by different module detection algorithms. A list of pathways which are enriched to more than one modules predicted by multilevel and spinglass algorithms are shown in Table 13.

Fig. 9
figure 9

Bar plots of the number of pathways (CORUM complex) that are enriched in multiple modules of PPI subnetworks of Homo Sapiens from different experimental types. The numbers inside each bar correspond to the maximum number of modules to which pathways are enriched

Table 8 Subnetwork of PPI interactions of Human obtained from different experimental types
Table 9 Modularity, Q, of PPI subnetworks detected by different module detection algorithms
Table 10 Total number of modules of PPI subnetworks detected by different module detection algorithms
Table 11 Total number of significant CORUM complexes enriched to atleast one module of PPI subnetworks detected by different module detection algorithms
Table 12 Total percentage of significant CORUM complexes enriched to atleast one module of PPI subnetworks detected by different module detection algorithms
Table 13 CORUM complexes which are enriched to more than one modules predicted by spinglass and multilevel community detectin algorithms

Time complexity of the algorithms

Finally, we show results for the time complexity of the community detection algorithms. In Table 14 the run time in seconds for the analysis of the PPI networks are shown. Overall, the fastest algorithm is Label propagation that provided for all studied networks the quickest results, below one second. For all other methods, even when they are in general fast, there is at least one network that requires much more time. For instance, Fast-greedy is in general quite fast and comparable to Label propagation, but for the networks Saccharomyces cerevisiae (559,292) and Homo sapiens (9606) it takes over 463 respectively 2287 times longer than for Label propagation. A similar observation can be made for Walktrap.

Table 14 Estimated time, in seconds, to detect modules in biological networks by different module detection algorithms

Discussion and conclusion

In our analysis, we used 9 community detection algorithms to predict modules in PPI networks of 10 different organisms. Overall, our analysis provides a comprehensive understanding of the performance of large community detection algorithms. Also, our analysis highlights organism-specific differences of PPI networks and the biological meaning of the predicted modules.

Overall, from our analysis of these networks we found that the Spinglass, Multilevel and Fastgreedy algorithm preform in general much better than the other algorithms. Furthermore, the Multilevel and Fast greedy algorithm have, in addition, a good run time (see Table 14) that allows to obtain results for large networks within seconds. Interestingly, despite the fact that these three algorithms are performing better, there is no complete similarity among these algorithms in terms of the predicted modules, but the results are to a large extend method-specific. Another interesting fact about the Multilevel and Spinglass community algorithm is that the number of modules and the modularity are linearly correlated, while the performance of Fast greedy decreases as the number of modules increases (see Fig. 4). At this point it is unclear which behavior reflects the modularity vs number of modules dependency best for biological organisms. However, it appears reasonable to assume that there is a limiting factor in the growth of modularity of biological networks, which would suggest that the behavior of Fast greedy is a reflection of biological properties of the networks rather than a technical property or a bias of the method.

Although, we studied extensively the performance of modules in biological networks and found high modularity for some organisms, still, for some organisms, such as Homo Sapiens and Saccharomyces cerevisiae, we find a low modularity. This is especially surprising for Homo Sapiens. One reason for the low modularity in these networks could be the existence of many overlapping nodes between communities giving raise to overlapping modules and pathways. Therefore, the standard non-overlapping community prediction methods may not be optimally suitable for detecting communities in such organisms. This would suggest that more effort needs to be placed on the development of such algorithms, because only in this way one could shed light on the nature of the overlapping modular structure of PPI networks. Another explanation could be that the PPI networks contain incomplete information. One reason for this argument is because the highest modularity is predicted by the Spinglass algorithm for Arabidopsis Thaliana (3702), which is a less complex organism, and for this reason is easier to study. Also the modularity of Arabidopsis Thaliana (3702) is constantly predicted higher by all other algorithms.

By studying the biological meaning of predicted modules, we found that a large proportion of pathways is enriched in only a single module, in all organisms and for all algorithms. This underlines the role of biological pathways as part of a special functioning component in an organism. However, a small set of biological pathways is enriched in more than one module, and an even smaller proportion of pathways is commonly enriched to multiple modules in all organisms. In general the classification of these pathways can broadly be grouped into the following categories:

  • Pathways which are part of a single module only across many organisms.

  • Pathways which are part of multiple modules across many organisms.

  • Pathways which are part of a single module and a single organisms.

  • Pathways which are part of multiple modules and a single organisms.

It would be interesting to see what biological processes they contribute to and what role they play in different organisms in order to see changes in an evolutionary perspective or the emergence of a higher level of functioning in different organisms.

In summary, the identification of modules in networks is a very complex problem and more work needs to be done. A potential future direction could be to extend the analysis for identifying communities with overlapping proteins/genes. This would be a major step forward because it would require the inclusion of the hierarchy among the modules and as such, require fundamentally different algorithms.


  1. Emmert-Streib F. The chronic fatigue syndrome: A comparative pathway analysis. J Comput Biol. 2007; 14(7):961–72.

    Article  CAS  PubMed  Google Scholar 

  2. Emmert-Streib F, Glazko G. Network Biology: A direct approach to study biological function. Wiley Interdiscip Rev Syst Biol Med. 2011; 3(4):379–91.

    Article  CAS  PubMed  Google Scholar 

  3. Förster J, Famili I, Fu P, Palsson BO, Nielsen J. Genome-scale reconstruction of the saccharomyces cerevisiae metabolic network. Genome Res. 2003; 13(2):244–53.

    Article  PubMed  PubMed Central  Google Scholar 

  4. Guelzim N, Bottani S, Bourgine P, Kepes F. Topological and causal structure of the yeast transcriptional regulatory network. Nat Genet. 2002; 31(1):60–63.

    Article  CAS  PubMed  Google Scholar 

  5. Lee TI, et al.Transcriptional regulatory networks in saccharomyces cerevisiae. Science. 2002; 298(5594):799–804.

    Article  CAS  PubMed  Google Scholar 

  6. Vidal M, Cusick ME, Barabási AL. Interactome networks and human disease. Cell. 2011; 144(6):986–98.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  7. Barabási AL, Albert R. Emergence of scaling in random networks. Science. 1999; 206:509–12.

    Google Scholar 

  8. Han J-DJ, Bertin N, Hao T, Goldberg DS, Berriz GF, Zhang LV, Dupuy D, Walhout AJM, Cusick ME, Roth FP, Vidal M. Evidence for dynamically organized modularity in the yeast protein-protein interaction network. Nature. 2004; 430:88–93.

    Article  CAS  PubMed  Google Scholar 

  9. Jeong H, Tombor B, Albert R, Olivai ZN, Barabasi AL. The large-scale organization of metabolic networks. Nature. 2000; 407:651–4.

    Article  CAS  PubMed  Google Scholar 

  10. Ravasz E. Detecting hierarchical modularity in biological networks. Methods in Molecular Biology, Springer. 2008; 541:1–16.

    Google Scholar 

  11. Watts DJ, Strogatz SH. Collective dynamics of ‘small-world’ networks. Nature. 1998; 393:440–2.

    Article  CAS  PubMed  Google Scholar 

  12. Yu H, Gerstein M. Genomic analysis of the hierarchical structure of regulatory networks. Proc Natl Acad Sci USA. 2006; 103:14724–31.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  13. Emmert-Streib F. Limitations of the gene duplication model: Evolution of modules in protein interaction networks. PLoS ONE. 2012; 7(4):35531.

    Article  Google Scholar 

  14. Hallinan J. Gene duplication and hierarchical modularity in intracellular interaction networks. Biosystems. 2004; 74(1–3):51–62.

    Article  CAS  PubMed  Google Scholar 

  15. Wagner GP, Pavlicev M, Cheverud JM. The road to modularity. Nat Rev Genet. 2007; 8(1):921–31.

    Article  CAS  PubMed  Google Scholar 

  16. Kitano H. Systems biology: a brief overview. Science. 2002; 295(5560):1662–1664.

    Article  CAS  PubMed  Google Scholar 

  17. Van Regenmortel M. Reductionism and complexity in molecular biology. EMBO Rep. 2004; 5(9):1016–1020.

    Article  CAS  PubMed  Google Scholar 

  18. Spirin V, Mirny LA. Protein complexes and functional modules in molecular networks. Proc Natl Acad Sci U S A. 2003; 100(21):12123–12128.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  19. Hintze A, Adami C. Evolution of complex modular biological networks. PLoS Comput Biol. 2008; 4:23. doi:10.1371/journal.pcbi.0040023.

    Article  Google Scholar 

  20. Clune J, Mouret JB, Lipson H. The evolutionary origins of modularity. Proc R Soc Lond B Biol Sci. 2013; 280(1755). doi:10.1098/rspb.2012.2863.

  21. Lin CY, Lee TL, Chiu YY, Lin YW, Lo YS, Lin CT, Yang JM. Module organization and variance in protein-protein interaction networks. Sci Rep. 2015; 5:9368.

    Article  Google Scholar 

  22. Dittrich MT, Klau GW, Rosenwald A, Dandekar T, Mueller T. Identifying functional modules in protein?protein interaction networks: an integrated exact approach. Bioinformatics. 2008; 24(13):223–31. doi:10.1093/bioinformatics/btn161.

    Article  Google Scholar 

  23. Taylor IW, Linding R, Warde-Farley D, Liu Y, Pesquita C, Faria D, Bull S, Pawson T, Morris Q, Wrana JL. Dynamic modularity in protein interaction networks predicts breast cancer outcome. Nat Biotech. 2009; 27(2):199–204.

    Article  CAS  Google Scholar 

  24. Zhang X, Zhang R, Jiang Y, Sun P, Tang G, Wang X, Lv H, Li X. The expanded human disease network combining protein-protein interaction information. Eur J Hum Genet. 2011; 19(7):783–8.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  25. Cheng L, Li J, Ju P, Peng J, Wang Y. Semfunsim: A new method for measuring disease similarity by integrating semantic and gene functional association. PLoS ONE. 2014; 9(6):99415. doi:10.1371/journal.pone.0099415.

    Article  Google Scholar 

  26. Nepusz T, Yu H, Paccanaro A. Detecting overlapping protein complexes in protein-protein interaction networks. Nat Meth. 2012; 9(5):471–2.

    Article  CAS  Google Scholar 

  27. Clauset A, Newman MEJ, Moore C. Finding community structure in very large networks. Phys Rev E. 2004; 70:066111. doi:10.1103/PhysRevE.70.066111.

    Article  Google Scholar 

  28. Pons P, Latapy M. Computing communities in large networks using random walks. J Graph Algorithms Appl. 2004; 10(2):284–93.

    Google Scholar 

  29. Raghavan UN, Albert R, Kumara S. Near linear time algorithm to detect community structures in large-scale networks. Phys Rev E. 2007; 76(3):036106.

    Article  Google Scholar 

  30. Reichardt J, Bornholdt S. Statistical mechanics of community detection. Phys Rev E. 2006; 74:016110. doi:10.1103/PhysRevE.74.016110.

    Article  Google Scholar 

  31. Blondel VD, Guillaume JL, Lambiotte R, Lefebvre E. Fast unfolding of communities in large networks. J Stat Mech Theory Exp. 2008; 2008(10):10008. doi:10.1088/1742-5468/2008/10/p10008.

    Article  Google Scholar 

  32. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, et al.Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000; 25(1):25–9.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  33. Newman MEJ. Modularity and community structure in networks. Proc Natl Acad Sci. 2006; 103(23):8577–582. doi:10.1073/pnas.0601602103.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  34. Lancichinetti A, Fortunato S, Radicchi F. Benchmark graphs for testing community detection algorithms. Phys Rev E. 2008; 78:046110. doi:10.1103/PhysRevE.78.046110.

    Article  Google Scholar 

  35. Breitkreutz BJ, Stark C, Reguly T, Boucher L, Breitkreutz A, Livstone M, Oughtred R, Lackner DH, Bahler J, Wood V, Dolinski K, Tyers M. The BioGRID Interaction Database: 2008 update. Nucl Acids Res. 2008; 36(suppl 1):637–40.

    Google Scholar 

  36. Csardi G, Nepusz T. The igraph software package for complex network research. InterJournal. 2006; Complex Systems:1695.

    Google Scholar 

  37. Kvalseth TO. Entropy and correlation: Some comments. IEEE Trans Syst Man Cybern. 1987; 17(3):517–9. doi:10.1109/TSMC.1987.4309069.

    Article  Google Scholar 

  38. Danon L, Guilera AD, Duch J, Arenas A. Comparing community structure identification. J Stat Mech Theory Exp. 2005; 2005(9):09008–09008. doi:10.1088/1742-5468/2005/09/p09008.

    Article  Google Scholar 

  39. Vinh NX, Epps J, Bailey J. Information theoretic measures for clusterings comparison: Variants, properties, normalization and correction for chance. J Mach Learn Res. 2010; 11:2837–854.

    Google Scholar 

  40. Ruepp A, Brauner B, Dunger-Kaltenbach I, Frishman G, Montrone C, Stransky M, Waegele B, Schmidt T, Doudieu ON, Stúmpflen V, Mewes HW. Corum: the comprehensive resource of mammalian protein complexes. Nucleic Acids Res. 2008; 36(suppl 1):646–50. doi:10.1093/nar/gkm936.

    Google Scholar 

Download references


Matthias Dehmer thanks the Austrian Science Funds for supporting this work (project P26142).

Author information

Authors and Affiliations


Corresponding author

Correspondence to Frank Emmert-Streib.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

ST performed the analysis, interpreted the results and wrote the manuscript. SM, MD and FES conceived and designed the study, supervised the analysis, interpreted the results and wrote the manuscript. All authors approved the final version.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver( applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Tripathi, S., Moutari, S., Dehmer, M. et al. Comparison of module detection algorithms in protein networks and investigation of the biological meaning of predicted modules. BMC Bioinformatics 17, 129 (2016).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: