An efficient protein complex mining algorithm based on Multistage Kernel Extension
© Shen et al.; licensee BioMed Central Ltd. 2014
Published: 6 November 2014
In recent years, many protein complex mining algorithms, such as the classical clique percolation method (CPM) and the Markov clustering (MCL) algorithm, have been developed for protein-protein interaction networks. However, most of the available algorithms concentrate on mining dense protein subgraphs as protein complexes and fail to take into account the inherent organizational structure within protein complexes. Thus, there is a critical need to study the possibility of mining protein complexes using the topological information hidden in edges. Moreover, recent large-scale experimental analyses reveal that protein complexes have their own intrinsic organization.
Inspired by the formation process of cliques in complex social networks and by the centrality-lethality rule, we propose a new protein complex mining algorithm called the Multistage Kernel Extension (MKE) algorithm, which integrates the idea of recognizing critical proteins in the Protein-Protein Interaction (PPI) network. MKE first recognizes the nodes with high degree as the first level kernel of a protein complex, and then adds the weighted best neighbour nodes of the first level kernel to the current kernel to form the second level kernel. This process is repeated, extending the current kernel stage by stage to form a protein complex. In the end, overlapping protein complexes are merged to form the final protein complex set.
MKE achieves better accuracy than the classical clique percolation method and the Markov clustering algorithm. MKE also performs better than the classical clique percolation method on both Gene Ontology semantic similarity and co-localization enrichment, and can effectively identify protein complexes with biological significance in the PPI network.
Mining protein complexes is important for understanding biological processes because it helps reveal structure-function relationships in biological networks. Much attention has therefore been paid to the accurate detection of protein complexes from the growing amount of protein-protein interaction (PPI) network data. In recent years, many protein complex mining algorithms have been developed for PPI networks. However, most of the available algorithms concentrate on mining dense protein subgraphs as protein complexes and fail to take into account the inherent organizational structure within protein complexes. Thus, there is a critical need to study the possibility of mining protein complexes using the topological information hidden in edges. Moreover, recent large-scale experimental analyses reveal that protein complexes have their own intrinsic organization.
Complex social networks and complex biological networks both contain distinct community structures. The formation of a complex social network can often be divided into several stages. First, the founders create the original kernel of the community around common ideas or interests. Next, the kernel community is expanded by bringing in similar objects, which gives it a basic framework and organizational structure, and the new community begins to run effectively. Subsequently, the community continues to assimilate objects that share its ideas or interests, until a complex network emerges that exerts a corresponding influence and function in society.
Studies show that the PPI network is such a complex network: its topological structure has the small-world property and the scale-free property, and it presents a remarkably modular structure. The PPI network is composed of many protein complexes (functional modules or clusters), each made up of proteins that work together to carry out some function. A protein complex is a group of proteins that interact with each other at the same time and in the same place. The formation of protein complexes and of the PPI network follows inherent laws; it is a gradual process, not one accomplished at a single stroke.
Hart et al. argued that criticality is an important property of protein complexes, and experimental data show that critical proteins are heavily concentrated in certain complexes. Some researchers have combined the recognition of critical proteins with protein complex detection. Zotenko et al. pointed out that densely connected protein complexes with the same or similar biological function are rich in critical protein nodes, and that the nodes around the critical nodes have strikingly similar functions. Jeong et al. discovered the centrality-lethality rule, which states that deleting proteins with more neighbouring nodes is more likely to disrupt the topological structure of the whole network and thereby have a lethal effect on the organism. That is, protein nodes with higher degree tend to be critical in their biological properties and to play an important role in protein complexes.
Based on the above ideas, we propose a novel protein complex mining algorithm called MKE (Multistage Kernel Extension). MKE first transforms the undirected, unweighted graph of the PPI network into a directed, weighted graph, and then selects node sets composed of high-degree, closely connected nodes as the first level kernels of protein complexes, that is, as their kernel nodes, since these nodes are likely to play a key role in the biological function of the complex. Next, for each node adjacent to the first level kernel, MKE uses the definition of the weighted best neighbour node to determine how close the adjacent node is to the current kernel. If this closeness is greater than the average closeness of the subgraph formed by the current kernel and its neighbouring nodes, the neighbouring node is added to the current kernel, which is thereby extended into the next level kernel. This process is repeated; the kernel is extended stage by stage until a protein complex is constructed. Experimental results demonstrate that MKE is simple and effective, and that the protein complexes it identifies are biologically significant and match the reference protein complexes closely.
Constructing directed and weighted network graph
In the protein-protein interaction network, it is difficult to determine whether a pair of protein nodes belongs to the same protein complex from the degrees of the nodes and their connection characteristics alone. Since each of the two protein nodes has its own neighbour node set in the PPI network, we can obtain their common neighbour node set. The more common neighbour nodes a pair of protein nodes has, the closer their connection. The possibility that the two proteins belong to the same protein complex is then greater, and so is the probability that they participate in the same cellular function. The common neighbour node set of a pair of protein nodes therefore plays an important role in weighting the relationship between them.
In the PPI network, any undirected edge between two protein nodes can be converted into a pair of directed, weighted edges, one in each direction. Initially, the edge between the two nodes is undirected and unweighted.
In the above definitions, the weights of the two directed edges between a pair of nodes are generally unequal. Assume that the degree of one protein node is very large while the degree of the other is small. Although the two nodes have the same number of common neighbour nodes, formulas (2) and (3) assign different weights to the two directions, as shown in Figure 1. From the point of view of the high-degree node, the possibility that the two nodes belong to the same protein complex is small, but from the point of view of the low-degree node, that possibility is larger. Since the correlation is asymmetric, two directed, weighted edges are needed to measure the correlation between the nodes. Most known protein complex mining algorithms ignore this detail and assign a single equal weight to both directions, which is unreasonable.
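The asymmetry can be illustrated with a minimal sketch. Formulas (2) and (3) are not reproduced in this text, so the weight used here is an assumption chosen only to show the effect: the weight of the directed edge from u to v is taken as the number of common neighbours of u and v divided by the degree of u. The function name `directed_weights` and the toy graph are hypothetical.

```python
from collections import defaultdict

def directed_weights(edges):
    """Turn an undirected edge list into directed weights w[(u, v)].

    ASSUMPTION: w(u -> v) = |N(u) & N(v)| / deg(u). This stands in for the
    paper's formulas (2) and (3), which are not available here.
    """
    nbrs = defaultdict(set)
    for u, v in edges:
        if u != v:                      # ignore self-loops
            nbrs[u].add(v)
            nbrs[v].add(u)
    w = {}
    for u in nbrs:
        for v in nbrs[u]:
            common = len(nbrs[u] & nbrs[v])
            w[(u, v)] = common / len(nbrs[u])
    return w

# Toy graph: "hub" has degree 6, "leaf" has degree 3; they share neighbours a and b.
edges = [("hub", "leaf"), ("hub", "a"), ("hub", "b"), ("leaf", "a"),
         ("leaf", "b"), ("hub", "c"), ("hub", "d"), ("hub", "e")]
w = directed_weights(edges)
print(w[("hub", "leaf")], w[("leaf", "hub")])
```

Under this assumed weight, the same two common neighbours count for less from the hub's side than from the low-degree node's side, which is the asymmetry the text describes.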
If there are nodes in the network, let denote the biggest degree of a node in current network. For each node , find its neighbour node set . For any node in set , the number of the edges between node and the other nodes in set is the number of common neighbour nodes between node and node , and the time complexity is that discovers the common neighbours. Then the time complexity is that converts the undirected and unweighted graph into the directed and weighted network graph.
Defining the weighted best neighbour node
In the PPI network, for one protein node, there are always a large number of neighbour nodes around it. Since the PPI graph has been transformed into a directed and weighted graph, it is convenient to find the neighbour node.
Here the weight threshold is the average weight of the local network formed by the current subgraph and its adjacent nodes; its definition is given later (see formula (7)). In the initial formation stage of a protein complex, because the initial nodes are critical nodes with high degree and many adjacent nodes, and based on the experimental tests in this paper, the threshold is initialized to 0.8.
Identifying the first level kernel of protein complex
Where denotes the number of nodes in the network, is the given degree threshold, represents the degree of node . is the total number of nodes whose degrees are greater than or equal to in the PPI network. According to the different sizes of the PPI network, can be tuned properly, and in this paper it is set to be 0.01, because critical nodes account for small fraction of all nodes.
All the nodes in the network are first ranked by degree, and the nodes whose degree is greater than or equal to the degree threshold are selected as the initial kernel nodes of the first level kernels of protein complexes. Protein nodes within the same first level kernel should have plenty of common neighbour nodes; otherwise there is little chance that they belong to the same protein complex. Thus, when the two directed weights between a pair of protein nodes are both greater than the given weight threshold, the nodes are closely connected and can be considered to belong to the same first level kernel of a protein complex. [see Additional file 1 for Algorithm 1--Identifying the First level Kernel(IFLK) Algorithm and for its Time Complexity Analysis].
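The selection just described can be sketched as follows. This is only an illustration under stated assumptions, not Algorithm 1 from Additional file 1: seeds are taken as the top fraction of nodes by degree (0.01 in the paper, 0.5 in the toy example), and seeds whose two directed weights both exceed the weight threshold are grouped with a union-find structure. The function name, the input dictionaries and the toy data are hypothetical.

```python
import math

def first_level_kernels(degree, w, fraction=0.01, w_threshold=0.8):
    """degree: {node: int}; w: {(u, v): directed weight}.

    ASSUMPTION: seeds are the top `fraction` of nodes by degree (at least one),
    and seeds whose two directed weights both exceed `w_threshold` are merged
    into one kernel. Algorithm 1 of the paper may differ in detail.
    """
    n_seeds = max(1, math.ceil(fraction * len(degree)))
    ranked = sorted(degree, key=lambda v: (-degree[v], str(v)))
    # Degree threshold = degree of the last selected node; ties are included.
    d_min = degree[ranked[n_seeds - 1]]
    seeds = [v for v in ranked if degree[v] >= d_min]

    parent = {v: v for v in seeds}      # union-find over seed nodes

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for i, u in enumerate(seeds):
        for v in seeds[i + 1:]:
            if w.get((u, v), 0.0) > w_threshold and w.get((v, u), 0.0) > w_threshold:
                parent[find(u)] = find(v)

    kernels = {}
    for v in seeds:
        kernels.setdefault(find(v), set()).add(v)
    return list(kernels.values())

# Toy data: a and b are mutually close, c is a high-degree node on its own.
degree = {"a": 9, "b": 8, "c": 7, "d": 2, "e": 1, "f": 1}
w = {("a", "b"): 0.9, ("b", "a"): 0.85, ("a", "c"): 0.9, ("c", "a"): 0.3}
kernels = first_level_kernels(degree, w, fraction=0.5, w_threshold=0.8)
```

Requiring both directions to pass the threshold keeps a hub from absorbing a node that is close from only one side.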
Identifying the second level kernel of protein complex
For each neighbour node of the identified first level kernel, we can adopt the definition of weighted best neighbour node to analyze the extent of closeness between the current kernel and the neighbour nodes. If the extent of closeness is greater than the average extent of closeness of the subgraph formed by the current kernel and its neighbour nodes, then add this neighbour node into the current kernel to generate the next kernel, otherwise, discard it.
The identified first level kernel of a protein complex is usually a single high-degree protein node or a set of high-degree protein nodes. Compared with the first level kernel, the nodes of the second level kernel have slightly smaller degree. However, like the first level kernel, the second level kernel lies in the central position of the protein complex, so there is no substantial difference in connection closeness between the second level kernel and the first level kernel.
Since the second level kernel of the protein complex lies on the periphery of the first level kernel, it can be obtained naturally by extending the first level kernel. In the protein-protein interaction network graph, the second level kernel corresponds intuitively to the network subgraph formed by the first level kernel and its adjacent nodes. Put simply, the second level kernel can be obtained by suitably processing the adjacent network of the first level kernel.
Where, is the number of the nodes of the network formed by the current kernel and its neighbours, is the directed weight between node and node . [see Additional file 1 for Algorithm 2--Identifying the Second Level Kernel(ISLK) Algorithm and for its Time Complexity Analysis].
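One extension stage can be sketched as below. Formula (7) and the formal definition of the weighted best neighbour node are not reproduced in this text, so two assumptions are made and should be read as placeholders: the threshold is the mean of the directed weights inside the local subgraph (kernel plus adjacent nodes), and the closeness of a candidate is its best directed weight towards any single kernel node. All names and the toy data are hypothetical.

```python
def extend_kernel_once(kernel, nbrs, w):
    """One extension stage: returns the set of nodes to add to `kernel`.

    kernel: set of nodes; nbrs: {node: set of neighbours}; w: {(u, v): weight}
    with both directions present for every edge.
    ASSUMPTIONS: threshold = mean directed weight in the local subgraph;
    closeness of a candidate = its largest weight towards one kernel node.
    """
    frontier = set().union(*(nbrs[u] for u in kernel)) - kernel
    local = kernel | frontier
    local_w = [w[(u, v)] for u in local for v in nbrs[u] if v in local]
    if not local_w:
        return set()
    threshold = sum(local_w) / len(local_w)
    added = set()
    for v in frontier:
        closeness = max(w[(v, u)] for u in nbrs[v] if u in kernel)
        if closeness > threshold:
            added.add(v)
    return added

# Toy data: a and b are tightly attached to kernel node k, c only loosely.
nbrs = {"k": {"a", "b", "c"}, "a": {"k", "b"}, "b": {"k", "a"}, "c": {"k"}}
w = {("k", "a"): 0.5, ("a", "k"): 0.9, ("k", "b"): 0.5, ("b", "k"): 0.9,
     ("a", "b"): 0.6, ("b", "a"): 0.6, ("k", "c"): 0.1, ("c", "k"): 0.2}
added = extend_kernel_once({"k"}, nbrs, w)
```

In the toy data the mean local weight is 0.5375, so a and b (closeness 0.9) join the kernel and c (closeness 0.2) is discarded.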
Mining protein complexes by multistage kernel extension
In the PPI network, a protein complex is a striking modular structure formed by multistage kernel extension from the kernel protein nodes. The initial stage of kernel extension is very important, which is why it has been described in detail above. The later stages of kernel extension are similar to the second stage, so they are not described separately.
During the multistage kernel extension process, an earlier extension stage is more important than a later one. In general, the more important protein node sets account for a smaller percentage of the PPI network. We therefore make the reasonable hypothesis that the number of nodes added to the current protein complex kernel in one stage is greater than the number added in the previous stage. In addition, because networks differ, some kernels need to terminate the extension early, forming a protein complex after only a few extension stages; the algorithm therefore introduces the Extended Level Parameter as a constraint.
Accordingly, we extend the second level kernel in the same way that the first level kernel was extended to yield the second level kernel, and the process is repeated until the number of nodes added in the current kernel extension is smaller than the number added in the previous kernel extension, or until the extended level parameter exceeds its threshold; the final kernel of the protein complex is then output. Before the algorithm starts, the current kernel of the protein complex is empty and its size is 0. Once the first level kernel has been identified, the number of nodes added by the first level kernel is therefore equal to its own size.
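The stopping rule can be sketched as a small driver loop. The single-stage extension is passed in as a hypothetical callable `extend_once`, and `max_level` stands for the threshold on the extended level parameter (its default value here is arbitrary). One assumption is made where the text is ambiguous: the nodes of the final, smaller extension are kept in the output kernel.

```python
def multistage_extension(first_kernel, extend_once, max_level=3):
    """Grow a kernel stage by stage.

    extend_once(kernel) -> set of new nodes is a hypothetical callable that
    performs one extension stage. The first level kernel counts as level 1,
    and its gain is its own size.
    """
    kernel = set(first_kernel)
    prev_gain = len(kernel)
    level = 1
    while True:
        level += 1
        if level > max_level:           # extended level parameter exceeded
            break
        new_nodes = set(extend_once(kernel)) - kernel
        kernel |= new_nodes             # ASSUMPTION: the last stage is kept
        if len(new_nodes) < prev_gain:  # fewer nodes than the previous stage
            break
        prev_gain = len(new_nodes)
    return kernel

# Toy extension: each call returns the next not-yet-included layer.
layers = [{"k"}, {"a", "b"}, {"c", "d", "e"}, {"f"}, {"g", "h"}]

def toy_extend(kernel):
    for layer in layers:
        if not layer <= kernel:
            return layer - kernel
    return set()

full = multistage_extension({"k"}, toy_extend, max_level=10)
capped = multistage_extension({"k"}, toy_extend, max_level=2)
```

In the toy run the gains are 1, 2, 3 and then 1, so extension stops after the fourth layer; with `max_level=2` it stops after the second level kernel.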
For a node in the PPI network, the closeness between the node and a protein complex is obtained by calculating the closeness between this node and the nodes within the protein complex that satisfy a special condition, rather than the closeness between this node and multiple nodes within the protein complex. Using the definition of the weighted best neighbour node, the algorithm in this paper therefore predicts a protein complex by extending from one node in the kernel to the nodes outside the kernel, which differs from most other available algorithms, which predict modules by extending from multiple nodes within the kernel to the nodes outside it.
Where is the number of nodes in cluster , and is the number of nodes in cluster . [see Additional file 1 for Algorithm 3--Multistage Kernel Extension (MKE) Algorithm and for its Time Complexity Analysis].
Results and discussion
Among the protein-protein interaction network data available for different species, the yeast data are relatively complete, so the yeast protein-protein interaction network is selected as the main object of study. The experiments use the Krogan dataset and the Collins dataset to compare MKE with other algorithms and then analyse the biological significance of the predicted protein complexes. After self-interaction loops and duplicate links between protein nodes are removed in pre-processing, the Krogan dataset contains 3672 nodes and 14317 edges, and the Collins dataset contains 1622 nodes and 9074 edges.
Palla et al. (2005) proposed the CPM (Clique Percolation Method) algorithm, which can identify overlapping network cluster structures. The basic hypothesis of the algorithm is that a network cluster is made up of multiple adjacent k-cliques, where a k-clique is a complete subgraph containing k protein nodes. Two k-cliques are considered adjacent if they share k-1 nodes. The CPM algorithm builds each module as the maximal union of k-cliques that can be reached from one another through a series of adjacent k-cliques. Adamcsek et al. used CPM to develop a network module mining software package called CFinder, which can conveniently mine protein complexes from the protein-protein interaction network. Compared with other graph clustering algorithms, CPM is a deterministic method, and it can find overlapping protein complexes in the protein-protein interaction network. MCL (Markov Clustering) is a fast and scalable unsupervised clustering algorithm. It first simulates random walks in the graph, then divides the protein-protein interaction network into disjoint dense subgraphs, and finally extracts complexes from the network. In the experiment, the maximal size of cliques of the CFinder is set to be 3. The reference protein complexes dataset adopted by algorithm MCL and CFinder comes from reference . And for algorithm MKE, it derives from reference .
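As an illustration of the clique percolation idea, the following brute-force sketch enumerates all k-cliques of a small graph and unites those that are adjacent, taking adjacency as sharing k-1 nodes as in Palla et al. It is not CFinder and is only practical for toy graphs; the function name and example graph are hypothetical.

```python
from itertools import combinations

def clique_percolation(edges, k=3):
    """Return overlapping communities as unions of adjacent k-cliques.

    Brute-force enumeration: only suitable for small illustrative graphs.
    """
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    nodes = sorted(nbrs)
    # A k-clique is a set of k nodes in which every pair is connected.
    cliques = [frozenset(c) for c in combinations(nodes, k)
               if all(b in nbrs[a] for a, b in combinations(c, 2))]

    parent = list(range(len(cliques)))  # union-find over cliques

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Two k-cliques are adjacent if they share k-1 nodes.
    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:
            parent[find(i)] = find(j)

    groups = {}
    for i, c in enumerate(cliques):
        groups.setdefault(find(i), set()).update(c)
    return list(groups.values())

# Two triangles sharing an edge, plus a third triangle touching only node 4.
edges = [(1, 2), (1, 3), (2, 3), (2, 4), (3, 4), (4, 5), (4, 6), (5, 6)]
communities = clique_percolation(edges, k=3)
```

Node 4 ends up in both communities, which shows how the method yields overlapping clusters.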
Analysing extended level parameter
To evaluate the predicted protein complexes, a curated catalogue of 408 protein complexes is compiled manually from published small-scale experimental data, and the reference protein complex dataset is created by retaining the 236 complexes with size at least 3; their average size is 6.7. Since a protein complex with size less than 3 is not meaningful, all predicted protein complexes analysed in this paper also have size at least 3.
Different datasets have different network topology, and the organizational structures of clusters in different datasets vary too. Therefore we need to adjust the extended level parameter of the algorithm to optimize the results of the algorithm on a given dataset.
Accuracy of algorithms
Table 1 Various performance indicators of the different algorithms on the Krogan and Collins datasets
As Table 1 shows, on the Krogan dataset MKE finds 35 more protein complexes and matches 48 more protein complexes than CFinder. Even though the Sn of CFinder is 0.227 higher than that of MKE, the PPV of MKE is notably higher than that of CFinder. MCL predicts many more clusters than MKE, but MKE matches more reference complexes than MCL, which indicates that in MCL several predicted protein complexes match the same reference protein complex. As with CFinder, the Sn of MCL is 0.027 higher than that of MKE, but the PPV of MKE is 0.114 higher than that of MCL.
On the Collins dataset, CFinder and MCL behave much as they do on the Krogan dataset. Because the Sn and PPV values of CFinder are extremely unbalanced, its Acc is low: 0.133 and 0.082 lower than that of MKE on the Krogan and Collins datasets, respectively. On the whole, therefore, MKE outperforms CFinder. Although the Sn and PPV of MCL are relatively balanced on both datasets and its Acc does not differ significantly from that of MKE, MCL is slightly inferior to MKE.
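The definitions of Sn, PPV and Acc are not reproduced in this text. The sketch below assumes the definitions of Brohée and van Helden, which appear in the reference list: Sn and PPV are computed from the contingency table of shared proteins between reference complexes and predicted clusters, and Acc is their geometric mean. The function name and toy data are hypothetical.

```python
import math

def sn_ppv_acc(reference, predicted):
    """reference, predicted: lists of sets of protein identifiers.

    ASSUMED definitions (Brohee and van Helden), with t[i][j] the number of
    proteins shared by reference complex i and predicted cluster j:
      Sn  = sum_i max_j t[i][j] / sum_i |reference_i|
      PPV = sum_j max_i t[i][j] / sum_j sum_i t[i][j]
      Acc = sqrt(Sn * PPV)
    """
    t = [[len(r & p) for p in predicted] for r in reference]
    sn = sum(max(row) for row in t) / sum(len(r) for r in reference)
    cols = list(zip(*t))                    # one tuple per predicted cluster
    total = sum(sum(col) for col in cols)
    ppv = sum(max(col) for col in cols) / total if total else 0.0
    return sn, ppv, math.sqrt(sn * ppv)

reference = [{"a", "b", "c", "d"}, {"e", "f"}]
predicted = [{"a", "b", "c"}, {"d", "e", "f"}, {"x", "y"}]
sn, ppv, acc = sn_ppv_acc(reference, predicted)
```

Because Acc is a geometric mean under this assumption, a method with very unbalanced Sn and PPV receives a lower Acc than one with the same arithmetic mean but balanced values, which matches the remark about CFinder.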
Function enrichment analysis
Let N denote the total number of nodes in the protein-protein interaction network and C the number of proteins within the predicted protein complex, and let k and F denote the number of proteins with a given function in the protein complex and in the PPI network, respectively. A very low P-value indicates that the proteins of the predicted complex are unlikely to share the given function by chance, so the complex is likely to be biologically significant.
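The P-value formula itself is not reproduced in this text. A common choice for this kind of functional enrichment, assumed here, is the upper tail of the hypergeometric distribution: the probability of observing at least k proteins with the function in a complex of size C drawn from a network in which F proteins have that function. The function and parameter names are hypothetical.

```python
from math import comb

def enrichment_p_value(n_total, n_func, c_size, k_hits):
    """Hypergeometric upper tail P(X >= k_hits).

    n_total: proteins in the network; n_func: proteins with the function in
    the network; c_size: proteins in the predicted complex; k_hits: proteins
    with the function in the complex.
    ASSUMPTION: this standard test stands in for the paper's missing formula.
    """
    denom = comb(n_total, c_size)
    upper = min(c_size, n_func)
    tail = sum(comb(n_func, i) * comb(n_total - n_func, c_size - i)
               for i in range(k_hits, upper + 1))
    return tail / denom

# 10 proteins in the network, 4 with the function; a complex of 3, all annotated.
p = enrichment_p_value(n_total=10, n_func=4, c_size=3, k_hits=3)
```

Summing the tail directly, rather than subtracting the lower terms from 1, avoids cancellation when the P-value is very small.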
The five protein complexes with minimal p-value by MKE mining algorithm
Gene Ontology term
26 out of 26 genes, 100.0%
RNA splicing, via transesterification reactions with bulged adenosine as nucleophile
15 out of 15 genes, 100.0%
19 out of 21 genes, 90.5%
19 out of 23 genes, 82.6%
modification-dependent protein catabolic process
17 out of 17 genes, 100.0%
transcription from RNA polymerase II promoter
67 out of 93 genes, 72.0%
25 out of 25 genes, 100.0%
transcription from RNA polymerase II promoter
24 out of 24 genes, 100.0%
14 out of 15 genes, 93.3%
mRNA 3'-end processing
16 out of 16 genes, 100.0%
Semantic similarity and co-localization enrichment
Where, the number of predicted protein complexes and reference cellular compartment protein complexes are respectively and , is the size of the predicted protein complex , is the number of nodes that are found both in reference cellular compartment protein complex and predicted protein complex .
The GO (Gene Ontology) semantic similarity of a protein complex is the average degree of association over all of its protein pairs. The GO semantic similarity of a protein complex set is the weighted average over all of its protein complexes. In general, a higher GO semantic similarity indicates a greater probability that the proteins within the complex have similar functions. In a protein complex set predicted by a given algorithm, the more protein complexes that are located in a single cellular compartment, the stronger the recognition capability of the algorithm. This paper adopts the genome co-localization reference dataset from the literature and the ProCope tool to analyse the GO semantic similarity and co-localization enrichment of the results predicted by each algorithm on the Krogan and Collins datasets.
GO semantic similarity and Co-localization enrichment analysis by algorithm MKE
Overall, although MKE is not better than all of the selected algorithms on every measure, it performs better than CFinder on both GO semantic similarity and co-localization enrichment, and it can effectively detect protein complexes with biological significance in the protein-protein interaction network.
Because of the structural complexity of the protein-protein interaction network and the limitations of its experimental validation, there is so far no convincing and strict standard for verifying protein complexes. For protein complex mining, the detection standard must therefore be settled first; that is, the structure of a protein complex needs to be defined.
The formation of the PPI network follows its intrinsic laws, and the network develops gradually from protein complexes with inherent links. Criticality is an important property of the protein complex. Critical proteins, which are high-degree nodes with many adjacent nodes, can always be found within protein complexes. Protein nodes with higher degree tend to be critical in their biological properties and, as kernels, play an important role in the protein complex. Thus one or more critical nodes can be taken as a kernel, around which many adjacent protein nodes are closely connected with each other, and these adjacent proteins in turn have adjacent nodes on their periphery. Together these nodes form a relatively independent set that is able to carry out relatively independent biological functions. In other words, such protein sets are most likely to form protein complexes.
Inspired by the community formation law of complex social networks and by the centrality-lethality rule, and incorporating the idea of critical protein node detection, this paper proposes a new protein complex mining algorithm, MKE, based on multistage kernel extension. MKE first identifies the innermost kernel of the protein complex, taking the critical nodes with high degree and many adjacent nodes as the first level kernel. MKE then expands the first level kernel into the second level kernel by adding the weighted best neighbour nodes to the current kernel, continues the expansion stage by stage to construct a protein complex, and finally merges overlapping protein complexes to form the protein complex set. MKE achieves better accuracy than the classical clique percolation method and the Markov clustering algorithm. MKE also performs better than the classical clique percolation method on both Gene Ontology semantic similarity and co-localization enrichment, and can effectively identify protein complexes with biological significance in the PPI network.
This research is supported by the Self-determined Research Funds of CCNU from the Colleges' Basic Research and Operation of MOE(No. CCNU14A02008, No. CCNU13C01001), the International Cooperation Project of Hubei Province (No. 2014BHE0017), the Program of Introducing Talents of Discipline to Universities (under grant No. B07042), the National Natural Science Foundation of China (under grant No. 31371275) and the Natural Science Foundation of Hubei Province (under grant 2013CKB024).
Publication of this article has been funded by NSF IIP 1160960, NSF IIP 1332024.
This article has been published as part of BMC Bioinformatics Volume 15 Supplement 12, 2014: Selected articles from the IEEE International Conference on Bioinformatics and Biomedicine (BIBM 2013): Bioinformatics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcbioinformatics/supplements/15/S12.
- Ma XK, Gao L: Discovering protein complexes in protein interaction networks via exploring the weak ties effect. BMC Systems Biology. 2012, 6 (Suppl 1).
- Dezső Z, Oltvai ZN, Barabási AL: Bioinformatics analysis of experimentally determined protein complexes in the yeast Saccharomyces cerevisiae. Genome Res. 2003, 13: 2450-2454. 10.1101/gr.1073603.
- Girvan M, Newman MEJ: Community Structure in Social and Biological Networks. Proc Natl Acad Sci USA. 2002, 99: 7821-7826. 10.1073/pnas.122653799.
- Antonio DS, Hirotomo F, O'Meara Paul: Topology of small-world networks of Protein-Protein complex structures. Bioinformatics. 2005, 21 (8): 1311-1315. 10.1093/bioinformatics/bti167.
- Stefan W: Scale-free behavior in protein domain networks. Molecular Biology And Evolution. 2001, 18 (9): 1694-1702. 10.1093/oxfordjournals.molbev.a003957.
- Wuchty Stefan, Ravasz Erzsébet, Barabási Albert-László: The architecture of biological networks. Complex Systems Science in Biomedicine. 2006, 165-181.
- Traver Hart G, Lee L, Edward RM: A high accuracy consensus map of yeast Protein complexes reveals modular nature of gene essentiality. BMC Bioinformatics. 2007, 8 (1): 236. 10.1186/1471-2105-8-236.
- Elena Zotenko, Julián Mestre, O'Leary Dianne P, Przytycka Teresa M: Why Do Hubs in the Yeast Protein Interaction Network Tend To Be Essential: Reexamining the Connection between the Network Topology and Essentiality. PLoS Computational Biology. 2008, 4 (8): e1000140. 10.1371/journal.pcbi.1000140.
- Jeong H, Mason SP, Barabási AL: Lethality and centrality in Protein networks. Nature. 2001, 411 (6833): 41-42. 10.1038/35075138.
- Krogan N, Gerard C, Yu Haiyuan: Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature. 2006, 440: 637-643. 10.1038/nature04670.
- Collins SR, Patrick K: Toward a comprehensive atlas of the physical interactome of Saccharomyces cerevisiae. Mol Cell Proteomics. 2007, 6: 439-450.
- Palla G, Derenyi I, Farkas I, Vicsek T: Uncovering the overlapping community structure of complex networks in nature and society. Nature. 2005, 435 (7043): 814-818. 10.1038/nature03607.
- Satuluri V, Srinivasan PA, Duygu U: Markov Clustering of Protein Interaction Networks with Improved Balance and Scalability. Proceedings of the First ACM International Conference on Bioinformatics and Computational Biology. 2010, Niagara Falls, NY, USA, 247-256.
- Mewes HW, Amid C, Arnold R: MIPS: analysis and annotation of proteins from whole genomes. Nucl Acids Res. 2004, 32 (sup. 1): D41-44.
- Pu S, Wong J, Turner B, Cho E, Wodak SJ: Up-to-date catalogues of yeast protein complexes. Nucleic Acids Res. 2009, 37: 825-831. 10.1093/nar/gkn1005.
- Brohee S, van Helden J: Evaluation of clustering algorithms for protein protein interaction networks. BMC Bioinformatics. 2006, 7: 488. 10.1186/1471-2105-7-488.
- Nepusz Tamás, Yu Haiyuan, Paccanaro Alberto: Detecting overlapping protein complexes in protein-protein interaction networks. Nature Methods. 2012, 9: 471-472.
- Boyle EI, Weng S, Gollub J, Jin H, Botstein D, Cherry JM, Sherlock G: GO::TermFinder-open source software for accessing Gene Ontology information and finding significantly enriched Gene Ontology terms associated with a list of genes. Bioinformatics. 2004, 20 (18): 3710-3715. 10.1093/bioinformatics/bth456.
- Friedel C, Krumsiek J, Zimmer R: Bootstrapping the interactome: unsupervised identification of protein complexes in yeast. Computational Molecular Biology. 2008, 4955: 3-16. 10.1007/978-3-540-78839-3_2.
- Schlicker A, Domingues F, Rahnenfuhrer J: A new measure for functional similarity of gene Products based on gene ontology. BMC Bioinformatics. 2006, 7 (1): 302.
- Huh W-K, Falvo JV, Gerke LC, Carroll AS, Howson RW, Weissman JS, O'Shea EK: Global analysis of protein localization in budding yeast. Nature. 2003, 425: 686-691. 10.1038/nature02026.
- Krumsiek J, Friedel CC, Zimmer R: ProCope-Protein complex Prediction and evaluation. Bioinformatics. 2008, 24 (18): 2115-2116.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.