Protein network prediction and topological analysis in Leishmania major as a tool for drug target selection
BMC Bioinformatics volume 11, Article number: 484 (2010)
Leishmaniasis is a virulent parasitic infection that causes a worldwide disease burden. Most treatments have toxic side-effects and efficacy has decreased due to the emergence of resistant strains. The outlook is worsened by the absence of promising drug targets for this disease. We have taken a computational approach to the detection of new drug targets, which may become an effective strategy for the discovery of new drugs for this tropical disease.
We have predicted the protein interaction network of Leishmania major by using three validated methods: PSIMAP, PEIMAP, and iPfam. Combining the results from these methods, we calculated a high confidence network (confidence score > 0.70) with 1,366 nodes and 33,861 interactions. We were able to predict the biological process for 263 interacting proteins by doing enrichment analysis of the clusters detected. Analyzing the topology of the network with metrics such as connectivity and betweenness centrality, we detected 142 potential drug targets after homology filtering with the human proteome. Further experiments can be done to validate these targets.
We have constructed the first protein interaction network of the Leishmania major parasite by using a computational approach. The topological analysis of the protein network enabled us to identify a set of candidate proteins that may be both (1) essential for parasite survival and (2) without human orthologs. These potential targets are promising for further experimental validation. This strategy, if validated, may augment established drug discovery methodologies, for this and possibly other tropical diseases, with a relatively low additional investment of time and resources.
Leishmaniasis is a complex infectious disease caused by several species of the Leishmania genus, affecting more than 2 million people in 88 countries around the world. In addition to endemic countries, there have been increasing numbers of cases in non-endemic countries due to tourism [1–5]. The parasite is transmitted to human or animal reservoirs by female sandflies of the genus Lutzomyia in the New World and Phlebotomus in the Old World. Leishmaniasis has three main clinical presentations: cutaneous, mucocutaneous and visceral. The visceral form affects mainly children, who can die if adequate treatment is not provided in a timely manner. The cutaneous and mucocutaneous forms can cause severe disabilities in adults, affecting productivity in rural areas. At present, there are no available vaccines for this disease in spite of multiple research efforts. The main measures for controlling the disease rely upon chemotherapy and vector control, which are tightly related given that human beings may act as reservoirs for the parasites in some endemic areas (anthropozoonotic transmission). In spite of these measures, the number of cases continues to increase in many endemic countries, such as Colombia.
Current anti-leishmanial therapy has been unsuccessful due to toxicity, varying sensitivity of different Leishmania species, diversity of host immune responses, and different pharmacokinetics of the drugs employed. The classical treatment for all forms of leishmaniasis has been pentavalent antimony in the form of sodium stibogluconate (Pentostam, GlaxoSmithKline) or meglumine antimoniate (Glucantime, Rhone-Poulenc). Severe side effects, including death, are associated with these compounds [8, 9], and increasing resistance to antimonials is currently a major problem in many endemic countries [2, 10]. Several other drugs, such as Pentamidine and Amphotericin B, have also been used for leishmaniasis treatment. However, side effects, route of administration (injection rather than a pill), high cost, and differences in efficacy against the different clinical forms of the disease constrain their widespread use as drugs of choice. More recently, Miltefosine, an oral drug originally developed as an antineoplastic compound, has been used successfully for treatment of visceral and cutaneous leishmaniasis [11, 12], but with variable efficacy in Central and South America. Moreover, a phase IV trial in India has shown an increase in the relapse rate with Miltefosine, indicating that drug resistance may develop quickly [14, 15]. For all these reasons, there is an urgent need for new, safe, and cheap anti-leishmanial compounds.
Drug discovery efforts, through public-private partnerships, for the primary protozoal parasitic diseases of the developing world (malaria, leishmaniasis, and trypanosomiasis) have renewed the interest in developing new drugs and vaccines that can be accessible to the affected, primarily poor, population. The drug discovery process begins with a search for drug targets that, in the case of infectious diseases, must fulfill two main requirements: (1) to be essential for parasite survival and (2) to be specific, in that the target should not have a counterpart in the human host that can give rise to toxic effects. However, there is no consensus yet on the best biological indicators of essentiality. Indicators such as expression level and subcellular localization have been used to classify proteins as druggable. However, these assumptions often do not account for the complexity of the underlying biological network of interactions among those proteins.
New research initiatives have been undertaken to collect genome sequences along with high-throughput expression and proteomic data from different organisms. This constitutes an important source of biological information that can be employed efficiently in the search for new drugs for a large number of human and veterinary diseases. Bioinformatics tools have enabled researchers to extract and manipulate this biological information with the goal of understanding protein function. Unfortunately, knowledge of the functions of individual proteins has not yet provided us with an understanding of the complexity of cellular behavior, and thus there is not yet a clear definition of essentiality. Proteins inside the cell typically do not function alone, but rather by interacting in concert with other proteins, generating a high-dimensional network with a complicated structure. Because of the networked nature of protein function, topological analysis of the protein network may help to identify essential proteins that are potential drug or vaccine targets. Recent studies carried out with experimental protein interaction networks of Saccharomyces cerevisiae and Caenorhabditis elegans [18, 19] have confirmed the effectiveness of topological metrics in predicting protein essentiality, demonstrating strong correlation with knockout and knockdown data. These studies have also expanded to organisms of medical importance, such as the protozoan parasite Plasmodium falciparum, in the interest of discovering new drug and vaccine targets. These data are available through the PlasmoID system. Topological analysis has also been useful in detecting important proteins even when the protein network has been predicted using an orthology-based method, as in the case of the human interactome.
In this work, we predicted the protein network of Leishmania major using protein sequences via three methods, iPfam, PSIMAP and PEIMAP. We analyzed the predicted protein network with the metrics of connectivity and betweenness centrality, in order to identify essential proteins. Protein interaction data were analyzed to detect GO enriched clusters, to determine the possible pathways of detected targets, and to infer the biological processes performed by proteins with unknown functional description. The list of putative protein targets is a starting point for experimental validation by in vitro assays and further discovery of new anti-leishmanial drugs.
Protein network prediction using PSIMAP, iPfam and PEIMAP
Predictions of protein-protein interactions (PPIs) were generated using the pipeline previously designed and applied in Xanthomonas oryzae, employing three different methods: PSIMAP, iPfam, and PEIMAP.
PSIMAP (http://psimap.com/) infers interactions between proteins by using interacting domain pairs from known PDB (Protein Data Bank) structures. We extracted protein sequences of Leishmania major from the GeneDB database (ftp://ftp.sanger.ac.uk/pub/databases/L.major_sequences/DATASETS/LmjFwholegenome_20070731_V5.2.pep). We aligned these sequences using PSI-BLAST against the SCOP 1.71 database with an E-value cutoff of 0.0001, as described previously. We predicted a total of 158,984 interactions for 3,184 proteins by applying PSIMAP domain pairs to the domain assignment. The original definition of interaction in this database is based on the atomic distance between domains in the structures of protein complexes.
We analyzed iPfam interactions using domain assignments from Pfam release 18.0, obtained with the tool hmmpfam with an E-value cutoff of 0.01. By integrating these assignments with Pfam domain interaction pairs from iPfam, a total of 50,398 predicted protein-protein interactions were constructed from 2,336 Leishmania proteins.
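The domain-based inference shared by PSIMAP and iPfam can be illustrated with a minimal sketch. It assumes that a protein pair is predicted to interact whenever the two proteins carry at least one known interacting domain pair; the protein and domain identifiers below are hypothetical, and the real pipelines apply additional scoring that is not modelled here.

```python
from itertools import combinations_with_replacement


def infer_interactions(protein_domains, domain_pairs):
    """Predict protein pairs that carry at least one known interacting domain pair."""
    # Store domain pairs as frozensets so that (A, B) and (B, A) are equivalent.
    known = {frozenset(pair) for pair in domain_pairs}
    predicted = set()
    proteins = sorted(protein_domains)
    # Self-pairs are included so that homodimer-type predictions are possible.
    for a, b in combinations_with_replacement(proteins, 2):
        for dom_a in protein_domains[a]:
            for dom_b in protein_domains[b]:
                if frozenset((dom_a, dom_b)) in known:
                    predicted.add((a, b))
    return predicted


# Hypothetical toy data: protein -> assigned domains, plus one interacting domain pair.
protein_domains = {"P1": {"DOM_A"}, "P2": {"DOM_B"}, "P3": {"DOM_C"}}
domain_pairs = [("DOM_A", "DOM_B")]
print(infer_interactions(protein_domains, domain_pairs))
```

Because every protein carrying a promiscuous domain inherits all of that domain's partners, this kind of inference tends to produce many candidate edges, which is one reason a confidence score is applied afterwards.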
We aligned Leishmania proteins with the PEIMAP database (http://peimap.kobic.re.kr) using BLASTP with a minimal cutoff of 40% sequence identity and 70% length coverage. The PEIMAP database includes protein-protein interaction (PPI) information from six source databases: DIP, BIND, IntAct, MINT, HPRD, and BioGrid. A total of 14,839 interactions were extracted involving 718 Leishmania proteins.
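The homology-transfer step can be sketched as follows. This is an illustration only: it assumes that coverage is measured as aligned length over query length, and the hit tuples and protein names are hypothetical rather than taken from the PEIMAP pipeline.

```python
def passes_cutoff(identity_pct, aligned_len, query_len,
                  min_identity=40.0, min_coverage=70.0):
    """Apply the identity and coverage thresholds to one alignment."""
    # Assumption: coverage is computed relative to the query (Leishmania) protein.
    coverage_pct = 100.0 * aligned_len / query_len
    return identity_pct >= min_identity and coverage_pct >= min_coverage


def transfer_interactions(hits, known_ppis):
    """Map known interactions onto query proteins through accepted alignments."""
    # hits: iterable of (query, subject, identity_pct, aligned_len, query_len)
    subject_to_queries = {}
    for query, subject, identity, aligned_len, query_len in hits:
        if passes_cutoff(identity, aligned_len, query_len):
            subject_to_queries.setdefault(subject, set()).add(query)
    predicted = set()
    for s1, s2 in known_ppis:
        for q1 in subject_to_queries.get(s1, ()):
            for q2 in subject_to_queries.get(s2, ()):
                predicted.add(tuple(sorted((q1, q2))))
    return predicted


# Hypothetical alignments and one known interaction between database proteins X1 and X2.
hits = [
    ("LmA", "X1", 55.0, 90, 100),   # accepted
    ("LmB", "X2", 45.0, 80, 100),   # accepted
    ("LmC", "X2", 35.0, 95, 100),   # rejected: identity below 40%
]
known_ppis = [("X1", "X2")]
print(transfer_interactions(hits, known_ppis))
```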
Selecting confident predicted protein interactions
We used the 'combined score' method, which has been applied previously and is also used in the STRING database. This method takes into account the reliability of each method (PEIMAP, PSIMAP and iPfam), assuming independence among them. The score is calculated according to the formula:
Where score is the confidence score; E is the set of methods under analysis (PEIMAP, PSIMAP, iPfam); R_i is the reliability of method i; and n is the number of interactions predicted by method i. The reliability score of PEIMAP comes from previously reported data that take into account the reliability of each experimental method for detecting protein interactions. The reliability score of iPfam is extracted from the score between two Pfam domains in the iPfam database. Finally, the reliability score of PSIMAP uses the calculated distance between interacting structural domains (SCOP).
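A combined score of this kind can be sketched under stated assumptions. The version below uses the noisy-OR form commonly associated with STRING-style evidence integration, score = 1 - prod(1 - R_i), in which independent lines of evidence each reduce the probability that the interaction is spurious. How the count n enters the authors' formula is not modelled; here each supporting piece of evidence simply contributes one term, which is an assumption.

```python
from math import prod


def combined_score(reliabilities):
    """Noisy-OR combination of independent evidence reliabilities (each in [0, 1])."""
    # Probability that at least one piece of evidence is correct,
    # assuming the pieces of evidence are independent.
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical reliabilities for one protein pair supported by two methods.
print(combined_score([0.5, 0.4]))  # two moderate sources reinforce each other
print(combined_score([0.9]))       # a single strong source
```

Under this form the score never decreases when evidence is added, so a pair supported by several moderately reliable methods can pass a cutoff that none of them would pass alone.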
The final score was further normalized to the range of 0.0 to 1.0, combining all the scores. We selected 1,366 Leishmania proteins participating in 33,861 high-confidence PPIs (confidence score > 0.7) by combining the results from the three methods employed (Additional file 1: Cytoscape network of Leishmania interactome). To evaluate the confidence of the metric results, the clustering coefficient and mean shortest path were compared against 1,000 random networks generated with the Random Network Plugin in Cytoscape, and empirical p-values were computed.
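The comparison against random networks can be illustrated with a small sketch. It assumes a simple null model (random graphs with the same number of nodes and edges) and the average local clustering coefficient; the randomization used by the Cytoscape plugin may differ, and the toy graph below is hypothetical.

```python
import random
from itertools import combinations


def build_adj(nodes, edges):
    """Undirected adjacency sets from an edge list."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def average_clustering(adj):
    """Mean local clustering coefficient; nodes with degree < 2 contribute 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def random_graph(nodes, n_edges, rng):
    """Random graph with the same node set and edge count (assumed null model)."""
    return build_adj(nodes, rng.sample(list(combinations(nodes, 2)), n_edges))


def empirical_p_value(observed, null_values):
    """Fraction of null values at least as large as observed, with +1 correction."""
    exceed = sum(1 for x in null_values if x >= observed)
    return (exceed + 1) / (len(null_values) + 1)


# Hypothetical toy network: two 4-cliques joined by a single edge.
nodes = list(range(8))
edges = list(combinations(range(4), 2)) + list(combinations(range(4, 8), 2)) + [(3, 4)]
observed = average_clustering(build_adj(nodes, edges))
rng = random.Random(0)
null = [average_clustering(random_graph(nodes, len(edges), rng)) for _ in range(1000)]
p = empirical_p_value(observed, null)
print(observed, p)
```

The +1 correction keeps the empirical p-value strictly positive, so with 1,000 random networks the smallest reportable value is 1/1001.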
Detection of essential proteins with topological metrics and homology filtering with human proteome
The power law fit for the protein network was calculated using Network Analyzer v.2.6.1. Network topology metrics, such as betweenness centrality, connectivity, and the Double Scoring Scheme (DSS), were used to detect essential genes, using the Hubba server (http://hub.iis.sinica.edu.tw/Hubba). This method takes into account weighted edges (confidence scores). The calculations were done over the largest component of the network, with a 0.7 confidence cutoff. This cutoff was chosen because it gave a better fit of the network's degree distribution to a power law. The detected targets were filtered by discarding Leishmania orthologs of human proteins.
Clustering and GO enrichment analysis
We conducted cluster analysis of the largest component in the network in order to detect protein complexes and pathways. We used the Markov Clustering (MCL) algorithm [40, 41], which has been shown to be a robust and fast algorithm for detecting clusters in protein networks, using the implementation in the NeAT tool. For proteins of unknown function in the GeneDB database (http://www.genedb.org/Homepage/Lmajor), we predicted their possible biological roles by evaluating the results of GO enrichment analysis, using the BinGO plugin for Cytoscape.
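The statistical idea behind GO enrichment of a cluster can be sketched with a one-sided hypergeometric test, a common choice for over-representation analysis. This is an assumption about the test rather than a description of the plugin's configuration, no multiple-testing correction is shown, and the counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, cluster_size, cluster_annotated):
    """P(X >= cluster_annotated) when cluster_size proteins are drawn without replacement.

    population        -- proteins in the network
    annotated         -- proteins in the network carrying the GO term
    cluster_size      -- proteins in the cluster
    cluster_annotated -- proteins in the cluster carrying the GO term
    """
    upper = min(annotated, cluster_size)
    total = comb(population, cluster_size)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, cluster_size - k)
        for k in range(cluster_annotated, upper + 1)
    )
    return tail / total


# Hypothetical counts: 10 proteins, 3 carry the term, and a 3-protein cluster holds all 3.
print(hypergeom_enrichment_p(10, 3, 3, 3))
```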
Results and Discussion
We constructed a protein-protein interaction (PPI) map, combining the results generated by PEIMAP, iPfam and PSIMAP. Despite the absence of protein interaction data for Leishmania major and the fact that protein interaction data from single organisms may contain some false positives that can bias the results, the use of interaction data from different species can help to reduce the noise in the predicted network. Comparison to random networks and the use of experimental evidence that confirms the essentiality of some of the predicted targets are indirect ways of validating the calculated PPI map. Other studies have successfully applied this approach to discovering drug targets using computational methods to predict protein networks, e.g., in the rice blast fungus, M. tuberculosis, and Homo sapiens [45–47]. The predicted Leishmania major interactome can be a starting point for future experimental PPI maps in Leishmania, particularly given the fact that many interactions may require post-translational modifications that may not occur in yeast, thus making it difficult to perform yeast two-hybrid assays for this organism. The entire predicted network comprises 3,991 nodes and 190,708 interactions (including self loops and duplicate edges). The reduced coverage is likely due to the inability to perform domain assignment for several proteins in Leishmania. Only 18.0% of the Leishmania proteome is conserved across species (as defined in the CluSTr database: http://www.ebi.ac.uk/integr8/ClustrAnalysisPageOnly.do?orgProteomeID=21780). This is a common limitation of orthology-based methods for protein network prediction.
It has been proposed that biological networks follow a power law distribution that corresponds to scale-free topology. This is a global property of biological networks and it is important for a reliable prediction of essentiality when the metrics of connectivity and betweenness centrality are used. We fitted the node degree distribution to a power law using the least squares method, to determine whether our predicted network was consistent with scale-free topology. This resulted in an exponent of -0.867 (R² = 0.556) for the 0.60 confidence network. However, the calculated distribution for the 0.70 confidence network showed an appreciable increase of the R² coefficient to 0.758 and a change of the exponent to -1.199. This result does not correlate well with a power law distribution, possibly because subnetworks can have a different degree distribution compared to the entire interactome. Moreover, it has been pointed out that geometric models could fit better than a power law distribution [50, 51]. In spite of these limitations, we chose the 0.70 confidence cutoff, given that the network generated by applying this cutoff fits better with a scale-free topology. This also enabled us to claim with more confidence that a detected hub or bottleneck node may be essential for the network.
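A least-squares power law fit of a degree distribution can be sketched as a linear regression in log-log space. The sketch assumes the fit is performed on the number of nodes per degree value, which may differ in detail from the Network Analyzer implementation; the degree list is synthetic and constructed to follow an exact power law with exponent -2.

```python
import math
from collections import Counter


def fit_power_law(degrees):
    """Fit count(k) = a * k**b by least squares on log-log data; return (b, R^2)."""
    counts = Counter(d for d in degrees if d > 0)
    xs = [math.log10(k) for k in counts]
    ys = [math.log10(c) for c in counts.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # For simple linear regression, R^2 equals the squared correlation coefficient.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return slope, r_squared


# Synthetic degrees: 100 nodes of degree 1, 25 of degree 2, 4 of degree 5, 1 of degree 10.
degrees = [1] * 100 + [2] * 25 + [5] * 4 + [10] * 1
exponent, r2 = fit_power_law(degrees)
print(exponent, r2)
```

An R² well below 1, as reported for both confidence cutoffs, indicates that a straight line in log-log space describes the observed degree counts only approximately.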
Identifying putative drug targets
Once the power law distribution is partially confirmed, other topological characteristics can be biologically meaningful. With this in mind, we conducted local topology analysis to identify hubs and bottlenecks that could be putative drug targets. We calculated connectivity and betweenness centrality over the 1,366-node network with 33,861 interactions (> 0.70 confidence). For all of these calculations we used the largest component and excluded isolated components from the larger original network. The main reason is that betweenness centrality counts the shortest paths that pass through a particular node, and shortest paths to nodes in disconnected components are undefined (infinite in length), which makes interpretation more difficult. The clustering coefficient and the mean shortest path of the network were compared against 1,000 random networks (Table 1). We found that the clustering coefficient of our protein network is higher than the clustering coefficient values of the randomly generated networks. These results suggest that our network exhibits a modular architecture like other biological networks. This makes us more confident that the clusters might correlate with biological pathways. The mean shortest path is also significantly different from that of the random networks.
It has been shown that measures of connectivity and betweenness centrality improve the identification of essential proteins in protein networks. Betweenness centrality correlates more closely with essentiality than connectivity, exposing critical nodes that usually belong to the group of scaffold proteins or proteins involved in crosstalk between signalling pathways (called bottlenecks). This metric has also been proposed in the new paradigm of network pharmacology as a good feature for investigating potential drug targets. In the Leishmania major network, we selected the top 10% of the connectivity ranking as hub nodes and the top 20% of the betweenness centrality ranking as bottlenecks, according to previous methods for selecting such cutoffs [54, 56]. In addition, a recently developed tool, HUBBA, provides an alternative way of predicting essential nodes by combining two metrics: DMNC (Density of Maximum Neighborhood Component) and MNC (Maximum Neighborhood Component). Together, they are referred to as the Double Scoring Scheme (DSS). We applied the DSS system to our high confidence network with the goal of extending the range of potential drug targets. We chose the cutoff of the top 10 proteins identified by this tool, because that cutoff identifies the group with the highest probability of being essential (close to 100%). However, we found that this group overlaps with the group of detected hubs.
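Hub and bottleneck selection can be sketched with a small, self-contained implementation. It assumes an unweighted, undirected, connected graph and uses Brandes' algorithm for betweenness centrality; the analysis described here used weighted edges through the Hubba server, which is not reproduced. The `top_fraction` helper and the toy path graph are hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality for an undirected, unweighted graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies in order of decreasing distance from s.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2.0 for v, b in bc.items()}


def top_fraction(scores, fraction):
    """Nodes in the top `fraction` of a ranking (at least one node)."""
    k = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


# Hypothetical toy network: the path a - b - c - d - e.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "e"], "e": ["d"]}
bc = betweenness(adj)
degree = {v: len(nbrs) for v, nbrs in adj.items()}
bottlenecks = top_fraction(bc, 0.20)
hubs = top_fraction(degree, 0.10)
print(bc, bottlenecks, hubs)
```

In the path graph the middle node has the highest betweenness even though three nodes share the highest degree, which illustrates why the two rankings can select different proteins.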
In this first detection, which combines the results from connectivity, betweenness centrality, and DSS, we identified 384 potential targets, shown in Additional file 2: table S1. Once detected, targets need to be checked for orthologs in the human proteome, given that drugs that bind conserved sites could perturb the corresponding human protein with possible toxic consequences. Using the list of Leishmania orthologs of human proteins from the TDR database, we filtered the list of targets, removing those with homology to human proteins. The ortholog detection in the TDR database was performed using the OrthoMCL algorithm, which has shown high sensitivity compared to other methods, a feature that is critical for identifying all of the possible human orthologs of Leishmania proteins. Once the Leishmania-human ortholog proteins were ruled out, the total number of potential protein targets was reduced to 142 (Additional file 3: table S2). The network visualization of the targets is shown in Figure 1.
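The combination and filtering step amounts to set operations, sketched below. The protein identifiers are hypothetical placeholders, and the union of the three rankings followed by removal of human orthologs is one reading of the procedure described above.

```python
def candidate_targets(hubs, bottlenecks, dss_top, human_orthologs):
    """Union of the three topological selections minus proteins with human orthologs."""
    essential = set(hubs) | set(bottlenecks) | set(dss_top)
    return essential - set(human_orthologs)


# Hypothetical identifiers, for illustration only.
hubs = {"prot1", "prot2", "prot3"}
bottlenecks = {"prot3", "prot4"}
dss_top = {"prot1"}
human_orthologs = {"prot2", "prot4", "prot9"}
print(sorted(candidate_targets(hubs, bottlenecks, dss_top, human_orthologs)))
```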
Gene Ontology enrichment analysis and function prediction
It has been shown that detecting modular structures inside a biological network provides insights into the functional organization of cellular processes in living organisms. In addition, it has been recognized that cluster detection combined with functional enrichment analysis enables the prediction of the biological function of proteins associated with a cluster. We applied the MCL algorithm to generate clusters in the network, setting an inflation value of 1.8 and considering edge weights (confidence scores) for the calculations. Functional enrichment was carried out using BinGO, importing the Leishmania major annotation from Gene Ontology (GO) (http://geneontology.org). We generated 63 clusters for the network. For each of those clusters, we assigned the most significant GO biological process. These results are shown in Additional file 4: table S3.
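The role of the inflation parameter can be illustrated with a compact, dense-matrix sketch of Markov clustering. It is a teaching example under simplifying assumptions (added self-loops, a fixed number of iterations, no pruning) and is not the NeAT implementation; the toy graph and weights are hypothetical.

```python
def mcl(weights, n, inflation=1.8, iterations=50, eps=1e-6):
    """Cluster nodes 0..n-1 from undirected weighted edges {(i, j): w}."""
    m = [[0.0] * n for _ in range(n)]
    for (i, j), w in weights.items():
        m[i][j] = w
        m[j][i] = w
    for i in range(n):
        m[i][i] = max(m[i][i], 1.0)  # self-loops stabilize the random walk

    def normalize(mat):
        # Make every column sum to 1 (column-stochastic matrix).
        for j in range(n):
            col = sum(mat[i][j] for i in range(n))
            for i in range(n):
                mat[i][j] /= col
        return mat

    m = normalize(m)
    for _ in range(iterations):
        # Expansion: two-step random walk (matrix squaring).
        sq = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
              for i in range(n)]
        # Inflation: raise entries to a power, then renormalize the columns.
        m = normalize([[x ** inflation for x in row] for row in sq])

    # Rows that keep non-zero entries are attractors; their non-zero columns
    # are the members of the corresponding cluster.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > eps)
        if members:
            clusters.add(members)
    return clusters


# Hypothetical toy network: two triangles with no edge between them.
weights = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0,
           (3, 4): 1.0, (4, 5): 1.0, (3, 5): 1.0}
clusters = mcl(weights, n=6)
print(sorted(sorted(c) for c in clusters))
```

Higher inflation values strengthen strong flows relative to weak ones more aggressively and generally yield smaller, more numerous clusters.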
Close neighbours in protein interaction networks are frequently involved in similar processes and it has been shown that 70-80% of proteins in a cluster share at least one function. This implies that any unclassified protein could be tentatively assigned the function of its neighbours [60, 61]. We found that 263 proteins without functional description in the GeneDB database are related to well-defined clusters. We assigned a biological process to those proteins based on the probability of membership in a specific GO enriched cluster. By this method, we predicted new protein roles for Leishmania major that were previously unknown using current annotation procedures (Additional file 5: table S4).
The largest cluster contains 15% of the proteins in the network. They participate mainly in protein amino acid phosphorylation (GO:0006468) (p-value < 0.00001). Within the group of detected targets with no human counterpart, we found that 64% of the targets were also enriched in the protein amino acid phosphorylation process (Figure 2, Additional file 6: table S5), followed by proteins involved in nucleosome assembly (GO:0006334) 8%, nucleic acid metabolic process (GO:0006139) 4%, electron transport (GO:0006118) 4%, transport (GO:0006810) 4%, and protein amino acid alkylation (GO:0006139) 2%. The remaining proteins were distributed across processes with one protein per process and classified as 'other'; these accounted for 14% of the target proteins. This analysis suggests phosphoproteins as the main group to characterize and explore as drug targets. Proteins involved in nucleic acid metabolism also should be explored as possible drug targets, given that Leishmania does not have the enzymatic machinery to synthesize purines de novo. Interestingly, proteins associated with nucleosome assembly appear as alternative options.
Experimental evidence of the essentiality of predicted targets
As mentioned above, there is a significant proportion of phosphoproteins in the group of essential genes. This is plausible, given that these proteins are important regulators of differentiation and cell proliferation in many eukaryotes. However, it has been pointed out that the Leishmania kinome differs in particular ways from other eukaryotic kinomes, as reviewed previously. We identified 91 kinases that were predicted as essential proteins in the network with no homology to the human kinome. This is an interesting and new group of potential targets for future drug screening in this organism, perhaps by using transfectant parasites as in the methodology developed by our group. Within this group of kinases, LMPK [GeneDB:LmjF36.6470] has been shown experimentally to be essential in Leishmania mexicana, with orthologs in L. amazonensis, L. major, L. tropica, L. aethiopica, L. donovani, L. infantum, and L. braziliensis. There is a growing interest in this protein as a drug and vaccine candidate, given its importance in parasite proliferation at the amastigote stage.
A previous study reconstructed the metabolic network of Leishmania major from the literature and carried out flux balance analysis to predict potential drug targets. However, when we compared the list of single knockouts predicted by that model with the list of targets derived using topological methods, we did not find any overlap. This could be due to the fact that metabolic networks connect proteins through the metabolites of the reactions they catalyze and not by direct interaction. However, when we analyzed the list of double knockouts, we found that adenylate kinase [GeneDB:LmjF36.1360] was predicted to be essential in our network and was also present in a double knockout pair of the metabolic network [GeneDB:LmjF36.1360, LmjF25.2370]. This is highlighted in red in Figure 3. This suggests that redundancy in metabolic networks can also be detected by computing betweenness centrality in protein networks. Inhibition of this protein caused low growth in L. donovani promastigotes, and homology searching identified orthologs in L. braziliensis, L. infantum, T. brucei, and T. cruzi. This would be advantageous in developing a drug for a wide spectrum of tropical diseases. Additionally, a DrugBank (http://www.drugbank.ca) search showed that the drug Gemcitabine could also have an inhibitory effect upon this protein, illustrating the potential use of this drug for tropical diseases besides its current use in cancer.
[GeneDB:LmjF35.1180 and GeneDB:LmjF35.0830] are fumarate reductase and fumarate reductase-like proteins that have been predicted as essential in our network. Neither has a human ortholog. Some reports have shown that compounds such as chalcones and aurones have a very potent inhibitory effect on these enzymes, making them interesting compounds for future drug development.
Three ABC transporters that were Leishmania specific, [GeneDB:LmjF34.0670, GeneDB:LmjF27.0470, GeneDB:LmjF32.2060], were also predicted as essential. They confer resistance to antimonials and pentamidine by extruding the drug from the cell. Some research groups are investigating inhibitors for this family of transporters, with the goal of reverting the resistant phenotype to a susceptible phenotype. Based upon our analysis, we also identify these proteins as putative drug targets because of their essential role in the homeostasis of the parasite intracellular environment.
A final example that corroborates our findings with experimental data is the detection of sterol 24-C-methyltransferases [GeneDB:LmjF36.2390, GeneDB:LmjF36.2380] as essential and exclusive in our network. These enzymes are involved in the biosynthesis of ergosterol, which is a target pathway in Leishmania and fungi given its exclusivity and essentiality. Also, a recent study identified this methyltransferase as a promising drug target in Cryptococcus neoformans. Moreover, this enzyme has recently been tested as an effective vaccine candidate in visceral leishmaniasis.
Finally, we looked at the expression level of the exclusive predicted targets in the microarray data reported by Leifso et al., and we did not find any significant overexpression of the predicted essential genes at the amastigote stage. This could be expected, given that few genes have been found to be up- or down-regulated between the promastigote and amastigote stages. This suggests that essentiality may not be related to gene expression in the case of Leishmania, given that regulation of protein abundance probably occurs at the post-transcriptional level.
This work constitutes the first attempt to explore protein interaction networks in the Leishmania major parasite by using in silico methods. We have provided a putative list of essential proteins, some of them backed by experimental evidence reported in the literature. Of special interest are the predicted essential kinases, which constitute an important group of Leishmania proteins to be explored as sources of new drug targets, given that they are important for parasite survival while having no homology to the human kinome. Further experimental studies are required to identify specific inhibitors. These results will aid future drug discovery efforts for this disease, enabling drug development in a more timely and cost-effective manner.
Herwaldt BL: Leishmaniasis. Lancet 1999, 354(9185):1191–1199. 10.1016/S0140-6736(98)10178-2
Lawn SD, Armstrong M, Chilton D, Whitty CJ: Electrocardiographic and biochemical adverse effects of sodium stibogluconate during treatment of cutaneous and mucosal leishmaniasis among returned travellers. Trans R Soc Trop Med Hyg 2006, 100(3):264–269. 10.1016/j.trstmh.2005.03.012
Antinori S, Gianelli E, Calattini S, Longhi E, Gramiccia M, Corbellino M: Cutaneous leishmaniasis: an increasing threat for travellers. Clin Microbiol Infect 2005, 11(5):343–346. 10.1111/j.1469-0691.2004.01046.x
Scarisbrick JJ, Chiodini PL, Watson J, Moody A, Armstrong M, Lockwood D, Bryceson A, Vega-Lopez F: Clinical features and diagnosis of 42 travellers with cutaneous leishmaniasis. Travel Med Infect Dis 2006, 4(1):14–21. 10.1016/j.tmaid.2004.11.002
Stark D, van Hal S, Lee R, Marriott D, Harkness J: Leishmaniasis, an emerging imported infection: report of 20 cases from Australia. J Travel Med 2008, 15(5):351–354. 10.1111/j.1708-8305.2008.00223.x
Noazin S, Modabber F, Khamesipour A, Smith PG, Moulton LH, Nasseri K, Sharifi I, Khalil EA, Bernal ID, Antunes CM, et al.: First generation leishmaniasis vaccines: a review of field efficacy trials. Vaccine 2008, 26(52):6759–6767. 10.1016/j.vaccine.2008.09.085
Piscopo TV, Mallia AC: Leishmaniasis. Postgrad Med J 2006, 82(972):649–657. 10.1136/pgmj.2006.047340
Palumbo E: Current treatment for cutaneous leishmaniasis: a review. Am J Ther 2009, 16(2):178–182. 10.1097/MJT.0b013e3181822e90
Sadeghian G, Ziaei H, Sadeghi M: Electrocardiographic changes in patients with cutaneous leishmaniasis treated with systemic glucantime. Ann Acad Med Singapore 2008, 37(11):916–918.
Grogl M, Thomason TN, Franke ED: Drug resistance in leishmaniasis: its implication in systemic chemotherapy of cutaneous and mucocutaneous disease. Am J Trop Med Hyg 1992, 47(1):117–126.
Le Pape P: Development of new antileishmanial drugs--current knowledge and future prospects. J Enzyme Inhib Med Chem 2008, 23(5):708–718. 10.1080/14756360802208137
Berman JJ: Treatment of leishmaniasis with miltefosine: 2008 status. Expert Opin Drug Metab Toxicol 2008, 4(9):1209–1216. 10.1517/17425255.4.9.1209
Soto J, Arana BA, Toledo J, Rizzo N, Vega JC, Diaz A, Luz M, Gutierrez P, Arboleda M, Berman JD, et al.: Miltefosine for new world cutaneous leishmaniasis. Clin Infect Dis 2004, 38(9):1266–1272. 10.1086/383321
Zerpa O, Ulrich M, Blanco B, Polegre M, Avila A, Matos N, Mendoza I, Pratlong F, Ravel C, Convit J: Diffuse cutaneous leishmaniasis responds to miltefosine but then relapses. Br J Dermatol 2007, 156(6):1328–1335. 10.1111/j.1365-2133.2007.07872.x
Agrawal S, Rai M, Sundar S: Management of visceral leishmaniasis: Indian perspective. J Postgrad Med 2005, 51(Suppl 1):S53–57.
Seringhaus M, Paccanaro A, Borneman A, Snyder M, Gerstein M: Predicting essential genes in fungal genomes. Genome Res 2006, 16(9):1126–1135. 10.1101/gr.5144106
Cascante M, Boros LG, Comin-Anduix B, de Atauri P, Centelles JJ, Lee PW: Metabolic control analysis in drug discovery and disease. Nat Biotechnol 2002, 20(3):243–249. 10.1038/nbt0302-243
Jeong H, Mason SP, Barabasi AL, Oltvai ZN: Lethality and centrality in protein networks. Nature 2001, 411(6833):41–42. 10.1038/35075138
Lee I, Lehner B, Crombie C, Wong W, Fraser AG, Marcotte EM: A single gene network accurately predicts phenotypic effects of gene perturbation in Caenorhabditis elegans. Nat Genet 2008, 40(2):181–188. 10.1038/ng.2007.70
LaCount DJ, Vignali M, Chettier R, Phansalkar A, Bell R, Hesselberth JR, Schoenfeld LW, Ota I, Sahasrabudhe S, Kurschner C, et al.: A protein interaction network of the malaria parasite Plasmodium falciparum. Nature 2005, 438(7064):103–107. 10.1038/nature04104
Rao A, Yeleswarapu SJ, Raghavendra G, Srinivasan R, Bulusu G: PlasmoID: A P. falciparum protein information discovery tool. In Silico Biol 2009, 9(4):195–202.
Jonsson PF, Bates PA: Global topological features of cancer proteins in the human interactome. Bioinformatics 2006, 22(18):2291–2297. 10.1093/bioinformatics/btl390
Kim JG, Park D, Kim BC, Cho SW, Kim YT, Park YJ, Cho HJ, Park H, Kim KB, Yoon KO, et al.: Predicting the interactome of Xanthomonas oryzae pathovar oryzae for target selection and DB service. BMC Bioinformatics 2008, 9: 41. 10.1186/1471-2105-9-41
Park D, Lee S, Bolser D, Schroeder M, Lappe M, Oh D, Bhak J: Comparative interactomics analysis of protein family interaction networks using PSIMAP (protein structural interactome map). Bioinformatics 2005, 21(15):3234–3240. 10.1093/bioinformatics/bti512
Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997, 25(17):3389–3402. 10.1093/nar/25.17.3389
Hubbard TJ, Murzin AG, Brenner SE, Chothia C: SCOP: a structural classification of proteins database. Nucleic Acids Res 1997, 25(1):236–239. 10.1093/nar/25.1.236
Park J, Lappe M, Teichmann SA: Mapping protein family interactions: intramolecular and intermolecular protein family interaction repertoires in the PDB and yeast. J Mol Biol 2001, 307(3):929–938. 10.1006/jmbi.2001.4526
Finn RD, Mistry J, Schuster-Bockler B, Griffiths-Jones S, Hollich V, Lassmann T, Moxon S, Marshall M, Khanna A, Durbin R, et al.: Pfam: clans, web tools and services. Nucleic Acids Res 2006, (34 Database):D247–251. 10.1093/nar/gkj149
Finn RD, Marshall M, Bateman A: iPfam: visualization of protein-protein interactions in PDB at domain and amino acid resolutions. Bioinformatics 2005, 21(3):410–412. 10.1093/bioinformatics/bti011
Xenarios I, Rice DW, Salwinski L, Baron MK, Marcotte EM, Eisenberg D: DIP: the database of interacting proteins. Nucleic Acids Res 2000, 28(1):289–291. 10.1093/nar/28.1.289
Bader GD, Donaldson I, Wolting C, Ouellette BF, Pawson T, Hogue CW: BIND--The Biomolecular Interaction Network Database. Nucleic Acids Res 2001, 29(1):242–245. 10.1093/nar/29.1.242
Hermjakob H, Montecchi-Palazzi L, Lewington C, Mudali S, Kerrien S, Orchard S, Vingron M, Roechert B, Roepstorff P, Valencia A, et al.: IntAct: an open source molecular interaction database. Nucleic Acids Res 2004, (32 Database):D452–455. 10.1093/nar/gkh052
Zanzoni A, Montecchi-Palazzi L, Quondam M, Ausiello G, Helmer-Citterich M, Cesareni G: MINT: a Molecular INTeraction database. FEBS Lett 2002, 513(1):135–140. 10.1016/S0014-5793(01)03293-8
Peri S, Navarro JD, Kristiansen TZ, Amanchy R, Surendranath V, Muthusamy B, Gandhi TK, Chandrika KN, Deshpande N, Suresh S, et al.: Human protein reference database as a discovery resource for proteomics. Nucleic Acids Res 2004, (32 Database):D497–501. 10.1093/nar/gkh070
Stark C, Breitkreutz BJ, Reguly T, Boucher L, Breitkreutz A, Tyers M: BioGRID: a general repository for interaction datasets. Nucleic Acids Res 2006, (34 Database):D535–539. 10.1093/nar/gkj109
von Mering C, Jensen LJ, Snel B, Hooper SD, Krupp M, Foglierini M, Jouffre N, Huynen MA, Bork P: STRING: known and predicted protein-protein associations, integrated and transferred across organisms. Nucleic Acids Res 2005, (33 Database):D433–437.
Chua HN, Sung WK, Wong L: Exploiting indirect neighbours and topological weight to predict protein function from protein-protein interactions. Bioinformatics 2006, 22(13):1623–1630. 10.1093/bioinformatics/btl145
Shannon P, Markiel A, Ozier O, Baliga NS, Wang JT, Ramage D, Amin N, Schwikowski B, Ideker T: Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 2003, 13(11):2498–2504. 10.1101/gr.1239303
Assenov Y, Ramirez F, Schelhorn SE, Lengauer T, Albrecht M: Computing topological parameters of biological networks. Bioinformatics 2008, 24(2):282–284. 10.1093/bioinformatics/btm554
van Dongen S: Graph clustering by flow simulation. PhD thesis. University of Utrecht; 2000. [http://www.library.uu.nl/digiarchief/dip/diss/1895620/inhoud.htm]
Enright AJ, Van Dongen S, Ouzounis CA: An efficient algorithm for large-scale detection of protein families. Nucleic Acids Res 2002, 30(7):1575–1584. 10.1093/nar/30.7.1575
Brohee S, van Helden J: Evaluation of clustering algorithms for protein-protein interaction networks. BMC Bioinformatics 2006, 7: 488. 10.1186/1471-2105-7-488
Brohee S, Faust K, Lima-Mendez G, Vanderstocken G, van Helden J: Network Analysis Tools: from biological networks to clusters and pathways. Nat Protoc 2008, 3(10):1616–1629. 10.1038/nprot.2008.100
Sharan R, Suthram S, Kelley RM, Kuhn T, McCuine S, Uetz P, Sittler T, Karp RM, Ideker T: Conserved patterns of protein interaction in multiple species. Proc Natl Acad Sci USA 2005, 102(6):1974–1979. 10.1073/pnas.0409522102
He F, Zhang Y, Chen H, Zhang Z, Peng YL: The prediction of protein-protein interaction networks in rice blast fungus. BMC Genomics 2008, 9: 519. 10.1186/1471-2164-9-519
Raman K, Kalidas Y, Chandra N: targetTB: A target identification pipeline for Mycobacterium tuberculosis through an interactome, reactome and genome-scale structural analysis. BMC Syst Biol 2008, 2(1):109.
Hwang S, Son SW, Kim SC, Kim YJ, Jeong H, Lee D: A protein interaction network associated with asthma. J Theor Biol 2008, 252(4):722–731. 10.1016/j.jtbi.2008.02.011
Myler, Fasel: Leishmania After the Genome. Caister Academic Press; 2008.
Barabasi AL, Oltvai ZN: Network biology: understanding the cell's functional organization. Nat Rev Genet 2004, 5(2):101–113. 10.1038/nrg1272
Han JD, Dupuy D, Bertin N, Cusick ME, Vidal M: Effect of sampling on topology predictions of protein-protein interaction networks. Nat Biotechnol 2005, 23(7):839–844. 10.1038/nbt1116
Przulj N, Corneil DG, Jurisica I: Modeling interactome: scale-free or geometric? Bioinformatics 2004, 20(18):3508–3515. 10.1093/bioinformatics/bth436
Batada NN, Hurst LD, Tyers M: Evolutionary and physiological importance of hub proteins. PLoS Comput Biol 2006, 2(7):e88. 10.1371/journal.pcbi.0020088
Joy MP, Brock A, Ingber DE, Huang S: High-betweenness proteins in the yeast protein interaction network. J Biomed Biotechnol 2005, 2005(2):96–103. 10.1155/JBB.2005.96
Yu H, Kim PM, Sprecher E, Trifonov V, Gerstein M: The importance of bottlenecks in protein networks: correlation with gene essentiality and expression dynamics. PLoS Comput Biol 2007, 3(4):e59. 10.1371/journal.pcbi.0030059
Hopkins AL: Network pharmacology: the next paradigm in drug discovery. Nat Chem Biol 2008, 4(11):682–690. 10.1038/nchembio.118
Hsing M, Byler KG, Cherkasov A: The use of Gene Ontology terms for predicting highly-connected 'hub' nodes in protein-protein interaction networks. BMC Syst Biol 2008, 2: 80. 10.1186/1752-0509-2-80
Lin CY, Chin CH, Wu HH, Chen SH, Ho CW, Ko MT: Hubba: hub objects analyzer--a framework of interactome hubs identification for network biology. Nucleic Acids Res 2008, (36 Web Server):W438–443. 10.1093/nar/gkn257
Chen F, Mackey AJ, Vermunt JK, Roos DS: Assessing performance of orthology detection strategies applied to eukaryotic genomes. PLoS ONE 2007, 2(4):e383. 10.1371/journal.pone.0000383
Rives AW, Galitski T: Modular organization of cellular networks. Proc Natl Acad Sci USA 2003, 100(3):1128–1133. 10.1073/pnas.0237338100
Sharan R, Ulitsky I, Shamir R: Network-based prediction of protein function. Mol Syst Biol 2007, 3: 88. 10.1038/msb4100129
Titz B, Schlesner M, Uetz P: What do we learn from high-throughput protein interaction data? Expert Rev Proteomics 2004, 1(1):111–121. 10.1586/14789450.1.1.111
Chavali AK, Whittemore JD, Eddy JA, Williams KT, Papin JA: Systems analysis of metabolism in the pathogenic trypanosomatid Leishmania major. Mol Syst Biol 2008, 4: 177. 10.1038/msb.2008.15
Naula C, Parsons M, Mottram JC: Protein kinases as drug targets in trypanosomes and Leishmania. Biochim Biophys Acta 2005, 1754(1–2):151–159.
Varela MR, Munoz DL, Robledo SM, Kolli BK, Dutta S, Chang KP, Muskus C: Leishmania (Viannia) panamensis: an in vitro assay using the expression of GFP for screening of antileishmanial drug. Exp Parasitol 2009, 122(2):134–139. 10.1016/j.exppara.2009.02.012
Wiese M: A mitogen-activated protein (MAP) kinase homologue of Leishmania mexicana is essential for parasite survival in the infected host. EMBO J 1998, 17(9):2619–2628. 10.1093/emboj/17.9.2619
Wiese M, Gorcke I: Homologues of LMPK, a mitogen-activated protein kinase from Leishmania mexicana, in different Leishmania species. Med Microbiol Immunol 2001, 190(1–2):19–22.
Villa H, Perez-Pertejo Y, Garcia-Estrada C, Reguera RM, Requena JM, Tekwani BL, Balana-Fouce R, Ordonez D: Molecular and functional characterization of adenylate kinase 2 gene from Leishmania donovani. Eur J Biochem 2003, 270(21):4339–4347. 10.1046/j.1432-1033.2003.03826.x
Chen M, Zhai L, Christensen SB, Theander TG, Kharazmi A: Inhibition of fumarate reductase in Leishmania major and L. donovani by chalcones. Antimicrob Agents Chemother 2001, 45(7):2023–2029. 10.1128/AAC.45.7.2023-2029.2001
Kayser O, Chen M, Kharazmi A, Kiderlen AF: Aurones interfere with Leishmania major mitochondrial fumarate reductase. Z Naturforsch [C] 2002, 57(7–8):717–720.
Perez-Victoria JM, Di Pietro A, Barron D, Ravelo AG, Castanys S, Gamarro F: Multidrug resistance phenotype mediated by the P-glycoprotein-like transporter in Leishmania: a search for reversal agents. Curr Drug Targets 2002, 3(4):311–333. 10.2174/1389450023347588
Nes WD, Zhou W, Ganapathy K, Liu J, Vatsyayan R, Chamala S, Hernandez K, Miranda M: Sterol 24-C-methyltransferase: an enzymatic target for the disruption of ergosterol biosynthesis and homeostasis in Cryptococcus neoformans. Arch Biochem Biophys 2009, 481(2):210–218. 10.1016/j.abb.2008.11.003
Goto Y, Bogatzki LY, Bertholet S, Coler RN, Reed SG: Protective immunization against visceral leishmaniasis using Leishmania sterol 24-c-methyltransferase formulated in adjuvant. Vaccine 2007, 25(42):7450–7458. 10.1016/j.vaccine.2007.08.001
Leifso K, Cohen-Freue G, Dogra N, Murray A, McMaster WR: Genomic and proteomic expression analysis of Leishmania promastigote and amastigote life stages: the Leishmania genome is constitutively expressed. Mol Biochem Parasitol 2007, 152(1):35–46. 10.1016/j.molbiopara.2006.11.009
DP thanks Jea-Woon Ryu, who helped to predict the Leishmania interactome. AFF thanks Jorge Zuluaga and Willington Vega for programming assistance. AFF also thanks Robert McMaster for sharing microarray data. This work was supported by Colciencias, contract number 538 and project ID 111549326124 and the CODI-sustainability Program, Programa de Estudio y Control de Enfermedades Tropicales-PECET and Centro Internacional de Desarrollo de Productos-CIDEPRO, Universidad de Antioquia. JHM is supported by NIH Grant P41 RR-01081.
AFF generated the idea, carried out the topology and clustering analysis, and wrote the manuscript. DP, JB, and BCK helped with predicting the interactome and calculating confidence scores. AK helped with the topology and clustering analysis, and JHM and JE reviewed it critically. CM supervised the project, provided biological information about Leishmania, and wrote the manuscript. All the authors have read and approved the final manuscript.
Electronic supplementary material
Additional file 1: Leishmania major interactome in Cytoscape format, with the annotation and topological metrics as Cytoscape attributes. (ZIP 762 KB)
Additional file 2: Table S1: List of targets detected by connectivity and betweenness centrality but not filtered for human homology. (XLS 80 KB)
Additional file 4: Table S3: Cluster IDs from the whole network with overrepresented GO codes and p-values. (XLS 24 KB)
Additional file 5: Table S4: List of hypothetical proteins with predicted biological process derived from the clustering and enrichment analysis. (XLS 48 KB)
About this article
Cite this article
Flórez, A.F., Park, D., Bhak, J. et al. Protein network prediction and topological analysis in Leishmania major as a tool for drug target selection. BMC Bioinformatics 11, 484 (2010). https://doi.org/10.1186/1471-2105-11-484
- Gene Ontology
- Visceral Leishmaniasis
- Cluster Coefficient
- Betweenness Centrality