In silico approach to screen compounds active against parasitic nematodes of major socio-economic importance
© Khanna and Ranganathan; licensee BioMed Central Ltd. 2011
Published: 30 November 2011
Infections due to parasitic nematodes are common causes of morbidity and fatality around the world, especially in developing nations. At present, however, there are only three major classes of drugs for treating human nematode infections. Additionally, the mechanisms of action of these drugs and the reasons for resistance to them are poorly understood. Commercial incentives to design drugs for diseases that are endemic to developing countries are limited; therefore, virtual screening in academic settings can play a vital role in discovering novel drugs useful against neglected diseases. In this study we propose to build a robust machine learning model to classify and screen compounds active against parasitic nematodes.
A set of compounds active against parasitic nematodes was collated from various literature sources including PubChem, while the inactive set was derived from the DrugBank database. The support vector machine (SVM) algorithm was used for model development, and stratified ten-fold cross-validation was used to evaluate the performance of each classifier. The best results were obtained using the radial basis function kernel. The SVM method achieved an accuracy of 81.79% on an independent test set. Using the model developed above, we were able to identify novel compounds with potential anthelmintic activity.
In this study, we present an SVM approach for predicting compounds active against parasitic nematodes, and the results suggest that computational approaches can be effective for antiparasitic drug discovery. Although the accuracy obtained is lower than that previously reported in a similar study, we believe that our model is more robust because we intentionally employed stringent criteria to select the inactive dataset, thus making it more difficult for the model to classify compounds. The method presents an alternative to existing traditional methods and may be useful for predicting novel anthelmintic compounds.
List of successful targets in helminths and the corresponding drug classes known to be active against those targets.
Nicotinic acetylcholine receptor beta 1
62% identity with human NAch receptor beta 2
Glutamate-gated chloride channel
54% similarity with human glutamate receptor
Transferases transferring alkyl or aryl groups
61% identity with human Mu isoform
96% similarity to human tubulin beta
Gamma-aminobutyric acid receptor
At present, however, only a small number of drugs are used to control most worm infections in humans and animals, and only three major classes of anthelmintic drugs are available on the market. Benzimidazoles are broad-spectrum anthelmintics that inhibit β-tubulin, resulting in impaired microtubule formation during cell division. The benzimidazoles have greater affinity for tubulin in helminth cells than for the tubulin found in mammalian cells, as first reported by Friedman and Platzer. They found that fenbendazole and mebendazole were, respectively, 250 and 400 times more potent as inhibitors of colchicine binding to A. suum embryonic tubulin than to mammalian tubulin, and concluded that benzimidazoles clearly exhibit higher affinity for helminth tubulins. However, direct binding studies by Kohler and Bachmann, using mebendazole and intestinal A. suum tubulin, failed to find a significant difference in benzimidazole affinity. The authors surmised that differential pharmacokinetic behaviour of mebendazole could be responsible for the difference in drug susceptibility between host and parasite. Macrocyclic lactones form the second class of anthelmintics, interacting with a range of ion channels including glutamate-gated, γ-aminobutyric acid-gated and acetylcholine-gated chloride channels. Levamisole, pyrantel and morantel belong to the third class and bind to the nicotinic acetylcholine receptors, causing prolonged muscle contraction and spastic paralysis of the parasite. Given the diversity in the chemical structures of these classes, predicting novel anthelmintics is a challenging task.
Nematodes infect the majority of farm animals and consequently present a huge risk to the livestock industry and exacerbate global food shortages. It is therefore not surprising that most anthelmintic drugs were originally developed to treat animal infections and were subsequently approved for human use with little or no modification. However, owing to the disproportionate use of anthelmintics, the livestock industry currently faces a very serious challenge with drug resistance in farm animals [9, 10]. Furthermore, with only a limited number of drugs in use, worm strains are able to develop drug resistance easily. In fact, there have also been reports of resistance to present-day anthelmintic drugs in humans. Hence, there is an urgent need to discover novel, safe and efficacious classes of anthelmintics with new modes of action.
Recent efforts in anthelmintic drug discovery
Keiser and Utzinger recently published an excellent review of the current anthelmintics and the research gaps that need to be addressed in order to discover novel anthelmintic drugs. Kaminsky et al. reported a new class of synthetic anthelmintics, the amino-acetonitrile derivatives (AADs), that are active against a variety of livestock-pathogenic nematode species. The authors reported that the optimized AADs were able to eliminate fourth-stage larvae of H. contortus and T. colubriformis in sheep, and of Cooperia oncophora and Ostertagia ostertagi in cattle, at a single oral dose of 20 mg racemate kg⁻¹. The authors surmised that a unique group of nematode-specific nAChR proteins, linked to the acr-23 gene, is responsible for AAD efficacy. Hu et al. have demonstrated the mechanism of action of a novel anthelmintic drug, tribendimidine, recently approved in P.R. China. They concluded that tribendimidine is an L-subtype nAChR agonist, similar to levamisole and pyrantel. The anthelmintic properties of cyclooctadepsipeptides have also been reported recently in vitro and in vivo [15, 16]. Mefloquine is an antimalarial drug that has been used successfully for the past four decades for the prophylaxis and treatment of malaria. However, recent research revealed promising antischistosomal properties of mefloquine in Schistosoma mansoni- and Schistosoma japonicum-infected mouse models [17, 18]. Marrero-Ponce et al. introduced a novel approach for the in silico design of new anthelmintic drugs, using linear discriminant analysis to obtain a quantitative model that discriminates anthelmintic drug-like compounds from non-anthelmintic compounds. The developed model correctly classified 88.18% of the compounds in an external test set. The model was then used for virtual screening, and several compounds from the Merck Index and Negwer's handbook were identified by the model as anthelmintic. Subsequently, in vivo tests were carried out to validate the predictions.
Overview of the ligand-based virtual screening methods
Antiparasitic drugs have historically been discovered by experimental screening against intact parasites, but owing to the enormity of the task and the availability of better computational facilities there has been a shift towards computational screening. Computational screening (also known as virtual screening) has an inherent advantage over traditional and even experimental high-throughput screening (HTS) because of its massively parallel processing ability; millions of compounds can be tested per week. Virtual screening (VS) has been widely used to discover new leads by computationally identifying compounds with a higher probability of strong binding affinity to the target protein. Successful studies have led to the identification of molecules either resembling the native ligands of a particular target or novel compounds [20, 21]. VS methods can be classified into structure-based and ligand-based approaches, depending on the amount of structural and bioactivity data available. If the 3D structure of the receptor is known, a structure-based VS method such as high-throughput docking can be used, but where information on the receptor is scant, ligand-based methods such as similarity searching and machine learning techniques are commonly used. Docking involves the complex optimization task of finding the most favourable 3D binding conformation of the ligand to the receptor molecule. Being computationally intensive, docking is not suitable for very large virtual screening experiments. Ligand-based methods, on the other hand, are popular because they are computationally inexpensive and easy to use. Furthermore, the assumption that structurally similar molecules are more likely to exhibit similar biological activity than dissimilar or less similar molecules is generally valid. Thus, ligand-based methods are playing an increasingly important role at the beginning of drug discovery projects, especially where little 3D information is available for the receptor.
Particularly interesting are machine learning based approaches such as neural networks, genetic algorithms and support vector machines (SVM). SVM is a powerful classification technique that has found numerous applications in chemistry, such as drug design, quantitative structure-property prediction and chemical data mining. Many past studies have shown SVM to be one of the best methods for correctly classifying molecules [24–26]. Zernov et al. used SVM and neural networks to predict drug-likeness and agrochemical-likeness for large compound collections. They showed that, for both kinds of data, SVM outperformed all neural networks under the same training conditions. Warmuth et al. investigated a large collection of compounds to find those that bind to the target of interest in as few iterations of biochemical testing as possible. The authors compared various search strategies, including the maximum margin hyperplane generated by SVM, and concluded that the strategies based on SVM clearly outperform the simpler ones. Similarly, Burbidge et al. carried out a comparative study on predicting the inhibition of dihydrofolate reductase by pyrimidines using SVM, artificial neural networks (ANN) and decision trees. They found that SVM outperformed the other methods, except for a manually capacity-controlled ANN, which required significantly longer training time. Nonetheless, ligand-based VS remains an unproven approach in the discovery of antiparasitic medicines.
In this investigation, we have developed an in silico classification model using SVM to predict potential anthelmintic leads targeted towards parasitic nematodes. Our model has an estimated accuracy of ~82.0% for the test dataset. We have applied this model to a large public database to predict novel anthelmintic compounds and identified a set of 45 compounds, of which six are promising as potential therapeutic agents.
Preparation of the dataset
Composition of the datasets used in this study.
Prediction set (from ChEMBL)
Descriptor calculation and selection
The determination of relevant features is an important step in any machine learning process. Moreover, with hundreds of descriptors available, it is essential to choose the best subset because many descriptors are noisy and some are irrelevant to the target activity. Feature selection is an effective way to remove noisy or irrelevant descriptors and to reduce the dimensionality of the feature space in order to avoid overfitting. This leads to simple and robust computational models with improved prediction accuracy.
There are two main approaches to feature selection in a supervised learning context. The first is the filter approach, which selects the best subset of features independently of the learning algorithm, using ad hoc criteria. Filter methods are fast and can be easily implemented; however, there is no guarantee that the best subset of descriptors has been selected. The second is the wrapper approach, which uses the performance of a predetermined learning algorithm as the evaluation criterion for selecting the optimum subset of features.
List of final 14 descriptors used in this analysis.
The heat of formation (kcal/mol)
The energy (eV) of the Highest Occupied Molecular Orbital
Water accessible surface area of all atoms with positive partial charge
Water accessible surface area of all atoms with negative partial charge
Water accessible surface area of all polar
Electrostatic component of the potential energy.
Kier molecular flexibility index
Log of the aqueous solubility (mol/L).
The square root of the third largest eigenvalue of the covariance matrix of the atomic coordinates.
Lowest hydrophilic energy
H-bond donor capacity
The penalty constant C serves as a regularization parameter and represents the trade-off between minimizing the training set error and maximizing the margin. A small C results in a larger number of support vectors, and a large C in fewer. If a very small C value is used, almost all the samples influence the decision boundary equally, regardless of their position, and as a result virtually all the samples become support vectors. On the other hand, a large C may cause overfitting.
Two parameters need to be optimized for an SVM classifier with an RBF kernel: γ, which determines the capacity of the RBF kernel, and the regularization parameter C. To optimize C and γ, we carried out an extensive grid search to build accurate models. The resulting optimized parameters were C = 1.4 and γ = 0.43.
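The role of γ and the grid search over (C, γ) can be illustrated with a minimal sketch. The kernel below is the standard RBF form, K(x, y) = exp(-γ‖x - y‖²). The grid values and the score_fn callback are hypothetical placeholders, since the text reports only the optimized values; in practice the callback would return a cross-validated accuracy for an SVM trained with the given pair.

```python
import math
from itertools import product


def rbf_kernel(x, y, gamma):
    """Standard RBF kernel: exp(-gamma * squared Euclidean distance)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def grid_search(score_fn, c_values, gamma_values):
    """Exhaustively evaluate every (C, gamma) pair.

    score_fn(C, gamma) is a hypothetical callback (an assumption, not part
    of the study) that returns a score to maximize, e.g. CV accuracy.
    Returns (best_score, best_C, best_gamma).
    """
    best = None
    for c, gamma in product(c_values, gamma_values):
        score = score_fn(c, gamma)
        if best is None or score > best[0]:
            best = (score, c, gamma)
    return best


if __name__ == "__main__":
    # Toy score surface that peaks at the values reported in the text;
    # it stands in for a real cross-validation run.
    def toy_score(c, gamma):
        return -((c - 1.4) ** 2 + (gamma - 0.43) ** 2)

    print(grid_search(toy_score, [0.1, 1.4, 10.0], [0.01, 0.43, 1.0]))
```

A larger γ makes the kernel value fall off faster with distance, so each training compound influences a smaller neighbourhood of descriptor space.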
The prediction accuracy of the models developed was tested using the ten-fold cross-validation technique. In ten-fold cross-validation, the dataset is split into ten subsets of equal proportions. One of the subsets is used as the test set while the rest are used for training the classifier, and the trained classifier is then tested on the held-out subset. This is repeated ten times with a different subset used for testing each time, thus ensuring that every compound is used in prediction exactly once.
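The fold construction described above can be sketched as follows. This is a minimal illustration and not the software used in the study: the round-robin assignment, the function name and the fixed seed are assumptions chosen to keep the class ratio roughly constant in every fold (the stratified variant).

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of k stratified folds.

    Indices of each class are shuffled and dealt round-robin into the
    folds, so every fold keeps approximately the overall class ratio.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for fold in folds[:i] + folds[i + 1:] for j in fold)
        yield train, test


if __name__ == "__main__":
    # Made-up labels: 30 actives (1) and 70 inactives (0).
    labels = [1] * 30 + [0] * 70
    for train, test in stratified_k_fold(labels):
        print(len(train), len(test), sum(labels[i] for i in test))
```

Each compound appears in exactly one test fold, which matches the requirement that every compound is predicted once.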
The prediction results from SVM were evaluated for the test dataset using the following statistical measures.
TP, true positive – the number of correctly classified active compounds.
TN, true negative – the number of correctly classified non-active compounds.
FP, false positive – the number of non-active compounds incorrectly classified as active.
FN, false negative – the number of active compounds incorrectly classified as non-active.
Using the variables above, a series of metrics was computed: sensitivity (SN), specificity (SP), balanced accuracy (BA), F-measure and Matthews correlation coefficient (MCC).
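The sketch below computes these measures from the four counts using their standard textbook definitions; the equations themselves are not reproduced in this text, so the exact forms used by the authors are an assumption. The function name, the returned keys and the counts in the usage example are illustrative only and are not results from this study.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, tn, fp, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    sn = _ratio(tp, tp + fn)                    # sensitivity (recall)
    sp = _ratio(tn, tn + fp)                    # specificity
    precision = _ratio(tp, tp + fp)
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "ACC": _ratio(tp + tn, tp + tn + fp + fn),
        "SN": sn,
        "SP": sp,
        "BA": (sn + sp) / 2,                    # balanced accuracy
        "F": _ratio(2 * precision * sn, precision + sn),
        "MCC": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


if __name__ == "__main__":
    # Made-up counts for illustration.
    for name, value in classification_metrics(tp=40, tn=45, fp=5, fn=10).items():
        print(f"{name}: {value:.3f}")
```

Balanced accuracy and MCC are less sensitive to class imbalance than plain accuracy, which is presumably why they are reported alongside it.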
Results and discussion
The main aim of this study was to classify and predict novel compounds active against parasitic nematodes. A total of 333 molecular descriptors were initially calculated using MOE. After removing insignificant attributes (standard deviation ≤ 0.3) and applying a correlation test with a cutoff value of 0.8, we were able to reduce the total number of attributes to 113. Subsequently, the stepwise discriminant analysis (SDA) algorithm was applied, and finally a set of 14 descriptors was selected for the development of the classification model (details in the Methods section).
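The two filtering steps (low-variance removal followed by a correlation cutoff) can be sketched as below. Several details are assumptions, because the text gives only the thresholds: the population standard deviation is used, the cutoff is applied to the absolute Pearson correlation, and of two correlated descriptors the one encountered first is kept. The subsequent SDA step is not shown.

```python
import math
import statistics


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def filter_descriptors(columns, min_std=0.3, max_corr=0.8):
    """Return the names of descriptors that survive both filters.

    columns maps a descriptor name to its list of values across compounds.
    Step 1 drops descriptors with standard deviation <= min_std.
    Step 2 greedily drops any descriptor whose absolute correlation with
    an already kept descriptor exceeds max_corr.
    """
    varied = [
        name for name, values in columns.items()
        if statistics.pstdev(values) > min_std
    ]
    kept = []
    for name in varied:
        if all(abs(pearson(columns[name], columns[k])) <= max_corr for k in kept):
            kept.append(name)
    return kept


if __name__ == "__main__":
    # Hypothetical descriptor columns, not values from the study.
    toy = {
        "a": [1, 2, 3, 4, 5],
        "b": [2, 4, 6, 8, 10.5],      # nearly collinear with "a"
        "c": [5, 1, 4, 2, 3],
        "d": [1, 1, 1, 1, 1.1],       # almost constant
    }
    print(filter_descriptors(toy))
```

Because the first filter guarantees a non-zero variance for every surviving column, the correlation in the second step is always well defined.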
Performance measure of SVM classifier in training and test dataset.
Machine learning systems such as this could clearly reduce the cost of the experimental methods involved in the drug discovery pipeline. As the SVM algorithm has been applied effectively to various classification problems, we investigated the utility of the SVM approach for the prediction of potential anthelmintic lead compounds. The accuracy of a model on the training dataset may indicate its effectiveness; however, it may not accurately show how the model will perform on novel compounds. Therefore, it is critical to test the model on an independent dataset not used in training. In our case, we trained and optimized the SVM classifier separately using the entire training set and evaluated the model on the test set. As shown in Table 4, the SVM model obtained an accuracy of 81.79% for the test set. On careful examination of our prediction results, we find that many false positives are structurally quite similar to compounds in the active set, which, given our stringent threshold values, may explain the lower accuracy figure for the test set. Further, we also note that a few false negatives lie at the borderline and are thus classified as inactive by our model. To the best of our knowledge, there are not many reported studies on the prediction of anthelmintic compounds; we were therefore able to compare our results with only one study, and we find that our results are comparable to it. Marrero-Ponce et al. used linear discriminant analysis to discriminate anthelmintic drug-like compounds from non-anthelmintic compounds. The authors reported an accuracy of around 90.4% in the training set and 88.2% in the test set, which is slightly higher than ours. However, we believe our model is more robust because our criterion for selecting inactive compounds was quite stringent.
We selected molecules within a Tanimoto similarity range of 0.25 to 0.75 of the compounds present in the active set, which makes them more difficult to classify than randomly chosen molecules. The idea was to build a robust model that can separate compounds into groups even when they are structurally similar. Further, we surmise that since DrugBank covers most of the FDA-approved drugs, the inclusion of DrugBank compounds in our inactive dataset would allow us to navigate to unexplored regions of drug-like chemical space.
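The similarity window used to pick inactive compounds can be illustrated with a small sketch. Fingerprints are represented here as sets of on-bit indices. How the range was applied is not fully specified in the text, so the rule below (the maximum similarity of a candidate to any active must fall between 0.25 and 0.75) is an assumption, as are the function names and the toy fingerprints.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient: shared on-bits divided by total distinct on-bits."""
    a, b = set(fp_a), set(fp_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def select_inactives(candidates, actives, low=0.25, high=0.75):
    """Keep candidates whose highest similarity to any active is in [low, high].

    candidates maps a compound name to its fingerprint (set of on-bits);
    actives is a list of fingerprints for the active set.
    """
    selected = []
    for name, fingerprint in candidates.items():
        best = max((tanimoto(fingerprint, act) for act in actives), default=0.0)
        if low <= best <= high:
            selected.append(name)
    return selected


if __name__ == "__main__":
    # Hypothetical fingerprints, for illustration only.
    actives = [{1, 2, 3, 4}]
    candidates = {
        "near_duplicate": {1, 2, 3, 4},   # too similar to an active
        "moderate": {1, 2, 5, 6},         # inside the window
        "unrelated": {7, 8, 9},           # too dissimilar
    }
    print(select_inactives(candidates, actives))
```

Excluding both near-duplicates and unrelated structures leaves inactives that resemble the actives only moderately, which is what makes the classification task harder than with a randomly chosen inactive set.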
The number of unique scaffolds found in active and inactive sets along with the percentage relative to the dataset size.
Size of the dataset
Percentage (relative to dataset size)
We were able to compile an extensive dataset of anthelmintic compounds for the development and validation of a support vector machine model. We thoroughly tested the SVM approach for identifying potential compounds with anthelmintic activity. From our results we conclude that the SVM method is well suited to the prediction of anthelmintic (or antiparasitic) compounds. We were also able to identify a number of interesting compounds with potential activity against parasitic nematodes; however, experimental validation of the predicted compounds is needed.
Conflict of interest
We thank Dr. Dominique Gorse for useful discussions during this study. VK is grateful to Macquarie University for the award of MQRES research scholarship.
This article has been published as part of BMC Bioinformatics Volume 12 Supplement 13, 2011: Tenth International Conference on Bioinformatics – First ISCB Asia Joint Conference 2011 (InCoB/ISCB-Asia 2011): Bioinformatics. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2105/12?issue=S13.
- Ranganathan S, Menon R, Gasser RB: Advanced in silico analysis of expressed sequence tag (EST) data for parasitic nematodes of major socio-economic importance-fundamental insights toward biotechnological outcomes. Biotechnol Adv 2009, 27(4):439–448. 10.1016/j.biotechadv.2009.03.005
- Lacey E: Mode of action of benzimidazoles. Parasitol Today 1990, 6(4):112–115. 10.1016/0169-4758(90)90227-U
- Friedman PA, Platzer EG: Interaction of anthelmintic benzimidazoles with Ascaris suum embryonic tubulin. Biochim Biophys Acta 1980, 630(2):271–278. 10.1016/0304-4165(80)90431-6
- Kohler P, Bachmann R: Intestinal tubulin as possible target for the chemotherapeutic action of mebendazole in parasitic nematodes. Mol Biochem Parasitol 1981, 4(5–6):325–336. 10.1016/0166-6851(81)90064-5
- Cully DF, Vassilatis DK, Liu KK, Paress PS, Van der Ploeg LH, Schaeffer JM, Arena JP: Cloning of an avermectin-sensitive glutamate-gated chloride channel from Caenorhabditis elegans. Nature 1994, 371(6499):707–711. 10.1038/371707a0
- Holden-Dye L, Walker RJ: Avermectin and avermectin derivatives are antagonists at the 4-aminobutyric acid (GABA) receptor on the somatic muscle cells of Ascaris; is this the site of anthelmintic action? Parasitology 1990, 101(Pt 2):265–271.
- Bokisch AJ, Walker RJ: The action of Avermectin (MK 936) on identified central neurones from Helix and its interaction with acetylcholine and gamma-aminobutyric acid (GABA) responses. Comp Biochem Physiol C 1986, 84(1):119–125. 10.1016/0742-8413(86)90176-3
- Martin RJ, Verma S, Levandoski M, Clark CL, Qian H, Stewart M, Robertson AP: Drug resistance and neurotransmitter receptors of nematodes: recent studies on the mode of action of levamisole. Parasitology 2005, 131(Suppl):S71–84.
- Sutherland IA, Leathwick DM: Anthelmintic resistance in nematode parasites of cattle: a global issue? Trends Parasitol 2011, 27(4):176–181. 10.1016/j.pt.2010.11.008
- James CE, Hudson AL, Davey MW: Drug resistance mechanisms in helminths: is it survival of the fittest? Trends in Parasitology 2009, 25(7):328–335. 10.1016/j.pt.2009.04.004
- Geerts S, Gryseels B: Drug resistance in human helminths: current situation and lessons from livestock. Clin Microbiol Rev 2000, 13(2):207–222. 10.1128/CMR.13.2.207-222.2000
- Keiser J, Utzinger J, Xiao-Nong RBRO, Jürg U: The drugs we have and the drugs we need against major helminth infections. In Advances in Parasitology. Volume 73. Academic Press; 2010:197–230.
- Kaminsky R, Ducray P, Jung M, Clover R, Rufener L, Bouvier J, Weber SS, Wenger A, Wieland-Berghausen S, Goebel T, et al.: A new class of anthelmintics effective against drug-resistant nematodes. Nature 2008, 452(7184):176–180. 10.1038/nature06722
- Hu Y, Xiao SH, Aroian RV: The new anthelmintic tribendimidine is an L-type (levamisole and pyrantel) nicotinic acetylcholine receptor agonist. PLoS Negl Trop Dis 2009, 3(8):e499. 10.1371/journal.pntd.0000499
- Harder A, von Samson-Himmelstjerna G: Cyclooctadepsipeptides--a new class of anthelmintically active compounds. Parasitol Res 2002, 88(6):481–488. 10.1007/s00436-002-0619-2
- Harder A, Schmitt-Wrede HP, Krucken J, Marinovski P, Wunderlich F, Willson J, Amliwala K, Holden-Dye L, Walker R: Cyclooctadepsipeptides--an anthelmintically active class of compounds exhibiting a novel mode of action. Int J Antimicrob Agents 2003, 22(3):318–331. 10.1016/S0924-8579(03)00219-X
- Keiser J, Chollet J, Xiao S-H, Mei J-Y, Jiao P-Y, Utzinger Jr, Tanner M: Mefloquine - an aminoalcohol with promising antischistosomal properties in mice. PLoS Negl Trop Dis 2009, 3(1):e350. 10.1371/journal.pntd.0000350
- Xiao S-H, Keiser J, Chen M-G, Tanner M, Utzinger J, Xiao-Nong RBRO, Jürg U: Research and development of antischistosomal drugs in the people's republic of China: a 60-year review. In Advances in Parasitology. Volume 73. Academic Press; 2010:231–295.
- Marrero-Ponce Y, Castillo-Garit JA, Olazabal E, Serrano HS, Morales A, Castanedo N, Ibarra-Velarde F, Huesca-Guillen A, Jorge E, del Valle A, et al.: TOMOCOMD-CARDD, a novel approach for computer-aided 'rational' drug design: I. Theoretical and experimental assessment of a promising method for computational screening and in silico design of new anthelmintic compounds. J Comput Aided Mol Des 2004, 18(10):615–634. 10.1007/s10822-004-5171-y
- Reddy S, Pati P, Kumar P, Pradeep HN, Sastry N: Virtual screening in drug discovery -- a computational perspective. Current protein & peptide science 2007, 8(4):329–351.
- Freitas RF, Oprea TI, Montanari CA: 2D QSAR and similarity studies on cruzain inhibitors aimed at improving selectivity over cathepsin L. Bioorganic & Medicinal Chemistry 2008, 16(2):838–853. 10.1016/j.bmc.2007.10.048
- Sousa Sr, Fernandes P, Ramos M: Protein-ligand docking: Current status and future challenges. Proteins 2006, 65(1):15–26. 10.1002/prot.21082
- Geppert H, Vogt M, Bajorath J: Current trends in ligand-based virtual screening: molecular representations, data mining methods, new application areas, and performance evaluation. Journal of Chemical Information and Modeling 2010, 50(2):205–216. 10.1021/ci900419k
- Zernov VV, Balakin KV, Ivaschenko AA, Savchuk NP, Pletnev IV: Drug discovery using support vector machines. The case studies of drug-likeness, agrochemical-likeness, and enzyme inhibition predictions. Journal of Chemical Information and Computer Sciences 2003, 43(6):2048–2056. 10.1021/ci0340916
- Warmuth MK, Liao J, Ratsch G, Mathieson M, Putta S, Lemmen C: Active learning with support vector machines in the drug discovery process. Journal of Chemical Information and Computer Sciences 2003, 43(2):667–673. 10.1021/ci025620t
- Burbidge R, Trotter M, Buxton B, Holden S: Drug design by machine learning: support vector machines for pharmaceutical data analysis. Computers & Chemistry 2001, 26(1):5–14.
- Woods D, Williams T: The challenges of developing novel antiparasitic drugs. Invertebrate Neuroscience 2007, 7(4):245–250. 10.1007/s10158-007-0055-1
- Tropsha A: Best practices for QSAR model development, validation, and exploitation. Molecular Informatics 2010, 29(6–7):476–488. 10.1002/minf.201000061
- Wang Y, Xiao J, Suzek TO, Zhang J, Wang J, Bryant SH: PubChem: a public information system for analyzing bioactivities of small molecules. Nucleic Acids Research 2009, 37(suppl 2):W623-W633.
- Holden-Dye L, Walker RJ: Anthelmintic drugs. WormBook 2007, 1–13.
- Mayer AM, Hamann MT: Marine pharmacology in 2001--2002: marine compounds with anthelmintic, antibacterial, anticoagulant, antidiabetic, antifungal, anti-inflammatory, antimalarial, antiplatelet, antiprotozoal, antituberculosis, and antiviral activities; affecting the cardiovascular, immune and nervous systems and other miscellaneous mechanisms of action. Comp Biochem Physiol C Toxicol Pharmacol 2005, 140(3–4):265–286. 10.1016/j.cca.2005.04.004
- Mayer AM, Rodriguez AD, Berlinck RG, Hamann MT: Marine pharmacology in 2003–4: marine compounds with anthelmintic antibacterial, anticoagulant, antifungal, anti-inflammatory, antimalarial, antiplatelet, antiprotozoal, antituberculosis, and antiviral activities; affecting the cardiovascular, immune and nervous systems, and other miscellaneous mechanisms of action. Comp Biochem Physiol C Toxicol Pharmacol 2007, 145(4):553–581. 10.1016/j.cbpc.2007.01.015
- Mayer AM, Rodriguez AD, Berlinck RG, Hamann MT: Marine pharmacology in 2005–6: Marine compounds with anthelmintic, antibacterial, anticoagulant, antifungal, anti-inflammatory, antimalarial, antiprotozoal, antituberculosis, and antiviral activities; affecting the cardiovascular, immune and nervous systems, and other miscellaneous mechanisms of action. Biochim Biophys Acta 2009, 1790(5):283–308. 10.1016/j.bbagen.2009.03.011
- Wishart D, Knox C, Guo AC, Cheng D, Shrivastava S, Tzur D, Gautam B, Hassanali M: DrugBank: a knowledgebase for drugs, drug actions and drug targets. Nucleic Acids Research 2008, 36(Database issue):D901–906.
- Trotter MWB, Holden SB: Support vector machines for ADME property classification. Qsar & Combinatorial Science 2003, 22(5):533–548. 10.1002/qsar.200310006
- Overington J: ChEMBL. An interview with John Overington, team leader, chemogenomics at the European Bioinformatics Institute Outstation of the European Molecular Biology Laboratory (EMBL-EBI). Interview by Wendy A. Warr. J Comput Aided Mol Des 2009, 23(4):195–198. 10.1007/s10822-009-9260-9
- Pipeline Pilot [http://accelrys.com/]
- Bemis GW, Murcko MA: The properties of known drugs. 1. Molecular frameworks. Journal of Medicinal Chemistry 1996, 39(15):2887–2893. 10.1021/jm9602928
- Dutta D, Guha R, Wild D, Chen T: Ensemble feature selection: consistent descriptor subsets for multiple QSAR models. Journal of Chemical Information and Modeling 2007, 47(3):989–997. 10.1021/ci600563w
- Duch W: Filter Methods. In Feature Extraction: Foundations and Applications. Volume 207. Edited by: Guyon I, Gunn S, Nikravesh M, Zadeh L. Berlin, Germany: Springer; 2006.
- Marchiori E, Moore J, Soto A, Cecchini R, Vazquez G, Ponzoni I: A wrapper-based feature selection method for ADMET prediction using evolutionary computing. In Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. Volume 4973. Springer Berlin /Heidelberg; 2008:188–199. 10.1007/978-3-540-78757-0_17
- Jennrich RI: Stepwise discriminant analysis. In Statistical methods for digital computers. Volume 3. Edited by: Enslein K, Ralston A, Wilf HS. New York: Wiley; 1977:76–96.
- Tanagra: free data mining software [http://eric.univ-lyon2.fr/~ricco/tanagra/en/tanagra.html]
- Cortes C, Vapnik V: Support-vector networks. Machine Learning 1995, 20(3):273–297.
- Jorissen RN, Gilson MK: Virtual screening of molecular databases using a support vector machine. Journal of Chemical Information and Modeling 2005, 45(3):549–561. 10.1021/ci049641u
- Liew CY, Ma XH, Liu X, Yap CW: SVM model for virtual screening of Lck inhibitors. Journal of Chemical Information and Modeling 2009, 49(4):877–885. 10.1021/ci800387z
- Byvatov E, Fechner U, Sadowski J, Schneider G: Comparison of support vector machine and artificial neural network systems for drug/nondrug classification. J Chem Inf Comput Sci 2003, 43(6):1882–1889. 10.1021/ci0341161
- Ivanciuc O: Applications of support vector machines in chemistry. Reviews in Computational Chemistry 2007, 23: 291–400.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.