An effective docking strategy for virtual screening based on multi-objective optimization algorithm
© Li et al; licensee BioMed Central Ltd. 2009
Received: 23 September 2008
Accepted: 11 February 2009
Published: 11 February 2009
Development of a fast and accurate scoring function in virtual screening remains a hot issue in current computer-aided drug research. Different scoring functions focus on diverse aspects of ligand binding, and no single scoring can satisfy the peculiarities of each target system. Therefore, the idea of a consensus score strategy was put forward. Integrating several scoring functions, consensus score re-assesses the docked conformations using a primary scoring function. However, it is not really robust and efficient from the perspective of optimization. Furthermore, to date, the majority of available methods are still based on single objective optimization design.
In this paper, two multi-objective optimization methods, called MOSFOM, were developed for virtual screening, which simultaneously consider both the energy score and the contact score. Results suggest that MOSFOM can effectively enhance enrichment and performance compared with a single score. For three different kinds of binding sites, MOSFOM displays an excellent ability to differentiate active compounds through energy and shape complementarity. EFMOGA performed particularly well in the top 2% of database for all three cases, whereas MOEA_Nrg and MOEA_Cnt performed better than the corresponding individual scoring functions if the appropriate type of binding site was selected.
The multi-objective optimization method was successfully applied in virtual screening with two different scoring functions that can yield reasonable binding poses and can furthermore, be ranked with the potentially compromised conformations of each compound, abandoning those conformations that can not satisfy overall objective functions.
With the thriving development and confirmative significance of computational chemistry in drug discovery, more and more medicinal chemists and pharmacologists are using computational methods in their drug discovery research[1, 2], and numerous drugs developed based on the clues provided by computations (modeling and simulation) have entered clinical trials or have been launched into the market already. For the computational chemist, an attractive goal is to develop computer programs capable of automatically evaluating large-scale chemical libraries (databases). These computational methods are referred to as virtual screening (VS). In general, two strategies have been employed in virtual screening: (1), using pharmacophore-based database searching (PBDS) methods to identify potential hits from chemical libraries, mostly in the cases where three-dimensional (3D) structures of the targets are unknown; and (2), using molecular docking approaches to screen the libraries in cases where the 3D structures of the targets are available[4, 5]. Because more and more 3D structures of drug target proteins are available, VS with molecular docking approaches has become promising, as demonstrated by numerous recent examples[2, 6–10].
The core steps of structure-based virtual screening (SBVS) are docking and scoring. Since Kuntz et al. published the first docking algorithm DOCK in 1982, numerous docking programs have been developed during the past two decades [12–25]. Several comprehensive reviews of the advances of docking algorithms and applications have been published [26–30]. Scoring (ranking) the compounds retrieved from a database is performed simultaneously with the docking simulation. Molecular docking is a typical optimization problem, for it is difficult to obtain the global optimum solution. As the fitness during the optimization process, scoring function should be fast and accurate enough, allowing simultaneous ranking of the retrieved poses in the optimization process. Based on this idea, several scoring functions have been developed [31–33]. Unfortunately, there is no scoring function developed so far that can reliably and consistently predict a ligand-protein binding mode and the binding affinity at the same time[31, 32, 34]. Therefore, heuristic docking and consensus score strategies are often used in virtual screening [34–36].
Since a huge number of compounds in a database have to be thoroughly tested in the virtual screening process, several crucial issues have to be addressed. One is the computational cost; only docking programs capable of docking a flexible ligand within a reasonable time scale are acceptable for virtual screening. The other one is the ability to discriminate between true actives and inactive compounds; only those docking approaches able to distinguish the active molecules rapidly and accurately, are suitable for virtual screening applications in practice. Although the consensus score strategy has demonstrated an advantage over single scores, it is actually based on the results of single scoring optimization. This means that consensus scoring only re-scores a limited molecules or conformations (generally 30 or more), thereby inevitably losing a number of true positives[33, 34]. To some extent, consensus scoring seems to be far-fetched and artificial .
Most conformational optimization methods in docking program can only deal with a single objective, such as binding energy, shape complementarity, or chemical complementarity. However, real-world problems normally involve multiple objectives (possibly conflicting ones) or optimization criteria, which should be satisfied simultaneously, and suitable solutions to the overall problem cannot be found by using individual optimization algorithms with single objectives. For example, an optimization solution for the binding affinity (energy) between a ligand and a receptor is usually not the optimization solution for other criteria (e.g. shape or chemical complementarity, etc.). Similar problems in combinatorial library design and structural superposition of three-dimensional molecules have been reported[39, 40]. Thus, there is a need for an optimization algorithm compromising several objectives, which may result in more reasonable and robust binding modes between ligands and receptors. In fact, it is a problem of multi-objective optimization (MO), which tends to find a set of alternative good compromises, generally known as pareto-optimal solutions. These solutions are optimal because no other solutions in the search space superior to them when all objectives are considered. Then, the 'optimum' is chosen by the design which fits better in a certain application. There are more than twenty mathematical multi-objective optimization techniques[43, 44]. However, due to their inherent parallelism, evolutionary algorithms (EAs) and genetic algorithms (GAs) are still the top priority in terms of finding multiple pareto-optimal solutions for multi-objective optimization problems.
In this paper, two sets of MO methods, denoted MOSFOM (Multi-Objective Scoring Function Optimization Methodology), were adopted for the binding conformation search of a small molecule within the binding site of a protein using two scoring functions as the objectives. Different from consensus score, MOSFOM does not re-score or re-rank the candidates from the primary virtual screening with one or several other scoring functions, but scores all the molecules in a chemical database with two or more scoring functions simultaneously during the binding pose optimization. Testing results indicate that MOSFOM, which is able to enhance hit rates and greatly reduces the false-positive rate, is more robust and reasonable than the consensus strategy as an alternate tool for large-scale library virtual screening. Here, MOSFOM emphasizes a new strategy to obtain the most reasonable binding conformation and increase hit rates with several scoring functions rather than to accurately predict the binding free energy or the combination of several scoring functions. Consequently, MOSFOM can be used in the prioritization of ligands in high-throughput virtual screening.
Preparation of target proteins
Thrombin, the estrogen receptor alpha, and cyclooxygenase-2 (COX-2) have been used as target proteins for testing the newly developed docking algorithms in this study. The coordinates of the X-ray crystal structures of these three proteins were retrieved from the Protein Data Bank (PDB), including thrombin in complex with Mqpa at 2.2 Å resolution (PDB entry 1ETR), estrogen receptor(ER) in complex with 4-hydroxytamoxifen at 1.90 Å resolution (PDB entry 3ERT), and cyclooxygenase-2(COX-2) in complex with Sc-558 at 3.0 Å resolution (PDB entry 6COX). All water molecules were removed from the protein structures. After extraction of bound ligands, all hydrogen atoms and the Kollman all-atom charges were assigned to the proteins using the BIOPOLYMER module of Sybyl v6.8 (Tripos Associates, Inc. St. Louis, MO). Finally, for each protein target, the binding site was defined as the residues around the bound ligand within 6.5 Å. Gasteiger and Marsili charges were assigned to the extracted ligand of each protein.
Preparation of compounds libraries
Number of the active compounds for each target used in this study
Range RB (Mean)
Theoretical Maximum of Enrich Factor
The energy score and contact score of DOCK [52–54] were used in this work. Energy score based on the AMBER force field was composed of steric and electrostatic terms. ε(r) = 4r was used for the coefficient of the dielectric for the Coulomb potential, and Lennard-Jones 6–12 was used for Van der Waals (VDW) potential. Contact score is a summation of the heavy atom contacts between the ligand and the receptor, which provides a simple assessment of shape complementarity; if two atoms approach close enough to bump into one another, then the interaction can be penalized by a certain amount. In this study, a 4.5-Å contact distance cutoff was used, with 50 penalized for each clash.
Calculation of the enrichment factor
where Hitss is the number of active compounds in the sampled subset, Hitst is the total number of active compounds in the database and N is the number of compounds. In general, the enrichment factor is against random screening; it is evident that the maximum enrichment is determined by the total number of active compounds and the total number of molecules in the database. For instance, there are 695 active compounds among the total 10695(10000+695) molecules in the database for the COX-2 case, i.e. the achievable maximum is 10695/695 ≈ 15. If 5% of active compounds were found among the top 1% of the database, then the enrichment factor would be fivefold over random (EF = 5) at the 1% of the database.
There are three criteria to evaluate the effectiveness of a docking program as indicated by Wei et al. : the value, the location of the vertex, and the percentage of active compounds retrieved, which represent the ability to find active compound of the docking program and scoring function.
A steady-state MOEA based on the ε-dominance concept is a pragmatic and fast multi-objective evolutionary algorithm for finding well-spread pareto-optimal solutions. An EA population P(t) and an archival population E(t) (t is the iteration counter) were used as two co-evolving populations in ε-MOEA. After initialization, two solutions from P(t) and E(t) were chosen for mating. Then, each of these offspring solutions was compared with the archive and the EA population for possible inclusion. For the case of j objectives, the search space and objective space were divided into hyper-boxes (a number of grids, each having the size ε j in the j th objective) to ensure that a hyper-box could be occupied by only one solution through comparing with an identification array of the archive, which guarantees the diversity of the archive. The total number of pareto-optimal solutions, i.e. the final size of suitable solutions, can be controlled approximately by adjusting the ε j value (see refs  and  for more details).
y is the objective vector, which consists of energy score f1(x) and contact score f2(x), respectively. Of course, more scoring function can be used in this method, but no more scoring function source can be accessed. Here the scoring functions of DOCK were just used to deal with this problem. X is denoted as the decision space, and Y is called the objective space.
Selection of the optimum and ranking in ε-MOEA case
As stated above, a set of pareto-optimal solutions were obtained by using ε-MOEA, which simultaneously satisfied energy score and shape complementarity. There are two ways to select an optimal solution from the set of pareto-optimal solutions: MOEA_Nrg or 'energy score ≻ contact score' with the lowest energy conformation and acceptable shape complementarity; and MOEA_Cnt or 'contact score ≻ energy score' with the best shape complementarity conformation and acceptable energy score. The results of MOEA_Nrg and MOEA_Cnt were all compared with their corresponding individual scoring functions, respectively (see Results)
Briefly, a new fast flexible docking program (GAsDock) [25, 61] was developed using a multi-population genetic algorithm based on information entropy[62, 63]. In comparison with other docking methods, information entropy was employed in the genetic algorithm of GAsDock and the size of the narrowed space was used as the convergence criterion, ensuring that GAsDock can converge rapidly and steadily. A detailed description of the algorithm has been presented elsewhere.
where λ i is the weight of each objective. (λ i ≥ 0 and ). By varying the weights, a set of noninferior solutions was generated. s is a positive real variable, when s → ∞, the minimization problem converges to the maximal f i (x), and the minimum is the pareto solution (see additional file 1: Docking example and proof of method)
A knowledge-based method was adopted to obtain appropriate weights for virtual screening, i.e. a set of conformations of the ligand, which are obtained through adjusting the weights, is compared with the ligand in the X-ray crystal structure, and the weights with minimum RMSD are selected. These are considered to best reflect the test values for the current target system, i.e. the most reasonable solution. During this process, useful information can be acquired simultaneously such as the adaptability of different scoring functions to the current system, that is, which scoring function contributes more to the current system, and which scoring function is more sensitive.
As a result, various weights are obtained for different test systems in virtual screening (Figure 1b). For virtual screening of a database, each system only adopts one set of weights, which makes the comparison more equitable. Different from ε-MOEA, EFMOGA yields one solution corresponding to the weight in the pareto-optimal solutions rather than a set of pareto-optimal solutions. Certainly, compared with single objective optimization, the solution obtained here may be worse than any single objective solution, but more reasonable than any extremum obtained by a single objective method because it has considered multiple-objective functions. During the preparation of this manuscript, Grosdidier et al. have developed a new docking software, EADock, based on a multi-objective optimization algorithm, and the results indicate that EADock can accurately predict binding modes for ligand-protein complexes with two fitness functions.
As mentioned above, our study is not aimed at calculating absolute values for the free energy of binding and for the affinity, but focuses on the ranking of acceptable conformations for the multiple solution space. Because of the different preferences in the selection of pareto-optimal solutions with ε-MOEA, the results of three different approaches will be compared with individual scoring functions, namely one with preferred energy score for ε-MOEA (MOEA_Nrg), the other with preferred contact score for ε-MOEA (MOEA_Cnt), and the third for EFMOGA.
Docking to the buried lipophilic site of COX-2
Docking to the relatively large hydrophobic site of Estrogen Receptor
As an example of a well-understood target with therapeutic relevance whose crystallographic structural data were publicly available, estrogen receptor was chosen. Like COX-2, the binding site of estrogen receptor is a relatively larger, fully-enclosed lipophilic cavity (Figure 2b), which is little opened to solvent, there are acceptor groups at either end that can form hydrogen bonds with ligands, but it is predominantly hydrophobic on the whole[49, 65].
Similar to COX-2, approximately 5.7% of active compounds were retrieved using MOEA_Nrg at the top 2% of the database (Figure 5b, Figure 4b), which was about twofold over that of single energy score. Relative to single contact score, MOEA_Cnt retrieved 9.5% of the active compounds at the top 2% of the database, but only less than 5% of the active compounds were retrieved using single contact score. It is encouraging that EFMOGA represents better performance than energy score and contact score at the top 2% of the database, with 8% of active compounds retrieved.
Docking to the intermediate polarity site of Thrombin
In contrast to COX-2 and estrogen receptor, the binding site of thrombin is an intermediate polarity site, with a large hydrophobic pocket (a smaller proximal and a larger distal pocket) which is a more exposed binding site to solvent, in addition, there is a S1 specificity pocket, which is a narrow and restricted pocket comprising the carboxylate group of Asp189 at the bottom, most thrombin inhibitors form hydrogen bonds with Asp189 and also to Gly216 (Figure 2c). Due to many polar groups in the binding site, it is difficult to distinguish active compounds for DOCK energy score which is most reliable for the apolar active site, unlike those scoring functions with extra consideration of hydrogen bonding interactions such as PMF or FlexX score[18, 34, 65–67].
Performance of MOSFOM and Characteristics of binding site and preferential selection of scoring function
Simple as it is, contact score performed unexpectedly well for all the three systems in our study. Especially at the top 5%, contact score performed better than energy score, possibly arising from molecular diversity of the randomly selected ACD database, which distributes well not only in heavy atoms but also molecular torsions. Contact score can rapidly seek out those whose geometry shape can satisfy the binding site (containing shape and volume), which is especially obvious to those with completely buried and narrow cavity. However, energy score presented a weak ability to distinguish the active compounds from the decoys with shape satisfaction.
It is encouraging that compared with individual score, every method of MOSFOM presented an increase not only in enrichment but also hit rates of the active compounds retrieved in some ways. EFMOGA with different weights in different systems obtained better results than single score at the top 5% (especially at the top 2%) of the database for all three cases in this study (see Results), but gradually faded afterwards. It can be easily understood that those active compounds with high affinity and good shape complementarity were first retrieved, and then those emphasizing particularly on the larger weight in different test system were obtained, consecutively. Therefore, if not knowing which scoring function will work better for the test system in advance, more enrichment can be obtained with EFMOGA in the most interesting section (generally 2%) of the database. It should be noted that EFMOGA, with different weights for different test systems through experimental protein-ligand complex, is more reasonable and different from the simple linear combination (LC) of several scoring functions or ScreenScore through a combination of scoring terms. EFMOGA seems more pertinent and advantageous, and will be a tendency for scoring and ranking to consider different scoring functions focusing on diverse aspects for different systems.
Different from EFMOGA, ε-MOEA selected the preferential solution from a pareto-optimal solutions family (parts of the pareto frontier) through multi-objective optimization using multiple scoring functions (see Methods). With different preferences, MOEA_Nrg and MOEA_Cnt all outperformed their corresponding individual scoring function, i.e., MOEA_Nrg corresponding to energy score and MOEA_Cnt corresponding to contact score. A higher enrichment and hit rates of the active compounds were obtained using MOEA_Nrg than using individual energy score in COX-2 and ER system, but there is an inverse phenomenon for thrombin case. There are several strong hydrogen bond interactions between the ligand and the residues in S1 pocket for thrombin system. If we select the one with lowest energy score from the docked conformations with acceptable shape complementarity, more compounds will have good shape complementarity conformations but are only slightly different in energy score, since the energy score does not pay more additional attention to the H-bond interaction. In this case, it will be more difficult to distinguish the active compounds. On the contrary, MOEA_Cnt performed well for the thrombin and ER case, where an open cavity exists, accommodating enough orientations or conformations of a ligand, therefore, it seems more reasonable that MOEA_Cnt selected the best shape complementarity conformation with differences in energy score. So, if one does know which scoring function will work better for the test system in advance, more enrichment and better performance can be obtained using ε-MOEA through relative preference.
Limitations of MOSFOM, further improvements and prospect in bioinformatics and drug design
More scoring functions, which pay special attention to hydrogen bonding interaction, or chemical complementarity, solvent effect, should be involved in multi-objective optimization, however, there are only energy score and contact score available for us. Although MOSFOM performed better than individual score, there will be more choices for MOFSOM in specific test system if more interactions were considered. We can determine different weights of scoring functions because EFMOGA can be determined to deal with different cases and maybe prefer others (e.g. chemical complementarity or hydrogen bonds) for ε-MOEA if we know which scoring function will work well for specific test system in advance. Therefore, to develop scoring functions with focus on hydrogen bond, icon or solvent effect and to utilize multi-objective optimization to acquire more reasonable conformations is one of our future research interest.
At the same time, ignorance of the impact of protein flexibility is another limitation for MOSFOM, small changes of the receptor flexibility result in larger variety of binding affinities, and docking to a single receptor conformation will significantly reduce the chance of finding the correct pose. Considering protein flexibility to acquire correct protein-ligand binding conformation is crucial for medicinal chemists to find out the structure-activity relationship. We will fulfil this work in the future. Also, computational time of multi-objective optimization is another issue to be solved. There are numerous multi-objective optimization methods, however, they are not suitable for large-scale virtual screening because of the huge time consumption, in this paper, ε-MOEA, which is a steady-state MOEA based on the ε-dominance concept, fulfilled virtual screening quickly within about one minute for one molecule, and EFMOGA, which is a multi-population MOGA based on information entropy, accomplished one docking within 40 seconds using one CPU on SGI origin3800 hardware. However, more improvements are needed to find a set of well-distributed solutions close to the true pareto frontier pragmatically and efficiently.
As stated above, multi-objective optimization approach has been adopted in the library design program MoSELECT[39, 71–73] and quantitative structure activity relationships program MoQSAR. We believe that MOEA or MOGA also can be applied to biological sequence alignment, protein fold recognition, conformational generation and ADME/TOX (absorption, distribution, metabolism, excretion and toxicity) with more and more factors taken into account in bioinformatics and drug design.
Development of a fast and accurate scoring function in virtual screening is still a hot and pending issue in current research, different scoring functions focus on different aspects of ligand binding, and no single scoring can satisfy all the systems well, therefore, consensus score was put forward.[35, 36] With several other scoring functions, consensus score re-assessed those conformations optimized using a primary scoring function, but it is not really robust from the viewpoint of optimization. All of these give rise to another heuristic thinking for us, is it possible and more rational to find a most reasonable conformation in the process of optimization using two or more scoring functions in virtual screening? Since present scoring functions can not satisfy all the cases, multi-objective optimization method can be a good compromise.
In this study, we present two kinds of multi-objective optimization method, called MOSFOM, in virtual screening which simultaneously considers energy score and contact score. A set of pareto-optimal solutions were obtained which can simultaneously satisfy energy and shape complementarity using ε-MOEA method, then through two different preferences, the binding conformations of pareto-optimal solutions with lowest energy scoring or best shape complementarity were ranked among all selected conformations. However, EFMOGA acquires different weights of each scoring function for different systems based on experimental X-ray complex structure, that is, for different system, EFMOGA regards the varying weight as the degree of contribution of individual scoring function, which generally focuses on one aspect for a special type of binding sites, this means the scoring function with bigger weight is more suitable for the binding site. We used 10000 random compounds as the decoys, active compounds selected from MDDR database were randomly merged into the decoys, respectively. To ensure justness of comparison, GAsDock, based on an improved genetic algorithm, was used as a benchmark for single-objective optimization. Results show that MOSFOM can enhance the enrichment and greatly increase the hit rates compared with individual score (see Results). For three different kinds of binding sites, MOSFOM represents excellent ability of distinction of active compounds with energy and shape complementarity. EFMOGA specially performed well at the top 2% of the database, MOEA_Nrg and MOEA_Cnt performed better than individual scoring function if a proper type of binding site was selected.
Multi-objective optimization method was successfully applied in virtual screening with two different scoring functions, which can gain reasonable binding pose, and can be furthermore ranked with those potentially compromised conformations of each compound which abandon those conformations that can not satisfy overall objective functions. By analyzing the characteristics of binding site in advance, the most effective multi-objective optimization method is adopted, which is meaningful for all current scoring functions since they can not suit all cases. More specific scoring functions and protein flexibility will be taken into account in our future work.
The authors thank Prof. K. Deb for the source code of ε-MOEA. We also thank Prof. Rolf Hilgenfeld for careful reading this manuscript, and the referees for helpful suggestions and comments. This work was supported by the Special Fund for Major State Basic Research Project (2004CB518900, 2009CB918501 and 2009CB918502), the National Natural Science Foundation of China (grant 20803022, 10572033, 30672539 and 20721003), the Shanghai Committee of Science and Technology(Grant 07dz22004), and the 863 Hi-Tech Program of China (grant 2007AA02Z304). H.L. was also supported by Knowledge Innovation Program of the Chinese Academy of Sciences (grant SIMM0709QN-09).
- Jorgensen WL: The many roles of computation in drug discovery. Science 2004, 303(5665):1813–1818. 10.1126/science.1096361View ArticlePubMedGoogle Scholar
- Shoichet BK: Virtual screening of chemical libraries. Nature 2004, 432(7019):862–865. 10.1038/nature03197PubMed CentralView ArticlePubMedGoogle Scholar
- Congreve M, Murray CW, Blundell TL: Keynote review: Structural biology and drug discovery. Drug Discov Today 2005, 10(13):895–907. 10.1016/S1359-6446(05)03484-7View ArticlePubMedGoogle Scholar
- Xiong B, Gui CS, Xu XY, Luo C, Chen J, Luo HB, Chen LL, Li GW, Sun T, Yu CY, et al.: A 3D model of SARS_CoV 3CL proteinase and its inhibitors design by virtual screening. Acta Pharmacol Sin 2003, 24(6):497–504.PubMedGoogle Scholar
- Lengauer T, Lemmen C, Rarey M, Zimmermann M: Novel technologies for virtual screening. Drug Discov Today 2004, 9(1):27–34. 10.1016/S1359-6446(04)02939-3View ArticlePubMedGoogle Scholar
- Bohdan Waszkowycz, Perkins TimDJ, Sykes RichardA, Li J: Large-scale virtual screening for discovering leads in the postgenomic era. IBM Systems J 2001, 40(2):360–378.View ArticleGoogle Scholar
- Doman TN, McGovern SL, Witherbee BJ, Kasten TP, Kurumbail R, Stallings WC, Connolly DT, Shoichet BK: Molecular docking and high-throughput screening for novel inhibitors of protein tyrosine phosphatase-1B. J Med Chem 2002, 45(11):2213–2221. 10.1021/jm010548wView ArticlePubMedGoogle Scholar
- Chen L, Gui C, Luo X, Yang Q, Gunther S, Scandella E, Drosten C, Bai D, He X, Ludewig B, et al.: Cinanserin is an inhibitor of the 3C-like proteinase of severe acute respiratory syndrome coronavirus and strongly reduces virus replication in vitro. J Virol 2005, 79(11):7095–7103. 10.1128/JVI.79.11.7095-7103.2005PubMed CentralView ArticlePubMedGoogle Scholar
- Liu H, Li Y, Song MK, Tan XJ, Cheng F, Zheng SX, Shen JH, Luo XM, Ji RY, Yue JM, et al.: Structure-based discovery of potassium channel blockers from natural products: Virtual screening and electrophysiological assay testing. Chem Biol 2003, 10(11):1103–1113. 10.1016/j.chembiol.2003.10.011View ArticlePubMedGoogle Scholar
- Alvarez JC: High-throughput docking as a source of novel drug leads. Curr Opin Chem Biol 2004, 8(4):365–370. 10.1016/j.cbpa.2004.05.001View ArticlePubMedGoogle Scholar
- Kuntz ID, Blaney JM, Oatley SJ, Langridge R, Ferrin TE: A geometric approach to macromolecule-ligand interactions. J Mol Biol 1982, 161(2):269–288. 10.1016/0022-2836(82)90153-XView ArticlePubMedGoogle Scholar
- Morris GM, Goodsell DS, Halliday RS, Huey R, Hart WE, Belew RK, Olson AJ: Automated Docking Using a Lamarckian Genetic Algorithm and an Empirical Binding Free Energy Function. J Comp Chem 1998, 19(14):1639–1662. Publisher Full Text 10.1002/(SICI)1096-987X(19981115)19:14<1639::AID-JCC10>3.0.CO;2-BView ArticleGoogle Scholar
- Jones G, Willett P, Glen RC, Leach AR, Taylor R: Development and validation of a genetic algorithm for flexible docking. J Mol Biol 1997, 267(3):727–748. 10.1006/jmbi.1996.0897View ArticlePubMedGoogle Scholar
- Halgren TA, Murphy RB, Friesner RA, Beard HS, Frye LL, Pollard WT, Banks JL: Glide: a new approach for rapid, accurate docking and scoring. 2. Enrichment factors in database screening. J Med Chem 2004, 47(7):1750–1759. 10.1021/jm030644sView ArticlePubMedGoogle Scholar
- Dixon JS: Evaluation of the CASP2 docking section. Proteins 1997, 1: 198–204. Publisher Full Text 10.1002/(SICI)1097-0134(1997)1+<198::AID-PROT26>3.0.CO;2-IView ArticlePubMedGoogle Scholar
- Perola E, Xu K, Kollmeyer TM, Kaufmann SH, Prendergast FG, Pang YP: Successful virtual screening of a chemical database for farnesyltransferase inhibitor leads. J Med Chem 2000, 43(3):401–408. 10.1021/jm990408aView ArticlePubMedGoogle Scholar
- Claussen H, Buning C, Rarey M, Lengauer T: FlexE: efficient molecular docking considering protein structure variations. J Mol Biol 2001, 308(2):377–395. 10.1006/jmbi.2001.4551View ArticlePubMedGoogle Scholar
- Rarey MKB, Lengauer T, Klebe G: A fast flexible docking method using an incremental construction algorithm. J Mol Biol 1996, 261: 470–489. 10.1006/jmbi.1996.0477View ArticlePubMedGoogle Scholar
- Baxter CA, Murray CW, Clark DE, Westhead DR, Eldridge MD: Flexible docking using Tabu search and an empirical estimate of binding affinity. Proteins 1998, 33(3):367–382. 10.1002/(SICI)1097-0134(19981115)33:3<367::AID-PROT6>3.0.CO;2-WView ArticlePubMedGoogle Scholar
- Colin McMartin RSB: QXP: Powerful, rapid computer algorithms for structure-based drug design. J Comput Aided Mol Des 1997, 11: 333–344. 10.1023/A:1007907728892View ArticleGoogle Scholar
- Welch W, Ruppert J, Jain AN: Hammerhead: fast, fully automated docking of flexible ligands to protein binding sites. Chem Biol 1996, 3(6):449–462. 10.1016/S1074-5521(96)90093-9View ArticlePubMedGoogle Scholar
- Miller MD, Kearsley SK, Underwood DJ, Sheridan RP: FLOG: a system to select 'quasi-flexible' ligands complementary to a receptor of known three-dimensional structure. J Comput Aided Mol Des 1994, 8(2):153–174. 10.1007/BF00119865View ArticlePubMedGoogle Scholar
- Venkatachalam CMJX, Oldfield T, Waldman M: LigandFit: a novel method for the shape-directed rapid docking of ligands to protein active sites. J Mol Graph Model 2003, 21(4):289–307. 10.1016/S1093-3263(02)00164-XView ArticlePubMedGoogle Scholar
- Liu M, Wang S: MCDOCK: a Monte Carlo simulation approach to the molecular docking problem. J Comput Aided Mol Des 1999, 13(5):435–451. 10.1023/A:1008005918983View ArticlePubMedGoogle Scholar
- Li H, Li C, Gui C, Luo X, Chen K, Shen J, Wang X, Jiang H: GAsDock: a new approach for rapid flexible docking based on an improved multi-population genetic algorithm. Bioorg Med Chem Lett 2004, 14(18):4671–4676. 10.1016/j.bmcl.2004.06.091View ArticlePubMedGoogle Scholar
- Buckley PA, Baz Jackson J, Schneider T, White SA, Rice DW, Baker PJ: Protein-protein recognition, hydride transfer and proton pumping in the transhydrogenase complex. Structure Fold Des 2000, 8(8):809–815. 10.1016/S0969-2126(00)00171-4View ArticlePubMedGoogle Scholar
- Lyne PD: Structure-based virtual screening: an overview. Drug Discov Today 2002, 7(20):1047–1055. 10.1016/S1359-6446(02)02483-2View ArticlePubMedGoogle Scholar
- Kitchen DB, Decornez H, Furr JR, Bajorath J: Docking and scoring in virtual screening for drug discovery: methods and applications. Nat Rev Drug Discov 2004, 3(11):935–949. 10.1038/nrd1549View ArticlePubMedGoogle Scholar
- Bossuyt J, Taylor BE, James-Kracke M, Hale CC: The cardiac sodium-calcium exchanger associates with caveolin-3. Ann N Y Acad Sci 2002, 976: 197–204.View ArticlePubMedGoogle Scholar
- Shoichet BK, McGovern SL, Wei BQ, Irwin JJ: Lead discovery using molecular docking. Curr Opin Chem Biol 2002, 6(4):439–446. 10.1016/S1367-5931(02)00339-3View ArticlePubMedGoogle Scholar
- Ferrara P, Gohlke H, Price DJ, Klebe G, Brooks CL 3rd: Assessing scoring functions for protein-ligand interactions. J Med Chem 2004, 47(12):3032–3047. 10.1021/jm030489hView ArticlePubMedGoogle Scholar
- Gohlke HKG: Approaches to the description and prediction of the binding affinity of small-molecule ligands to acromolecular receptors. Angew Chem Int Ed Engl 2002, 41(15):2644–2676. 10.1002/1521-3773(20020802)41:15<2644::AID-ANIE2644>3.0.CO;2-OView ArticlePubMedGoogle Scholar
- Boehm H-J, Stahl M: The Use of Scoring Functions in Drug Discovery Applicationg.New York: Wiley-VCH; 2002., 18: [http://www3.interscience.wiley.com/cgi-bin/summary/103020910/SUMMARY?CRETRY=1&SRETRY=0]Google Scholar
- Bissantz C, Folkers G, Rognan D: Protein-based virtual screening of chemical databases. 1. Evaluation of different docking/scoring combinations. J Med Chem 2000, 43(25):4759–4767. 10.1021/jm001044lView ArticlePubMedGoogle Scholar
- Charifson PS, Corkery JJ, Murcko MA, Walters WP: Consensus scoring: A method for obtaining improved hit rates from docking databases of three-dimensional structures into proteins. J Med Chem 1999, 42(25):5100–5109. 10.1021/jm990352kView ArticlePubMedGoogle Scholar
- Clark RD, Strizhev A, Leonard JM, Blake JF, Matthew JB: Consensus scoring for ligand/protein interactions. J Mol Graph Model 2002, 20(4):281–295. 10.1016/S1093-3263(01)00125-5View ArticlePubMedGoogle Scholar
- Xing L, Hodgkin E, Liu Q, Sedlock D: Evaluation and application of multiple scoring functions for a virtual screening experiment. J Comput Aided Mol Des 2004, 18(5):333–344. 10.1023/B:JCAM.0000047812.39758.abView ArticlePubMedGoogle Scholar
- Fonseca CM, Fleming PJ: An overview of evolutionary algorithms in multiobjecitive optimization. Evol Comput 1995, 3(1):1–16. 10.1162/evco.1918.104.22.168View ArticleGoogle Scholar
- Gillet VJKW, Willett P, Fleming PJ, Green DV: Combinatorial library design using a multiobjective genetic algorithm. J Chem Inf Comput Sci 2002, 42(2):375–385.View ArticlePubMedGoogle Scholar
- Handschuh S, Wagener M, Gasteiger J: Superposition of three-dimensional chemical structures allowing for conformational flexibility by a hybrid method. J Chem Inf Comput Sci 1998, 38(2):220–232.View ArticlePubMedGoogle Scholar
- Zitzler E: Evolutionary Algorithms for Multiobjective Optimization: Methods and Applications. PhD thesis. Zurich, Switzerland,: Swiss Federal Institute of Technology (ETH); 1999.Google Scholar
- Coello CAC: An Empirical Study of Evolutionary Techniques for Multiobjective Optimization in Engineering Design. PhD thesis. New Orleans, Louisiana,: Tulane University; 1996.Google Scholar
- Zitzler E, Deb K, Thiele L: Comparison of Multiobjective Evolutionary Algorithms: Empirical Results. Evol Comput 2000, 8(2):173–195. 10.1162/106365600568202View ArticlePubMedGoogle Scholar
- Coello CAC: An undated survey of GA-based multiobjective optimization techniques. ACM Comput surveys: 2000 2000, 109–143. 10.1145/358923.358929Google Scholar
- Deb K: Multi-Objective Optimization using Evolutionary Algorithms. Chichester, UK,: John Wiley & Sons; 2001.Google Scholar
- Brooijmans N, Kuntz ID: Molecular recognition and docking algorithms. Annu Rev Biophys Biomol Struct 2003, 32: 335–373. 10.1146/annurev.biophys.32.110601.142532View ArticlePubMedGoogle Scholar
- Berman HM, Battistuz T, Bhat TN, Bluhm WF, Bourne PE, Burkhardt K, Feng Z, Gilliland GL, Iype L, Jain S, et al.: The Protein Data Bank. Acta Crystallogr D Biol Crystallogr 2002, 58(Pt 6 No 1):899–907. 10.1107/S0907444902003451View ArticlePubMedGoogle Scholar
- Brandstetter H, Turk D, Hoeffken HW, Grosse D, Sturzebecher J, Martin PD, Edwards BF, Bode W: Refined 2.3 A X-ray crystal structure of bovine thrombin complexes formed with the benzamidine and arginine-based thrombin inhibitors NAPAP, 4-TAPAP and MQPA. A starting point for improving antithrombotics. J Mol Biol 1992, 226: 1085. 10.1016/0022-2836(92)91054-SView ArticlePubMedGoogle Scholar
- Shiau AK, Barstad D, Loria PM, Cheng L, Kushner PJ, Agard DA, Greene GL: The structural basis of estrogen receptor/coactivator recognition and the antagonism of this interaction by tamoxifen. Cell 1998, 95: 927. 10.1016/S0092-8674(00)81717-1View ArticlePubMedGoogle Scholar
- Kurumbail RG, Stevens AM, Gierse JK, McDonald JJ, Stegeman RA, Pak JY, Gildehaus D, Miyashiro JM, Penning TD, Seibert K, Isakson PC, Stallings WC: Structural basis for selective inhibition of cyclooxygenase-2 by anti-inflammatory agents. Nature 1996, 384: 644. 10.1038/384644a0View ArticlePubMedGoogle Scholar
- Gasteiger J, Marsili M: Iterative partial equalization of orbital electronegativity – a rapid access to atomic charges. Tetrahedron 1980, 36: 3219–3228. 10.1016/0040-4020(80)80168-2View ArticleGoogle Scholar
- Meng EC, Shoichet BK, Kuntz ID: Automated docking with grid-based energy evaluation. J Comp Chem 1992, 13: 504–524.View ArticleGoogle Scholar
- Ewing TJA, Kuntz ID: Critical evaluation of search algorithms used in automated molecular docking. J Comp Chem 1997, 18(9):1175–1189. Publisher Full Text 10.1002/(SICI)1096-987X(19970715)18:9<1175::AID-JCC6>3.0.CO;2-OView ArticleGoogle Scholar
- Ewing TJ, Makino S, Skillman AG, Kuntz ID: DOCK 4.0: search strategies for automated molecular docking of flexible molecule databases. J Comput Aided Mol Des 2001, 15(5):411–428. 10.1023/A:1011115820450View ArticlePubMedGoogle Scholar
- Weiner SJ, Kollman PA, Nguyen DT, Case DA: A New force filed for molecular mechanical simulation of nucleic acids and proteins. J Comp Chem 1986, 7: 230–252. 10.1002/jcc.540070216View ArticleGoogle Scholar
- Wei BQ, Baase WA, Weaver LH, Matthews BW, Shoichet BK: A model binding site for testing scoring functions in molecular docking. J Mol Biol 2002, 322(2):339–355. 10.1016/S0022-2836(02)00777-5View ArticlePubMedGoogle Scholar
- Laumanns M, Thiele L, Deb K, Zitzler E: Combining convergence and diversity in evolutionary multi-objective optimization. Evol Comput 2002, 10(3):263–282. 10.1162/106365602760234108View ArticlePubMedGoogle Scholar
- Deb K, Mohan M, Mishra S: A Fast Multi-objective Evolutionary Algorithm for Finding Well-Spread Pareto-Optimal Solutions. KanGAL Report 2003., No. 2003002:Google Scholar
- Gohlke H, Hendlich M, Klebe G: Knowledge-based scoring function to predict protein-ligand interactions. J Mol Biol 2000, 295(2):337–356. 10.1006/jmbi.1999.3371View ArticlePubMedGoogle Scholar
- Knegtel RM, Wagener M: Efficacy and selectivity in flexible database docking. Proteins 1999, 37(3):334–345. 10.1002/(SICI)1097-0134(19991115)37:3<334::AID-PROT3>3.0.CO;2-9View ArticlePubMedGoogle Scholar
- Kang L, Li H, Jiang H, Wang X: An improved adaptive genetic algorithm for protein-ligand docking. J Comput Aided Mol Des 2009, 23(1):1–12. 10.1007/s10822-008-9232-5View ArticlePubMedGoogle Scholar
- Mackay DJC: Information Theory, Inference, and Learning Algorithms. Cambridge: Cambridge University Press; 2003.Google Scholar
- Adami C: Information Theory in Molecular Biology. Phys Life Rev 2004, 1: 3–22. 10.1016/j.plrev.2004.01.002View ArticleGoogle Scholar
- Grosdidier A, Zoete V, Michielin O: EADock: docking of small molecules into protein active sites with a multiobjective evolutionary optimization. Proteins 2007, 67(4):1010–1025. 10.1002/prot.21367View ArticlePubMedGoogle Scholar
- Schulz-Gasch T, Stahl M: Binding site characteristics in structure-based virtual screening: evaluation of current docking tools. J Mol Model 2003, 9(1):47–57.PubMedGoogle Scholar
- Muegge I, Martin YC, Hajduk PJ, Fesik SW: Evaluation of PMF scoring in docking weak ligands to the FK506 binding protein. J Med Chem 1999, 42(14):2498–2503. 10.1021/jm990073xView ArticlePubMedGoogle Scholar
- Muegge I, Martin YC: A general and fast scoring function for protein-ligand interactions: a simplified potential approach. J Med Chem 1999, 42(5):791–804. 10.1021/jm980536jView ArticlePubMedGoogle Scholar
- Marsden PM, Puvanendrampillai D, Mitchell JB, Glen RC: Predicting protein-ligand binding affinities: a low scoring game? Org Biomol Chem 2004, 2(22):3267–3273. 10.1039/b409570gView ArticlePubMedGoogle Scholar
- Stahl M, Rarey M: Detailed analysis of scoring functions for virtual screening. J Med Chem 2001, 44(7):1035–1042. 10.1021/jm0003992View ArticlePubMedGoogle Scholar
- Murray CW, Baxter CA, Frenkel AD: The sensitivity of the results of molecular docking to induced fit effects: application to thrombin, thermolysin and neuraminidase. J Comput Aided Mol Des 1999, 13(6):547–562. 10.1023/A:1008015827877View ArticlePubMedGoogle Scholar
- Wright T, Gillet VJ, Green DV, Pickett SD: Optimizing the size and configuration of combinatorial libraries. J Chem Inf Comput Sci 2003, 43(2):381–390.View ArticlePubMedGoogle Scholar
- Gillet VJ, Willett P, Fleming PJ, Green DV: Designing focused libraries using MoSELECT. J Mol Graph Model 2002, 20(6):491–498. 10.1016/S1093-3263(01)00150-4View ArticlePubMedGoogle Scholar
- Gillet VJ: Designing combinatorial libraries optimized on multiple objectives. Methods Mol Biol 2004, 275: 335–354.View ArticlePubMedGoogle Scholar
- Nicolotti O, Gillet VJ, Fleming PJ, Green DV: Multiobjective optimization in quantitative structure-activity relationships: deriving accurate and interpretable QSARs. J Med Chem 2002, 45(23):5069–5080. 10.1021/jm020919oView ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.