Development, evaluation and application of 3D QSAR Pharmacophore model in the discovery of potential human renin inhibitors
© John et al; licensee BioMed Central Ltd. 2011
Published: 14 December 2011
Renin has become an attractive target in controlling hypertension because of its high specificity towards its only substrate, angiotensinogen. The conversion of angiotensinogen to angiotensin I is the first and rate-limiting step of the renin-angiotensin system, and this study therefore focuses on designing inhibitors that block this step.
Ligand-based quantitative pharmacophore modeling methodology was used in identifying the important molecular chemical features present in the set of already known active compounds and the missing features from the set of inactive compounds. A training set containing 18 compounds including active and inactive compounds with a substantial degree of diversity was used in developing the pharmacophore models. A test set containing 93 compounds, Fischer randomization, and leave-one-out methods were used in the validation of the pharmacophore model. Database screening was performed using the best pharmacophore model as a 3D structural query. Molecular docking and density functional theory calculations were used to select the hit compounds with strong molecular interactions and favorable electronic features.
The best quantitative pharmacophore model selected consisted of one hydrophobic, one hydrogen bond donor, and two hydrogen bond acceptor features, with a high correlation value of 0.944. After validation using an external test set of 93 compounds, Fischer randomization, and leave-one-out methods, this model was used in database screening to identify chemical compounds containing the identified pharmacophoric features. Molecular docking and density functional theory studies confirmed that the identified hits possess the essential binding characteristics and electronic properties of potent inhibitors.
A quantitative pharmacophore model with predictive ability was developed, comprising the essential molecular features of a potent renin inhibitor. Using this pharmacophore model, two potential inhibitory leads were identified for use in designing future novel renin inhibitors as antihypertensive drugs.
Hypertension is a major risk factor for various cardiovascular diseases such as congestive cardiac failure, stroke, and myocardial infarction, and affects up to 30% of the adult population in most countries. Renin is an aspartyl protease that is catalytically similar to other enzymes such as pepsin, cathepsin, and chymosin. Renin cleaves angiotensinogen to angiotensin-I, which is then converted to angiotensin-II by angiotensin-converting enzyme (ACE). Angiotensin-II is a biologically active vasopressor recognized by its receptors, and this recognition is one step in the cascade of events that leads to an increase in blood pressure. Renin is synthesized as prorenin, a proenzyme, which is transformed into mature renin by cleavage of a 43-amino-acid prosegment from the N-terminal end. This conversion of prorenin to renin occurs in the juxtaglomerular cells of the kidney and is followed by the release of renin into the circulation. Renin catalyzes the first and rate-limiting step of the renin-angiotensin system (RAS), which is the conversion of angiotensinogen to angiotensin-I. Renin is highly specific for its only known substrate, angiotensinogen, and this remarkable specificity makes it a very attractive target for blocking the RAS. Inhibition of renin prevents the formation of both angiotensin-I and angiotensin-II, which is not the case for ACE inhibitors and angiotensin receptor blockers, which increase the levels of angiotensin-I and/or angiotensin-II, respectively. Only renin inhibitors render the complete RAS quiescent by suppressing the first step of the cascade of events. Thus, inhibition of renin would favor a more complete blockade of the system. Potent inhibitors of this enzyme could therefore provide a new way to treat hypertension without inhibiting other biological substances. The aspartyl protease class of enzymes contains two aspartic acid residues that are necessary for activity.
Renin has a bilobal structure similar to other aspartic proteases, with the active site at the interface of the two lobes. The two catalytic aspartate residues, Asp32 and Asp215, which carry out the proteolytic function of renin, are contributed one from each lobe of the enzyme. The active site of renin appears as a long, deep cleft that can accommodate seven amino acid units of the substrate, angiotensinogen, and renin cleaves the peptide bond between Leu10 and Val11 within angiotensinogen to generate angiotensin-I. Early renin inhibitors were developed using two approaches. The first was to develop peptides resembling the prosegment of prorenin, as this segment covers the active site of renin prior to maturation. The second was based on the N-terminal portion of the substrate, angiotensinogen, because it binds the active site of renin. However, these approaches produced only weak inhibitors. The first synthetic renin inhibitor was pepstatin. First-generation renin inhibitors were peptide analogues of the prosegment of renin or substrate analogues of the amino-terminal sequence of angiotensinogen containing the renin cleavage site. Crystal structure analyses of renin-inhibitor complexes and computational molecular modeling were later used to design selective nonpeptide renin inhibitors that lacked the extended peptide-like backbone of previous inhibitors and had improved pharmacokinetic properties. Aliskiren is the first of these new nonpeptide inhibitors to be approved by the FDA for the treatment of hypertension, but its synthesis includes many steps, which invites the design of much simpler compounds as potent renin inhibitors. Aliskiren belongs to the third generation of renin inhibitors; the large (high molecular weight) first- and second-generation inhibitors could not be exploited as drugs despite their potency in vitro. To date, only a few compounds with potent renin inhibition profiles, high efficacy, and safety have been successfully developed.
Thus, designing inhibitors with high potency against renin is the most effective way to block the RAS completely. This study focused on identifying novel scaffolds with the potential to become a new category of renin inhibitors.
In this study, a high-correlation quantitative pharmacophore model was generated using the observed structure-activity relationship of known renin inhibitors. We applied pharmacophore modeling, database screening, molecular docking, and density functional theory (DFT) calculations to identify lead candidates for the design of potent renin inhibitors and thereby a new category of antihypertensive agents.
The two-dimensional (2D) chemical structures of all the compounds in the data set were sketched using ChemSketch, version 12 (ACD Inc., Toronto, Canada) and subsequently converted to 3D structures in Accelrys Discovery Studio 2.5 (DS). These 3D structures were checked for added hydrogens and minimized using the smart minimizer, which performs 1000 steps of steepest descent followed by conjugate gradient minimization with a convergence gradient of 0.001 kcal mol-1. After energy minimization, multiple acceptable conformers were generated for every training set compound with the DS Diverse Conformation Generation module using the Poling algorithm. This step was necessary to produce a representative set of conformations covering the conformational space accessible to a molecule within a given energy range. A maximum of 255 conformations were generated for each compound within an energy range of 20 kcal mol-1 above the global energy minimum [20–22].
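The energy-window rule described above can be illustrated with a minimal Python sketch. It only shows the filtering criterion (at most 255 conformers within 20 kcal/mol of the lowest energy found); it does not implement the Poling algorithm or any DS functionality. The function name `select_conformers` and the example energies are hypothetical.

```python
def select_conformers(energies, window=20.0, max_conformers=255):
    """Return indices of conformers within `window` kcal/mol of the lowest
    energy in the list, lowest energy first, capped at `max_conformers`.

    Assumption: the lowest energy in `energies` stands in for the global
    energy minimum mentioned in the text.
    """
    if not energies:
        return []
    e_min = min(energies)
    # Keep only conformers inside the energy window.
    within = [i for i, e in enumerate(energies) if e - e_min <= window]
    # Order by energy so the cap drops the highest-energy conformers first.
    within.sort(key=lambda i: energies[i])
    return within[:max_conformers]


# Hypothetical conformer energies in kcal/mol.
example_energies = [12.4, 3.1, 40.0, 22.9, 23.2]
print(select_conformers(example_energies))
```

A real conformer generator would also enforce structural diversity among the retained conformers, which is the purpose of the Poling step and is not modeled here.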
Of the two types of ligand-based pharmacophore modeling methodologies, common feature pharmacophore modeling uses the chemical features shared by only the most active compounds, whereas the 3D QSAR pharmacophore methodology uses the chemical features of both the most active and the inactive compounds along with their biological activities. In this study, we employed the 3D QSAR-based pharmacophore methodology to generate pharmacophore models that can be used to estimate the activity of newly designed compounds. The feature mapping protocol available in DS was used to identify the features present in the training set compounds. The uncertainty value was set to 2, and the minimum inter-feature distance was reduced from the default value of 2.97 Å to 2 Å. As identified by the feature mapping protocol, hydrogen bond acceptor (HBA), hydrogen bond donor (HBD), hydrophobic aliphatic (HY-AL), hydrophobic aromatic (HY-AR), and ring aromatic (RA) features were used, with all other parameters at their default values, to generate ten pharmacophore models using the 3D QSAR pharmacophore generation protocol of DS, which is based on the HypoGen algorithm. Each feature of the resulting models carries a weight proportional to its relative contribution to biological activity. HypoGen therefore constructs pharmacophore models that correlate best with the biological activities while consisting of as few features as possible. The HypoGen model generation process is performed in three steps: the constructive phase, the subtractive phase, and the optimization phase [23, 24]. Hypotheses that are common to the most active set of compounds are identified during the constructive phase. HypoGen calculates all possible pharmacophore configurations using all combinations of pharmacophore features for each of the conformations of the two most active compounds. In addition, a hypothesis must fit a minimum subset of features of the remaining most active compounds in order to be considered.
A large database of pharmacophore configurations is generated at the end of the constructive phase. In the subtractive phase, all pharmacophore configurations that are also present in the least active set of molecules are removed. By default, all compounds whose activity is 3.5 orders of magnitude lower than that of the most active compound are considered to represent the least active molecules. The value of 3.5 can be adjusted depending on the activity range of the training set. During the optimization phase, the hypothesis score is improved. Hypotheses are scored based on the errors in the activity estimates from regression and on their complexity. The optimization varies features and/or locations to optimize activity prediction via a simulated annealing approach. When the optimization process no longer improves the score, HypoGen stops and reports the 10 top-scoring unique pharmacophores. The generated pharmacophore models were evaluated for their reliability based on the cost parameters. The overall cost of a model consists of three components, namely the weight cost, the error cost, and the configuration cost. The weight cost increases in a Gaussian form as the feature weights in a model deviate from the ideal value of two. The error cost represents the difference between the estimated and measured activities of the training set. The configuration cost quantifies the entropy of the hypothesis space.
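As a small illustration of the subtractive-phase criterion described above, the following Python sketch flags compounds whose IC50 is more than 3.5 orders of magnitude weaker than that of the most active compound. The function name `least_active` and the IC50 values are hypothetical, and the sketch shows only the threshold test, not HypoGen's actual handling of pharmacophore configurations.

```python
import math


def least_active(ic50_by_name, orders=3.5):
    """Return the names of compounds whose IC50 is more than `orders`
    log units above the lowest (most potent) IC50 in the set.

    Assumption: activities are IC50 values in a common unit, so a lower
    value means a more active compound.
    """
    best = min(ic50_by_name.values())
    return sorted(
        name
        for name, ic50 in ic50_by_name.items()
        if math.log10(ic50) - math.log10(best) > orders
    )


# Hypothetical IC50 values in nM.
training_set = {"A": 0.5, "B": 40.0, "C": 2000.0, "D": 10000.0}
print(least_active(training_set))
```

Because the threshold is a parameter, the same function can be reused when the 3.5 default is adjusted to match the activity range of a particular training set.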
In addition, the following three cost values are calculated during the generation of pharmacophore models: the fixed cost, the total cost, and the null cost. The fixed cost is the lowest possible cost, representing a hypothetical simplest model that fits all data perfectly. It is calculated by adding the minimum achievable error and weight costs to the constant configuration cost. The null cost represents the maximum cost of a pharmacophore that has no features and estimates every activity to be the average of the activity data of the training set molecules. The null cost value is equal to the maximum occurring error cost. For every pharmacophore generation run, ten total cost values (one per hypothesis) and one value each of the fixed cost and the null cost are calculated by the protocol, in units of bits. For a meaningful pharmacophore model, the fixed cost should be low, the null cost should be high, and the total cost should be close to the fixed cost and far from the null cost [25, 26]. HypoGen further estimates the activity of each training set compound using regression parameters. The parameters were computed by regression analysis of the geometric fit value against the negative logarithm of activity. The better the geometric fit, the higher the estimated activity of the compound. Along with these cost values, other statistical values such as the correlation coefficient and the root mean square deviation (RMSD) were calculated. The best pharmacophore model was selected based on a large cost difference, a high correlation coefficient, and a low RMSD.
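The cost comparison described above reduces to a few subtractions, sketched here in Python. The function names and the cost values in the example are hypothetical and are not taken from the study; the check only encodes the stated expectation that the total cost lies closer to the fixed cost than to the null cost.

```python
def cost_summary(fixed, total, null):
    """Return the cost differences (in bits) used to judge a hypothesis."""
    return {
        "null_minus_total": null - total,   # distance from the featureless model
        "total_minus_fixed": total - fixed,  # distance from the ideal model
        "null_minus_fixed": null - fixed,    # overall spread of the run
    }


def is_closer_to_fixed(fixed, total, null):
    """True if the total cost lies nearer the fixed cost than the null cost."""
    return (total - fixed) < (null - total)


# Hypothetical cost values in bits, for illustration only.
summary = cost_summary(fixed=80.0, total=95.0, null=180.0)
print(summary)
print(is_closer_to_fixed(80.0, 95.0, 180.0))
```

In practice these differences would be computed for each of the ten hypotheses of a run, since every hypothesis has its own total cost while the fixed and null costs are shared.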
The main purpose of validating the generated pharmacophore models is to investigate their ability to estimate the activity of new compounds identified through database screening or designed de novo. The selected pharmacophore model was evaluated on the basis of the derived cost components and validated using three methods: test set prediction, the Fischer randomization test, and the leave-one-out method. A larger difference between the fixed and null costs than between the fixed and total costs indicates a good-quality pharmacophore model. All of these cost values are reported in bits, and a difference of 40-60 bits between the total and null costs suggests a 75-90% chance of representing a true correlation in the data [27, 28]. Ninety-three diverse compounds were used as the test set to validate the pharmacophore model. Fischer randomization is another approach to pharmacophore model validation. This method checks the correlation between the chemical structures and the biological activity by randomizing the activity data of the training set compounds and generating pharmacophore models with the same parameters as those used to develop the original model. The 95% confidence level was selected in this validation study, and 19 random spreadsheets were constructed. Finally, cross-validation of the model was performed using the leave-one-out methodology. In this method, 18 pharmacophore models were generated with the same parameters used for generating the original pharmacophore model, but leaving out one of the 18 training set compounds at a time, to assess the influence of every single training set compound on the generation of the selected pharmacophore model [29, 30].
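The link between the 95% confidence level and the 19 random spreadsheets can be made explicit with a short Python sketch. It assumes the commonly used significance formula for this test, [1 - (1 + x) / y] × 100, where x is the number of randomized runs that score as well as or better than the original model and y is the total number of runs (randomized plus original). The function names are hypothetical, and the shuffling helper only illustrates the scrambling of activities; it does not run any pharmacophore generation.

```python
import random


def fischer_confidence(n_random, n_as_good=0):
    """Confidence level (%) for `n_random` randomized runs, of which
    `n_as_good` scored as well as or better than the original model."""
    total_runs = n_random + 1  # randomized runs plus the original run
    return (1 - (1 + n_as_good) / total_runs) * 100


def randomized_activity_sets(activities, n_sets, seed=0):
    """Return `n_sets` shuffled copies of the activity list, mimicking the
    random spreadsheets in which activities are reassigned to compounds."""
    rng = random.Random(seed)
    sets = []
    for _ in range(n_sets):
        shuffled = list(activities)
        rng.shuffle(shuffled)
        sets.append(shuffled)
    return sets


print(f"{fischer_confidence(19):.0f}% confidence with 19 random runs")
print(len(randomized_activity_sets([0.5, 40.0, 2000.0], n_sets=19)))
```

Under this formula, a single randomized run that matches the original model would lower the confidence for 19 runs from 95% to 90%, which is why none of the randomized models may outperform the original at the chosen level.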
The best pharmacophore model, validated using the different methods, was used as a 3D query in database screening to retrieve chemical compounds that fit all the pharmacophoric features. A compound had to fit all the features to be selected as a hit. The Search 3D Database protocol with the Best/Flexible search option was employed in database screening. Three chemical databases of diverse chemical compounds were screened for novel chemical scaffolds to be used in potent renin inhibitor design. The identified database hits were filtered based on estimated activity, Lipinski’s rule of five, and ADMET properties [32–35].
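One of the filters mentioned above, Lipinski's rule of five, is simple enough to sketch in Python. The thresholds are the standard ones (molecular weight at most 500, logP at most 5, at most 5 hydrogen bond donors, at most 10 hydrogen bond acceptors). The `Descriptors` class, the function names, the descriptor values, and the default of allowing no violations are assumptions made for illustration; the text does not state how strictly the rule was applied.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed descriptors for one compound (hypothetical container)."""
    mol_weight: float
    logp: float
    h_donors: int
    h_acceptors: int


def lipinski_violations(d):
    """Count how many of the four rule-of-five criteria are violated."""
    checks = [
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ]
    return sum(checks)


def passes_rule_of_five(d, max_violations=0):
    """True if the compound has at most `max_violations` violations.

    Assumption: the strictness used in the study is not stated, so the
    number of tolerated violations is left as a parameter.
    """
    return lipinski_violations(d) <= max_violations


# Hypothetical descriptor values for two compounds.
hit = Descriptors(mol_weight=450.0, logp=3.2, h_donors=2, h_acceptors=7)
large = Descriptors(mol_weight=620.0, logp=5.6, h_donors=2, h_acceptors=7)
print(passes_rule_of_five(hit), passes_rule_of_five(large))
```

The descriptor values themselves would come from a cheminformatics package; only the thresholding step is shown here.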
Compounds satisfying all the filters were subjected to molecular docking studies. The GOLD (Genetic Optimization for Ligand Docking) program from the Cambridge Crystallographic Data Centre, UK, which uses a genetic algorithm to dock small molecules into the protein active site, was used for molecular docking. GOLD allows full flexibility of the ligands and partial flexibility of the protein. Protein coordinates from the crystal structure of renin in complex with aliskiren (PDB ID: 2V0Z), one of the most active inhibitors, determined at a resolution of 2.20 Å, were used to define the active site. The active site was defined with a 10 Å radius around the bound inhibitor. All the water molecules except two catalytically important waters, 184 and 250, were removed from the protein, and hydrogens were added. The ten top-scoring conformations of every ligand were saved at the end of the calculation. The early termination option was used to skip the genetic optimization calculation when any five conformations of a particular compound were predicted within an RMSD value of 1.5 Å. The GOLD fitness score is calculated from the contributions of hydrogen bond and van der Waals interactions between the protein and ligand, as well as intramolecular hydrogen bonds and strain of the ligand [37, 38]. Protein-ligand interactions were analyzed using the DS and Molegro Virtual Docker programs. The novelty of the final hits was confirmed using the SciFinder and PubChem structure search tools.
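The radius-based definition of the active site can be illustrated with a minimal Python sketch that selects protein atoms lying within 10 Å of any atom of the bound inhibitor. This is a generic distance filter written for illustration, not GOLD's internal procedure; the function name, atom labels, and coordinates are hypothetical.

```python
import math


def atoms_within(protein_atoms, ligand_coords, radius=10.0):
    """Return labels of protein atoms within `radius` angstroms of any
    ligand atom.

    `protein_atoms` is a list of (label, (x, y, z)) pairs and
    `ligand_coords` is a list of (x, y, z) tuples.
    """
    selected = []
    for label, coord in protein_atoms:
        # An atom belongs to the site if it is close to at least one ligand atom.
        if any(math.dist(coord, lig) <= radius for lig in ligand_coords):
            selected.append(label)
    return selected


# Hypothetical coordinates in angstroms.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
protein = [
    ("atom_a", (3.0, 4.0, 0.0)),
    ("atom_b", (0.0, 0.0, 9.5)),
    ("atom_c", (12.0, 0.0, 0.0)),
]
print(atoms_within(protein, ligand))
```

With real data, the coordinates would be read from the PDB entry, and whole residues rather than single atoms would usually be retained once any of their atoms falls inside the radius.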
The final hits, along with some of the most and least active compounds, were used as input, and all DFT calculations were carried out using Gaussian version 3.0 program. The geometries of these compounds were optimized using the Becke three-parameter Lee-Yang-Parr (B3LYP) functional with the 6-31G* basis set [42–45]. The energies of the frontier orbitals, namely the highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO), were calculated for each compound. These calculations were performed to evaluate the electronic properties of the final hits and to compare them with those of the compounds in the training set.
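The frontier orbital comparison described above comes down to the HOMO-LUMO energy gap. The following Python sketch computes the gap from orbital energies given in Hartree and converts it to electronvolts (1 Hartree ≈ 27.2114 eV). The function names, compound names, and orbital energies are hypothetical and serve only to show the arithmetic.

```python
HARTREE_TO_EV = 27.211386  # conversion factor from Hartree to electronvolts


def homo_lumo_gap(homo_hartree, lumo_hartree):
    """Return the HOMO-LUMO gap in eV from orbital energies in Hartree."""
    return (lumo_hartree - homo_hartree) * HARTREE_TO_EV


def rank_by_gap(orbitals):
    """Sort compound names by increasing gap.

    `orbitals` maps a compound name to a (HOMO, LUMO) pair in Hartree.
    """
    return sorted(orbitals, key=lambda name: homo_lumo_gap(*orbitals[name]))


# Hypothetical orbital energies in Hartree.
orbitals = {
    "compound_x": (-0.20, -0.05),
    "compound_y": (-0.22, -0.02),
}
for name in rank_by_gap(orbitals):
    print(name, round(homo_lumo_gap(*orbitals[name]), 2), "eV")
```

A smaller gap is usually read as higher chemical reactivity, which is the basis on which hit compounds are compared with training set compounds.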
Statistical results of the top 10 pharmacophore hypotheses generated by the HypoGen algorithm.
Features (one row per hypothesis):
HBA, HBA, HBD, HY-AL
HBD, HY-AL, HY-AL, HY-AR
HBA, HBA, HY-AL, HY-AR
HBA, HBD, HY-AR, HY-AR
HBA, HBA, HY-AL, HY-AR
HBD, HBD, HY-AL, HY-AL
HBA, HBD, HY-AL, HY-AR
HBA, HBD, HY-AL, HY-AR
HBD, HBD, HY-AL, HY-AR
HBA, HBA, HBD, HY-AL
Experimental and estimated IC50 values of the training set compounds based on best pharmacophore hypothesis Hypo1.
Results of Fischer’s randomization test.
The final validation was performed using the leave-one-out method, which is used to verify whether the correlation between the experimental and predicted activities depends mainly on one particular molecule in the training set. This is done by recomputing the pharmacophore model while excluding one molecule at a time. Consequently, 18 HypoGen calculations were carried out under the same conditions used in the generation of the original pharmacophore model, Hypo1, by deriving 18 new training sets, each composed of 17 molecules. The result is positive if none of the correlation coefficients of the newly generated pharmacophore models is much higher or much lower than that of Hypo1. None of the 18 new models generated by this method showed any meaningful difference compared to Hypo1 (data not shown). This result increases the confidence that the correlation coefficient of Hypo1 does not depend on one particular compound in the training set. Based on these validation results, Hypo1 was used as a 3D query in database screening to identify diverse chemical compounds to be utilized in potent renin inhibitor design.
The interactions between the protein and the ligand molecules were examined using DS and Molegro Virtual Docker. The novelty of the two final hit compounds was confirmed using SciFinder and PubChem searches.
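The construction of the 18 reduced training sets can be sketched in a few lines of Python. The sketch only builds the leave-one-out splits and applies a simple stability check on correlation coefficients; the compound names, the function names, and the tolerance are hypothetical, since the text does not quantify what counts as a meaningful difference.

```python
def leave_one_out_sets(compounds):
    """Return (left_out, reduced_set) pairs, one per compound."""
    sets = []
    for i, left_out in enumerate(compounds):
        reduced = compounds[:i] + compounds[i + 1:]
        sets.append((left_out, reduced))
    return sets


def is_stable(reference_r, loo_rs, tolerance):
    """True if every leave-one-out correlation coefficient stays within
    `tolerance` of the reference value (hypothetical acceptance rule)."""
    return all(abs(r - reference_r) <= tolerance for r in loo_rs)


# Hypothetical compound labels for an 18-compound training set.
training_set = [f"cpd{i}" for i in range(1, 19)]
splits = leave_one_out_sets(training_set)
print(len(splits), "sets of", len(splits[0][1]), "compounds")
```

Each reduced set would then be passed to a separate model generation run with unchanged parameters, and the resulting correlation coefficients compared with that of the original model.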
In the present work, a quantitative pharmacophore model, Hypo1, was developed based on training set compounds with high diversity in terms of chemical structures and biological activity values. The best pharmacophore model was selected based on various parameters such as cost difference, correlation coefficient, and validation results. Hypo1 consisted of one HY-AL, one HBD, and two HBA features and had a high correlation coefficient of 0.944. The validation methods included test set prediction, Fischer randomization, and the leave-one-out method. The external test set containing 93 compounds was used to validate the ability of Hypo1 to predict the activities of compounds that were not included in the training set. Hypo1 predicted this test set with a high correlation value of 0.903. The second validation, based on Fischer randomization, showed that Hypo1 was not generated by a chance correlation in the training set. The leave-one-out validation showed that the correlation coefficient of Hypo1 did not depend on one particular compound in the training set. All these validation procedures demonstrated the strength of the selected model, Hypo1, in predicting active compounds. Following validation, Hypo1 was used in database screening to identify hits that can be used in potent renin inhibitor design. The identified hit compounds were further filtered based on their binding mode and molecular interactions at the active site of renin. The final hits reported as potential lead compounds showed high estimated activity, favorable drug-like properties, and strong molecular interactions with the catalytic residues at the active site. The DFT calculations were performed to study the electronic properties of the hit compounds and thereby to validate the quality of the pharmacophore model, Hypo1.
The final hits, HTS05096 and AW00695, showed the smallest energy gaps, which indicates higher reactivity of the hit compounds compared with the most active compounds. This provided confidence in the inhibitory properties of the final hit compounds. Thus, these hits can be utilized in designing a future class of novel renin inhibitors.
This research was supported by the Basic Science Research Program (2009-0073267), the Pioneer Research Center Program (2009-0081539), and the Management of Climate Change Program (2010-0029084) through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (MEST) of the Republic of Korea. This work was also supported by the Next-Generation Bio Green 21 Program (PJ008038) from the Rural Development Administration (RDA) of the Republic of Korea.
This article has been published as part of BMC Bioinformatics Volume 12 Supplement 14, 2011: 22nd International Conference on Genome Informatics: Bioinformatics. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2105/12?issue=S14.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.