Multi-target QSAR modelling in the analysis and design of HIV-HCV co-inhibitors: an in-silico study
 Qi Liu†^{1},
 Han Zhou†^{1},
 Lin Liu^{1},
 Xi Chen^{2},
 Ruixin Zhu^{1} and
 Zhiwei Cao^{1}
DOI: 10.1186/1471-2105-12-294
© Liu et al; licensee BioMed Central Ltd. 2011
Received: 13 December 2010
Accepted: 20 July 2011
Published: 20 July 2011
Abstract
Background
HIV and HCV infections have become leading global public-health threats. More strikingly, HIV-HCV co-infection is rapidly emerging as a major cause of morbidity and mortality throughout the world, owing to the rapid mutation that characterizes both viruses and to their similarly complex effects on the immune system. Although considerable progress has been made in the study of HIV and HCV infection individually, little research has addressed the molecular mechanism of their co-infection or the design of multi-target co-inhibitors that act on the two viruses simultaneously.
Results
In this study, a multi-target Quantitative Structure-Activity Relationship (QSAR) analysis of inhibitors for HIV-HCV co-infection was carried out with an in-silico machine learning technique, multi-task learning, to help guide co-inhibitor design. Firstly, an integrated dataset was compiled from 3 HIV inhibitor subsets, targeting protease, integrase and reverse transcriptase respectively, together with 6 HCV inhibitor subsets targeting 2 enzymes, NS3 serine protease and NS5B polymerase. Secondly, an efficient multi-target QSAR model of HIV-HCV co-inhibitors was built by applying an accelerated gradient method based multi-task learning to all 9 datasets. Furthermore, by solving the L1-infinity regularized optimization, the Drug-like index features used for compound description were ranked according to their joint importance in the multi-target QSAR modelling of HIV and HCV. Finally, a drug structure-activity simulation investigating the relationships between compound structures and binding affinities was performed on the basis of the multi-target analysis, providing several clues for the design of multi-target HIV-HCV co-inhibitors with an increased likelihood of successful therapy for HIV, HCV and HIV-HCV co-infection.
Conclusions
The framework presented in our study provides an efficient way to identify and design inhibitors that bind simultaneously and selectively to multiple targets from multiple viruses with high affinity, and it should inform future work on inhibitor synthesis for multi-target treatment of HIV, HCV and HIV-HCV co-infection.
Background
Human immunodeficiency virus type 1 (HIV-1) is the cause of acquired immunodeficiency syndrome (AIDS) and has infected more than 60 million people around the world [1, 2]. Meanwhile, Hepatitis C virus (HCV), a serious cause of chronic liver disease, has infected 150-200 million people worldwide [3]. HIV and HCV infections have thus become global public-health threats. More strikingly, HIV-HCV co-infection is rapidly emerging as a major cause of morbidity and mortality throughout the world, since the two viruses share the same routes of transmission [3, 4]. HCV infection is the most common co-infection in people with HIV, and hepatitis C is categorized as an HIV-related opportunistic illness. Complications related to HIV-HCV co-infection are becoming an increasingly important medical issue [4].
The current strategies for developing HIV/HCV antiviral agents depend essentially on disrupting the replication of the 2 viruses, and various inhibitors have been designed to target and block the functions of the enzymes necessary for the replication cycle of HIV/HCV. Among them, HIV inhibitors commonly target protease, integrase and reverse transcriptase (RT), while HCV inhibitors target NS5B polymerase and NS3 serine protease [5–18]. These enzymes are considered attractive targets for therapeutic intervention in HIV/HCV infected patients.
For HIV and HCV therapy, a single antiretroviral drug, used alone or in simple combination with others, is no longer recommended for clinical use owing to (1) the complicated infection mechanisms of the two viruses; (2) the severe side effects of combined use; and (3) the rapid emergence of drug-resistant strains after initiation of therapy. Hence, drugs that act on several targets with high therapeutic effect and reduced side effects are expected to be more effective at suppressing viral growth. For HIV, multi-target antiretroviral drugs can inhibit several HIV proteins simultaneously and efficiently. There has been pioneering work in multi-target drug discovery for HIV infection, such as the multi-target antiretroviral drug Cosalane [13], which was developed to inhibit several HIV-1 proteins simultaneously. Compared with HIV, multi-target HCV drug treatment is still in its infancy. Nevertheless, the combined use of single-target HCV drugs has opened a new opportunity in this field; for example, the combination of an NS5B polymerase inhibitor (GS-9190) and an NS3 protease inhibitor (GS-9256) was shown to be safe and well tolerated, with dose-dependent antiviral activity [19, 20].
Since the small-molecule compounds used to design drugs for both HIV and HCV need to be assayed in vitro and in vivo, in-silico Quantitative Structure-Activity Relationship (QSAR) modelling has been applied extensively in HIV/HCV inhibitor studies, owing to its convenient "black-box" character and its good predictive ability. QSAR modelling can be viewed as a computational technique to elucidate a quantitative correlation between chemical structure and biological activity [21]. Recently, a considerable number of QSAR studies have been carried out on HIV/HCV inhibitors [5–18]. However, these studies focused mainly on specific types of targets or on individual diseases. Few studies have addressed multi-target QSAR modelling for HIV-HCV co-infection. Although the ways in which co-infection with HIV and HCV affects the body are still poorly understood, it has been indicated that both HIV-1 protease and HCV NS3 protease are responsible for cleaving the viral polyproteins to produce the individual proteins of the mature viruses. Similarly, HIV-1 reverse transcriptase and HCV NS5B can each be affected either by nucleoside inhibitors that terminate nucleic acid synthesis or by non-nucleoside inhibitors that impair enzymatic function [22, 23]. This evidence indicates that it may be possible to design inhibitors that act on both HIV targets and HCV targets simultaneously. From this point of view, multi-target co-infection QSAR modelling for HIV and HCV is attractive and promising, because it is easy to carry out and is expected to provide useful clues on how to synthesize such co-inhibitors with improved affinities.
In our previous study, we presented a multi-target QSAR model for HIV-1 inhibitors alone [31]. In this study we extend that model to the joint multi-target QSAR modelling of HIV and HCV, with the aim of providing useful clues for the design of HIV-HCV co-inhibitors. The QSAR modelling of HIV-HCV co-infection inhibitors (co-inhibitors for short) was addressed by applying an efficient accelerated gradient method based multi-task learning (MTL) model that we previously presented to the machine learning community [24]. QSAR studies were performed on 9 datasets of HIV and HCV inhibitors. With our MTL framework, the correlations among the different sets of inhibitors were exploited and an efficient multi-target QSAR model of HIV-HCV co-inhibitors was obtained. The Drug-like index (DL) features [26] used for inhibitor description were ranked according to the importance of each descriptor in the QSAR model, and a drug structure-activity simulation was performed to investigate the relationships between compound structures and binding affinities based on the ranked molecular descriptors.
Methods
A Dataset
Dataset descriptions.
Dataset ID | Target type | Number of inhibitors | Activity measurement
1 | HIV-1 Reverse Transcriptase | 79 | EC_{50} [37]
2 | HIV-1 Integrase | 213 | IC_{50} [6]
3 | HIV-1 Protease | 106 | pKi [1]
4 | HCV NS5B Polymerase | 67 | IC_{50} [7]
5 | HCV NS5B Polymerase | 45 | IC_{50} [8]
6 | HCV NS5B Polymerase | 41 | EC_{50} [9]
7 | HCV NS3 Serine Protease | 42 | pKi [10]
8 | HCV NS3 Serine Protease | 53 | pKi [9]
9 | HCV NS3 Serine Protease | 34 | EC_{50} [11]
Similar to our previous study [31], the inhibitors were represented in 2 feature spaces: 32-dimensional General Descriptor (GD) features and 28-dimensional Drug-like index (DL) features. Although there are numerous types of descriptors for describing a chemical compound, no single set of descriptors is guaranteed to perform overwhelmingly better than the others. Therefore, the widely applicable set of descriptors, i.e., the GD [25], was selected, together with the DL descriptors [26, 27] as a complement.
Detailed biological meanings of the GD and DL descriptors can be found in our previous work [31]. It should be noted that: (1) general descriptors characterize physical properties of compounds, while drug-like index descriptors characterize simple topological indices of compounds. Together, the two kinds of descriptors are expected to give a comprehensive description of the compounds in terms of both their intrinsic characteristics and their drug-like properties. (2) The GD descriptors are generated in a hybridized way, so the resulting features no longer retain their original meaning for compound structure description and cannot easily be explained biologically. The DL descriptors, on the other hand, retain their original meanings and are therefore used in the feature ranking and interpretation that follow.
where [L] is the concentration of free radioligand used and K_{D} is its equilibrium dissociation constant for the receptor [29].
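The symbols defined above are those of the Cheng-Prusoff relation, K_i = IC_50 / (1 + [L]/K_D). Assuming this is the conversion the sentence refers to (the equation itself is not reproduced in this text), a minimal Python sketch of turning an IC_50 into a pK_i could look as follows. The function names and the example concentrations are illustrative only and are not taken from the study.

```python
import math


def cheng_prusoff_ki(ic50, ligand_conc, kd):
    """Cheng-Prusoff relation: K_i = IC50 / (1 + [L] / K_D).

    All three arguments must be expressed in the same concentration unit.
    """
    if ic50 <= 0 or kd <= 0 or ligand_conc < 0:
        raise ValueError("concentrations must be positive ([L] may be zero)")
    return ic50 / (1.0 + ligand_conc / kd)


def pki_from_ki(ki_molar):
    """pK_i = -log10(K_i), with K_i given in mol/L."""
    return -math.log10(ki_molar)


# Illustrative numbers: IC50 = 200 nM measured with [L] = K_D = 1 nM.
ki = cheng_prusoff_ki(ic50=2e-7, ligand_conc=1e-9, kd=1e-9)
pki = pki_from_ki(ki)
```

Putting every dataset on one logarithmic scale in this way is what allows activities reported as IC_50 and as pK_i to be compared; whether the same conversion was applied to the EC_50 datasets is not stated here.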
It should be noted that the QSAR data were provided by different research groups working on different platforms and protocols with different activity measurements. A QSAR model built on any single one of these target datasets is often unreliable because of the insufficient number of samples. However, since we want to investigate the multi-target QSAR relationship of HIV-HCV co-infection, these data can be integrated in a multi-target QSAR model that takes advantage of multi-task learning [30], which is expected to exploit possible synergies between the datasets and yield a better QSAR model to guide the synthesis of inhibitors with enhanced activity against HIV and HCV simultaneously. Details are given below.
B Methodology
Computational framework for multi-target modelling
In the current study, a novel accelerated gradient descent based MTL model was applied to multi-target QSAR modelling on all of our integrated datasets simultaneously. Our in-house experiments indicated that this MTL model is more efficient than the one we previously adopted for multi-target QSAR modelling [31] and that it scales well to large-scale QSAR modelling in both convergence speed and learning accuracy. A joint L1-infinity regularization based feature selection procedure was performed on the DL feature space to reveal the features most commonly important across the multi-target HIV-HCV co-infection QSAR models. Based on this model, a drug structure-activity simulation investigating the relationships between compound structures and binding affinities was then carried out to validate the selected features for efficient co-inhibitor design and synthesis.
Multi-task learning for QSAR modelling of HIV-HCV co-inhibitors
Multi-task learning has been developed in machine learning research for situations where multiple related learning tasks are accomplished together. It has been shown to be more effective than learning each task independently when there are explicit or hidden interrelationships among the tasks that can be exploited. The intuition underlying the framework is that multiple related tasks can benefit one another by sharing data and features across tasks, which can often boost the learning performance of each single task [30]. It also provides an efficient mechanism for cross-task feature selection and can thus uncover the common dominant features for all tasks simultaneously. This capability is inherently suited to our multi-target QSAR modelling, in which each single QSAR model can be viewed as a task and the leading features for synthesizing co-inhibitors with improved activity are identified under this scheme.
QSAR modelling is the process by which chemical structure is quantitatively correlated with a well-defined process, such as biological activity or chemical reactivity. It is generally formulated as a regression model [32] that predicts compound activity from a given set of molecular descriptors. Although various statistical and machine learning methods have been proposed for QSAR modelling in the last few years [32], few studies have addressed the multi-target QSAR scenario. In our study, multi-target QSAR modelling is formulated as a multi-task regression framework to reveal useful clues for multi-target drug screening and synthesis for HIV-HCV co-infection.
where z = (x, y, k), W = [w_{1}, w_{2}, ..., w_{M}] ∈ R^{d × M} and W^{j} is the j-th row of W.
The first term in Equation (3) is the average of the empirical error across the tasks. The second term is the L1-infinity regularization term, which performs feature selection in MTL: it yields joint sparsity at both the feature level and the task level and can lead to a sparser solution [24].
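Equation (3) itself is not reproduced in this text. Going only by the two terms just described and by the definitions of W and W^{j}, a plausible form of the objective, offered as an assumption rather than as the authors' exact notation, is:

```latex
\min_{W \in \mathbb{R}^{d \times M}} \;
\frac{1}{M} \sum_{k=1}^{M} \frac{1}{N_k} \sum_{i=1}^{N_k}
  \ell\bigl(w_k;\, x_i^{k},\, y_i^{k}\bigr)
\;+\; \lambda \sum_{j=1}^{d} \bigl\lVert W^{j} \bigr\rVert_{\infty}
```

Here N_k (the number of compounds in task k), \ell (a per-sample loss, for example the squared error for regression) and \lambda > 0 (the regularization weight) are symbols introduced for this sketch. Because \lVert W^{j} \rVert_{\infty} is the largest absolute weight of descriptor j over the M tasks, the penalty drives an entire row of W to zero at once, which is what removes a descriptor from all tasks jointly.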
As the main difficulty in solving the l_{1,∞} regularized formulation (6) lies in the non-smoothness of the l_{1,∞} regularizer, we present an accelerated gradient descent algorithm with convergence rate O(1/t^{2}), a variation of Nesterov's method that calls a black-box oracle in the projection step at each iteration [24]. By exploiting the structure of the l_{1,∞} ball, we find that the black-box oracle can be solved efficiently by a simple sorting procedure. Compared with Nesterov's algorithm, our method is suitable for large-scale multi-task learning problems since it uses only first-order information and is very easy to implement. Experimental results in our previous study showed that our method significantly outperforms state-of-the-art methods in both convergence speed and learning accuracy [24].
where ‖·‖_{F} denotes the Frobenius norm and 〈A, B〉 = Tr(A^{T} B) denotes the matrix inner product.
Algorithm 1: Accelerated Gradient Algorithm
Initialization: L_{0} > 0, η > 1, W_{0} ∈ R^{d × M}, V_{0} = W_{0} and a_{0} = 1.
Iterate for t = 0, 1, 2, ... until convergence:
 1) Set L = L_{t}.
 2) While F(q_{L}(V_{t})) > Q_{L}(q_{L}(V_{t}), V_{t}), set L = ηL.
 3) Set L_{t+1} = L and compute W_{t+1} = q_{L_{t+1}}(V_{t}) together with the updated a_{t+1} and V_{t+1}.
We stop the procedure when κ ≤ τ where τ is a prefixed constant.
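The structure of Algorithm 1 (backtracking on L, a call to the projection oracle, then a momentum step) can be sketched in Python as below. This is a sketch under stated assumptions, not the authors' implementation: the momentum sequence is the standard FISTA one, which may differ from the a_t update of [24]; the stopping rule (a small change in W) stands in for the κ ≤ τ criterion; and the oracle is passed in as a function so that any regularizer can be plugged in. The names `accelerated_gradient`, `mtl_squared_loss` and `identity_oracle` are hypothetical.

```python
import numpy as np


def accelerated_gradient(f_and_grad, oracle, W0, lam,
                         L0=1.0, eta=2.0, max_iter=500, tol=1e-12):
    """FISTA-style accelerated proximal gradient with backtracking.

    f_and_grad(W) returns (value, gradient) of the smooth loss.
    oracle(Z, thresh) returns argmin_W 0.5*||W - Z||_F^2 + thresh*penalty(W).
    """
    W, V = W0.copy(), W0.copy()
    a, L = 1.0, L0
    for _ in range(max_iter):
        f_V, g_V = f_and_grad(V)
        while True:  # step 2: increase L until the quadratic bound holds
            W_new = oracle(V - g_V / L, lam / L)  # q_L(V)
            D = W_new - V
            f_new, _ = f_and_grad(W_new)
            # The penalty appears on both sides of F <= Q_L and cancels.
            if f_new <= f_V + np.sum(g_V * D) + 0.5 * L * np.sum(D * D) + 1e-12:
                break
            L *= eta
        a_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * a * a))  # FISTA momentum (assumed)
        V = W_new + ((a - 1.0) / a_new) * (W_new - W)
        step = np.linalg.norm(W_new - W)
        W, a = W_new, a_new
        if step <= tol:  # assumed stopping rule
            break
    return W


def mtl_squared_loss(tasks):
    """Smooth loss: average over tasks of half the mean squared error."""
    M = len(tasks)

    def f_and_grad(W):
        val, G = 0.0, np.zeros_like(W)
        for k, (X, y) in enumerate(tasks):
            r = X @ W[:, k] - y
            val += 0.5 * (r @ r) / (len(y) * M)
            G[:, k] = X.T @ r / (len(y) * M)
        return val, G

    return f_and_grad


# Synthetic check with two noise-free tasks and no regularization (lam = 0),
# where the oracle is the identity and the true weights should be recovered.
rng = np.random.default_rng(0)
W_true = rng.normal(size=(5, 2))
tasks = []
for k in range(2):
    X = rng.normal(size=(40, 5))
    tasks.append((X, X @ W_true[:, k]))


def identity_oracle(Z, thresh):
    return Z


W_hat = accelerated_gradient(mtl_squared_loss(tasks), identity_oracle,
                             np.zeros((5, 2)), lam=0.0)
```

For the l_{1,∞} penalty the oracle would be the row-wise projection step that the text describes as solvable by sorting; only first-order information (values and gradients of the loss) is used, which is the property the text highlights for large-scale problems.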
where W^{i} and V^{i} denote the i-th rows of the matrices W and V respectively. Therefore, (10) can be decomposed into d separate subproblems of dimension M.
and the vector of dual variables α satisfies the relation α = v − w. Equation (12) can be solved by an efficient projection onto the l_{1} ball according to [33]. With this primal-dual relationship, we present Algorithm 2 for solving (11):
Algorithm 2: Algorithm for projection onto the l_{∞} ball
 1) If ‖v‖_{1} ≤ λ, set w = 0. Return.
 2) Let u_{i} be the absolute value of v_{i}, i.e. u_{i} = |v_{i}|. Sort the vector u in decreasing order: u_{1} ≥ u_{2} ≥ ... ≥ u_{M}.
 3) Find ĵ = max{ j : λ − Σ_{r=1}^{j} (u_{r} − u_{j}) > 0 }.
Output: w_{i} = sign(v_{i}) min(|v_{i}|, (Σ_{r=1}^{ĵ} u_{r} − λ)/ĵ), i = 1, ..., M.
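The sorting procedure of Algorithm 2 can be sketched in Python as follows. The sketch assumes that each row subproblem has the form min_w 0.5‖w − v‖² + λ‖w‖_∞, so that, through the relation α = v − w, the solution is v minus its projection onto the l_1 ball of radius λ. The threshold name `lam` and the function names are assumptions introduced here.

```python
import numpy as np


def prox_linf(v, lam):
    """Return argmin_w 0.5*||w - v||^2 + lam*||w||_inf for a 1-D array v."""
    v = np.asarray(v, dtype=float)
    if lam <= 0:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]      # step 2: |v_i| sorted in decreasing order
    if u.sum() <= lam:                # step 1: v lies inside the l1 ball
        return np.zeros_like(v)
    csum = np.cumsum(u)
    j = np.arange(1, v.size + 1)
    # step 3: largest j with lam - sum_{r<=j} (u_r - u_j) > 0
    j_hat = j[lam - (csum - j * u) > 0][-1]
    theta = (csum[j_hat - 1] - lam) / j_hat   # common clipping level
    return np.sign(v) * np.minimum(np.abs(v), theta)


def prox_l1inf_rows(V, lam):
    """Apply prox_linf independently to every row of a d x M matrix."""
    return np.vstack([prox_linf(row, lam) for row in np.asarray(V, dtype=float)])


w1 = prox_linf([3.0, 1.0, -2.0], 1.0)   # largest entry is clipped to 2
w2 = prox_linf([3.0, 1.0, -2.0], 2.0)   # the two largest are clipped to 1.5
w3 = prox_linf([3.0, 1.0, -2.0], 6.0)   # ||v||_1 <= lam, so w = 0
```

The output clips the largest entries of v to a common level while leaving smaller entries untouched, and a row whose l_1 norm does not exceed the threshold is set to zero as a whole. The cost is dominated by the sort, O(M log M) per row.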
Feature selection across multiple tasks
The value of β_{i} indicates the weight of the corresponding feature, which gives us a quantitative way to evaluate the importance of the various features for HIV-HCV co-inhibitor design and synthesis.
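The definition of β_{i} is not reproduced in this text. One reading that fits the L1-infinity penalty is to score descriptor i by the largest absolute weight in row i of W and to normalize the scores so that they sum to 1. The Python sketch below implements that reading as an assumption; the function name, the descriptor names and the weights are purely illustrative.

```python
import numpy as np


def rank_features(W, names):
    """Rank descriptors by normalized row-wise max |weight| of a d x M matrix W."""
    W = np.asarray(W, dtype=float)
    row_scores = np.max(np.abs(W), axis=1)      # ||W^i||_inf for each descriptor
    total = row_scores.sum()
    beta = row_scores / total if total > 0 else row_scores
    order = np.argsort(-beta, kind="stable")    # most important first
    return [(names[i], float(beta[i])) for i in order]


# Toy weight matrix: 3 descriptors (rows) by 2 tasks (columns).
W_demo = [[0.5, -1.0],
          [0.0, 0.0],
          [0.25, 0.25]]
ranking = rank_features(W_demo, ["desc_a", "desc_b", "desc_c"])
```

A descriptor whose row is entirely zero receives a score of 0, meaning that the sparse model has dropped it for every target at once.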
Domain of applicability of the model
where X_{i} is the row-vector descriptor of the query compound and X is the n × k matrix containing the k descriptor values of the n training samples. The superscript T refers to the transpose of the matrix or vector. Generally, the warning leverage h* is fixed at 3k/n, where n is the number of training compounds and k is the number of descriptors. When a leverage is greater than the warning leverage h*, the predicted activity is the result of substantial extrapolation of the model and may therefore be unreliable.
Based on the definition of leverage, a Williams plot was used in our study to visualize the domain of applicability (DOA) of the QSAR model [35]. The Williams plot shows the standardized cross-validated residuals (RES) versus the leverage values (h), and gives an immediate and simple graphical detection of both the response outliers (Y outliers) and the structurally influential chemicals (X outliers) of a model. Points whose Y-axis values fall outside the 3σ lines (σ being the standard residual unit of the compounds) are regarded as Y outliers, while points whose X-axis values fall beyond the warning leverage h* line are regarded as X outliers.
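The leverage check can be sketched in Python as follows, assuming the usual definition h_i = x_i (X^T X)^{-1} x_i^T, which matches the symbols described above, and the warning leverage h* = 3k/n given in the text. The function names and the synthetic data are illustrative assumptions.

```python
import numpy as np


def leverages(X_train, X_query):
    """h_i = x_i (X^T X)^{-1} x_i^T for each row x_i of X_query."""
    XtX_inv = np.linalg.pinv(X_train.T @ X_train)
    return np.einsum("ij,jk,ik->i", X_query, XtX_inv, X_query)


def warning_leverage(n, k):
    """Warning leverage h* = 3k/n for n training compounds and k descriptors."""
    return 3.0 * k / n


def williams_outliers(h, std_residuals, h_star, sigma_limit=3.0):
    """Flag X outliers (h > h*) and Y outliers (|standardized residual| > 3)."""
    h = np.asarray(h, dtype=float)
    std_residuals = np.asarray(std_residuals, dtype=float)
    return h > h_star, np.abs(std_residuals) > sigma_limit


rng = np.random.default_rng(1)
X_train = rng.normal(size=(30, 3))
h_train = leverages(X_train, X_train)
h_star = warning_leverage(n=30, k=3)                        # 0.3
h_far = leverages(X_train, np.full((1, 3), 100.0))[0]       # far from the training data
x_out, y_out = williams_outliers([0.1, 0.5], [0.2, -3.5], h_star)
```

The training leverages sum to k (the trace of the hat matrix), so their mean is k/n and h* is three times that mean; a query compound far from the training descriptors, such as the last example, receives a leverage well above h*.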
Multi-target HIV-HCV co-inhibitor design based on drug structure-activity prediction
After the feature ranking and the examination of the domain of applicability of the multi-target HIV-HCV QSAR models, a drug structure-activity prediction [27] was performed to analyse the multi-target drug data. The goal of this step is twofold: (1) it computationally validates the ranking obtained by our multi-task feature selection, and (2) it provides several useful modification strategies for further HIV-HCV co-inhibitor design.
Based on this computational prediction pipeline, we identify the most efficient compound modification strategies for improving molecular affinity towards multiple HIV-HCV enzymes. These strategies are then explained by the joint feature ranking obtained under the multi-target QSAR paradigm.
Results and Discussions
Formulated as a multi-task regression problem, the QSAR modelling of HIV-HCV co-inhibitors was performed with the accelerated gradient descent based sparse multi-task learning. Root mean squared error (RMSE) and squared correlation coefficient (R^{2}) were adopted to evaluate performance on the test data. These statistical parameters are defined as follows:
where n is the number of predicted drug molecules and
y_{i} is the observed molecule affinity,
where P^{avg} is the average value over the n predicted molecule affinities.
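The two formulas are not reproduced in this text. Assuming the usual definitions, RMSE = sqrt((1/n) Σ (p_i − y_i)²) and R² taken as the squared Pearson correlation between predicted and observed affinities, a small Python sketch is given below. The symbol p_i for the predicted affinity and the function names are introduced here for illustration.

```python
import numpy as np


def rmse(y_obs, y_pred):
    """Root mean squared error between observed and predicted affinities."""
    y_obs = np.asarray(y_obs, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_obs) ** 2)))


def squared_correlation(y_obs, y_pred):
    """Squared Pearson correlation between observed and predicted affinities."""
    yo = np.asarray(y_obs, dtype=float)
    yp = np.asarray(y_pred, dtype=float)
    yo = yo - yo.mean()          # centre on the observed average
    yp = yp - yp.mean()          # centre on the predicted average (P^avg)
    denom = float((yo @ yo) * (yp @ yp))
    if denom == 0.0:
        raise ValueError("correlation is undefined for constant inputs")
    return float((yo @ yp) ** 2 / denom)


y_obs = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.5, 3.5, 4.5]
err = rmse(y_obs, y_pred)                 # constant offset of 0.5
r2 = squared_correlation(y_obs, y_pred)   # perfectly correlated
```

The example shows why both measures are reported: a constant offset leaves R² at 1 while the RMSE stays at 0.5, so a high correlation alone does not imply accurate affinities.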
For both the GD and DL feature spaces, multi-task learning is superior to single-task learning for QSAR modelling on most target datasets in terms of RMSE and R^{2}, with significant statistical confidence (data not shown). The average correlation coefficient for prediction under MTL reaches 0.6 to 0.7, which is a well-accepted level for QSAR results. These results suggest that multi-task learning can discover latent commonalities across different types of inhibitors and take advantage of the synergy among multiple tasks when the labelled data for each single task are insufficient. They also indicate that multi-task learning provides an effective way to boost the learning performance of each single task by exploiting the available synergy between tasks, and thus serves as an efficient paradigm for multi-target QSAR modelling.
After building the QSAR model, the weights of the DL features for MTL were calculated on the 3 HIV datasets, on the 6 HCV datasets and on all 9 datasets respectively. Sparse MTL in this case was trained with 50% of the data for each task and tested on the remaining data. The feature rankings are shown in Figures 8, 9 and 10. It should be noted that the GD feature space was not used for feature ranking because its features map only indirectly to biological meanings.
A joint feature ranking of the DL compound descriptors.
Features | Ranking values
# of non-H | 0.167148
# of 2-degree cyclic atoms | 0.118946
degree of cyclization | 0.100268
# of non-H polar bonds | 0.056622
# of rotatable bonds | 0.049770
# of carbons in cap fragments | 0.047428
# of cap fragments | 0.043447
# of 3-degree cyclic atoms | 0.040664
# of N and O atoms | 0.038882
# of H-bond acceptors | 0.038103
# of fragments | 0.033010
maximum cap fragment size | 0.032459
# of 2-degree acyclic atoms | 0.027549
# of 3-degree acyclic atoms | 0.021673
# of 3-level bonding patterns | 0.018798
total SSSR size | 0.017162
total number of 3-8 membered rings | 0.017162
# of cyclic fragments | 0.016538
# of 1-level bonding patterns | 0.016382
# of H-bond donors | 0.016217
total number of 3 to 8 unsaturated rings | 0.016072
# of aromatic systems | 0.014930
# of N with # of H > 0 | 0.012320
# of hydroxyl groups | 0.009301
maximum SSSR size | 0.008863
# of linkers | 0.008807
# of 2-level bonding patterns | 0.006729
total number of 3 to 8 saturated rings | 0.004750
More # of non-H, # of non-H polar bonds and # of rotatable bonds could increase the potency of HIV Protease Inhibitors.
Precursor | Structure | pKi
#17 pKi: 9.28 |  | +2.11% (× 25.3)
#17 pKi: 9.28 |  | +2.04% (× 25.3)
#17 pKi: 9.28 |  | +1.95% (× 25.2)
More # of 2-degree cyclic atoms, degree of cyclization and # of non-H polar bonds could increase the potency of HIV Reverse Transcriptase Inhibitors.
Precursor | Structure | log(1/EC50)
#5 log(1/EC50): 8.3 |  | +3.88% (× 27.2)
#5 log(1/EC50): 8.3 |  | +3.77% (× 26.8)
#5 log(1/EC50): 8.3 |  | +3.39% (× 26.4)
#5 log(1/EC50): 8.3 |  | +3.39% (× 26.3)
#5 log(1/EC50): 8.3 |  | +3.36% (× 26.3)
#5 log(1/EC50): 8.3 |  | +3.31% (× 26.6)
#5 log(1/EC50): 8.3 |  | +3.32% (× 26.3)
More # of non-H, # of 2-degree cyclic atoms, degree of cyclization and # of non-H polar bonds could increase the potency of HIV Integrase Inhibitors.
Precursor | Structure | pIC50
#11 pIC50: 5.82 |  | +0.75% (× 21.2)
#189 pIC50: 5.53 |  | +0.74% (× 21.2)
#188 pIC50: 4.43 |  | +0.73% (× 21.3)
More # of non-H, # of non-H polar bonds and # of rotatable bonds could increase the potency of HCV NS5B Inhibitors.
Precursor | Structure | IC50(uM)NS5B
#15 IC50(uM)NS5B: 1.64 |  | +2.51% (× 25.2)
#15 IC50(uM)NS5B: 1.64 |  | +2.18% (× 24.5)
#15 IC50(uM)NS5B: 1.64 |  | +1.63% (× 27.5)
#15 IC50(uM)NS5B: 1.64 |  | +1.19% (× 24.1)
#15 IC50(uM)NS5B: 1.64 |  | +0.98% (× 24.1)
#15 IC50(uM)NS5B: 1.64 |  | +0.91% (× 24.3)
#15 IC50(uM)NS5B: 1.64 |  | +0.79% (× 24.2)
More # of non-H, # of 2-degree cyclic atoms, degree of cyclization and # of non-H polar bonds could increase the potency of HCV NS3 Inhibitors.
Precursor | Structure | EC50(uM)NS3
#1 EC50(uM)NS32: 0.35 |  | +1.94% (× 21.3)
#1 EC50(uM)NS32: 0.35 |  | +0.55% (× 17.5)
#1 EC50(uM)NS32: 0.35 |  | +0.50% (× 17.5)
#1 EC50(uM)NS32: 0.35 |  | +0.50% (× 17.5)
#1 EC50(uM)NS32: 0.35 |  | +0.37% (× 17.5)
#1 EC50(uM)NS32: 0.35 |  | +0.37% (× 17.5)
#1 EC50(uM)NS32: 0.35 |  | +0.37% (× 17.5)
#1 EC50(uM)NS32: 0.35 |  | +0.33% (× 17.5)
#1 EC50(uM)NS32: 0.35 |  | +0.23% (× 17.5)
It should be noted that, according to our MTL model, these features are commonly important across the multiple scaffolds that each inhibit an individual target. They can therefore be integrated to guide the synthesis of a single scaffold against multiple targets, so that one individual compound may retain co-inhibitory ability against the targets of both viruses.
Conclusions
A multi-target computational screening of HIV-HCV co-inhibitors with an MTL paradigm was carried out in our study. Compared with our previous work [31], it is improved mainly in two aspects: (1) it integrates both HIV and HCV data sources to significantly enhance the identification of lead inhibitors for HIV-HCV co-inhibitor drug development; (2) a novel accelerated gradient descent based MTL model was incorporated into the multi-target QSAR modelling, with greater efficiency in both convergence speed and learning accuracy. In summary, the computational pipeline presented here provides an efficient way to identify and design inhibitors that bind simultaneously and selectively to multiple targets from multiple viruses with high affinity.
Future research on multi-target QSAR analysis could address the compound description issue with more kinds of feature descriptors. The integration and fusion of multi-view feature spaces in compound representation could also be investigated. Recently developed transfer learning techniques [36] from the machine learning community may help to handle such cases efficiently. Furthermore, the underlying mechanisms of HIV-HCV co-infection, as well as the synthesis of co-inhibitors based on our study, are well worth long-term pursuit.
Declarations
Acknowledgements
The authors would like to thank Dr. Qiang Yang at Hong Kong University of Science and Technology for their suggestions. This work was supported in part by Project White Magnolia Funding, Shanghai (Grant No. 2010B127), Tongji Excellent Young Scientist Funding (Grant No. 2000219052), National Natural Science Foundation of China (Grant No. 30976611) and Research Fund for the Doctoral Program of Higher Education of China (Grant No. 20100072120050).
References
 Jorissen RN, Reddy G, Ali A, Altman MD, Chellappan S, Anjum SG, Tidor B, Schiffer CA, Rana TM, Gilson MK: Additivity in the analysis and design of HIV protease inhibitors. J Med Chem 2009, 52(3):737–754. 10.1021/jm8009525
 Johnston MI, Hoth DF: Present status and future prospects for HIV therapies. Science 1993, 260(5112):1286. 10.1126/science.7684163
 Maier I, Wu GY: Hepatitis C and HIV co-infection: a review. Transfusion 1992, 7(20):2.
 Rockstroh JK, Spengler U: HIV and hepatitis C virus co-infection. The Lancet Infectious Diseases 2004, 4(7):437–444. 10.1016/S1473-3099(04)01059-X
 Solov'ev VP, Varnek A: Anti-HIV Activity of HEPT, TIBO, and Cyclic Urea Derivatives: Structure-Property Studies, Focused Combinatorial Library Generation, and Hits Selection Using Substructural Molecular Fragments Method. J Chem Inf Comput Sci 2003, 43(5):1703–1719.
 Iyer M, Hopfinger AJ: Treating Chemical Diversity in QSAR Analysis: Modeling Diverse HIV-1 Integrase Inhibitors Using 4D Fingerprints. J Chem Inf Model 2007, 47(5):1945–1960. 10.1021/ci700153g
 Patel PD, Patel MR, Kaushik-Basu N, Talele TT: 3D QSAR and molecular docking studies of benzimidazole derivatives as hepatitis C virus NS5B polymerase inhibitors. J Chem Inf Model 2008, 48(1):42–55. 10.1021/ci700266z
 Nittoli T, Curran K, Insaf S, DiGrandi M, Orlowski M, Chopra R, Agarwal A, Howe AYM, Prashad A, Floyd MB: Identification of anthranilic acid derivatives as a novel class of allosteric inhibitors of hepatitis C NS5B polymerase. J Med Chem 2007, 50(9):2108–2116. 10.1021/jm061428x
 Harper S, Avolio S, Pacini B, Di Filippo M, Altamura S, Tomei L, Paonessa G, Di Marco S, Carfi A, Giuliano C: Potent inhibitors of subgenomic hepatitis C virus RNA replication through optimization of indole-N-acetamide allosteric inhibitors of the viral NS5B polymerase. J Med Chem 2005, 48(14):4547–4557. 10.1021/jm050056+
 Chen KX, Nair L, Vibulbhan B, Yang W, Arasappan A, Bogen SL, Venkatraman S, Bennett F, Pan W, Blackman ML: Second-Generation Highly Potent and Selective Inhibitors of the Hepatitis C Virus NS3 Serine Protease. J Med Chem 2009, 52(5):1370–1379. 10.1021/jm801238q
 Arasappan A, Padilla AI, Jao E, Bennett F, Bogen SL, Chen KX, Pike RE, Sannigrahi M, Soares J, Venkatraman S: Toward second generation hepatitis C virus NS3 serine protease inhibitors: discovery of novel P4 modified analogues with improved potency and pharmacokinetic profile. Journal of medicinal chemistry 2009, 52(9):2806–2817. 10.1021/jm801616e
 Bogen SL, Pan W, Ruan S, Nair LG, Arasappan A, Bennett F, Chen KX, Jao E, Venkatraman S, Vibulbhan B: Toward the Back-Up of Boceprevir (SCH 503034): Discovery of New Extended P4-Capped Ketoamide Inhibitors of Hepatitis C Virus NS3 Serine Protease with Improved Potency and Pharmacokinetic Profiles. Journal of medicinal chemistry 2009, 52(12):3679–3688. 10.1021/jm801632a
 Jenwitheesuk E, Horst JA, Rivas KL, Van Voorhis WC, Samudrala R: Novel paradigms for drug discovery: computational multi-target screening. Trends in pharmacological sciences 2008, 29(2):62–71. 10.1016/j.tips.2007.11.007
 Pauwels R: New non-nucleoside reverse transcriptase inhibitors (NNRTIs) in development for the treatment of HIV infections. Current Opinion in Pharmacology 2004, 4(5):437–446. 10.1016/j.coph.2004.07.005
 Reddy G, Ali A, Nalam MNL, Anjum SG, Cao H, Nathans RS, Schiffer CA, Rana TM: Design and Synthesis of HIV-1 Protease Inhibitors Incorporating Oxazolidinones as P2/P2' Ligands in Pseudosymmetric Dipeptide Isosteres. J Med Chem 2007, 50(18):4316–4328. 10.1021/jm070284z
 Prabu-Jeyabalan M, Nalivaika E, Schiffer CA: Substrate Shape Determines Specificity of Recognition for HIV-1 Protease: Analysis of Crystal Structures of Six Substrate Complexes. Structure 2002, 10(3):369–381. 10.1016/S0969-2126(02)00720-7
 Chellappan S, Kairys V, Fernandes MX, Schiffer C, Gilson MK: Evaluation of the substrate envelope hypothesis for inhibitors of HIV-1 protease. Proteins: Structure, Function, and Bioinformatics 2007, 68(2):561–567. 10.1002/prot.21431
 Schultz AK, Zhang M, Leitner T, Kuiken C, Korber B, Morgenstern B, Stanke M: A jumping profile Hidden Markov Model and applications to recombination sites in HIV and HCV genomes. BMC bioinformatics 2006, 7(1):265. 10.1186/1471-2105-7-265
 Tan SL, Pause A, Shi Y, Sonenberg N: Hepatitis C therapeutics: current status and emerging strategies. Nature Reviews Drug Discovery 2002, 1(11):867–881. 10.1038/nrd937
 Franciscus A: Hepatitis C treatments in current clinical development.
 Leach AR: Molecular modelling: principles and applications. Addison-Wesley Longman Ltd; 2001.
 Tomasselli AG, Heinrikson RL: Targeting the HIV-protease in AIDS therapy: a current clinical perspective. Biochimica et Biophysica Acta (BBA) - Protein Structure and Molecular Enzymology 2000, 1477(1–2):189–214. 10.1016/S0167-4838(99)00273-3
 Seden K, Back D, Khoo S: New directly acting antivirals for hepatitis C: potential for interaction with antiretrovirals. Journal of Antimicrobial Chemotherapy 65(6):1079.
 Chen X, Pan W, Kwok JT, Carbonell JG: Accelerated gradient method for multi-task sparse learning problem. IEEE 2009, 746–751.
 Labute P: A widely applicable set of descriptors. Journal of Molecular Graphics and Modelling 2000, 18(4–5):464–477. 10.1016/S1093-3263(00)00068-1
 Xu J, Stevenson J: Drug-like index: a new approach to measure drug-like compounds and their diversity. J Chem Inf Comput Sci 2000, 40(5):1177–1187.
 Vilar S, Cozza G, Moro S: Medicinal chemistry and the molecular operating environment (MOE): application of QSAR and molecular docking to drug discovery. Current Topics in Medicinal Chemistry 2008, 8(18):1555–1572. 10.2174/156802608786786624
 Cortes A, Cascante M, Cardenas ML, Cornish-Bowden A: Relationships between inhibition constants, inhibitor concentrations for 50% inhibition and types of inhibition: new ways of analysing data. Biochemical Journal 2001, 357(Pt 1):263.
 Wu G, Yuan Y, Hodge CN: Determining appropriate substrate conversion for enzymatic assays in high-throughput screening. Journal of Biomolecular Screening 2003, 8(6):694. 10.1177/1087057103260050
 Liu Q, Xu Q, Zheng VW, Xue H, Cao Z, Yang Q: Multi-task learning for cross-platform siRNA efficacy prediction: an in-silico study. BMC bioinformatics 11(1):181.
 Liu Q, Che D, Huang Q, Cao Z, Zhu R: Multi-target QSAR Study in the Analysis and Design of HIV-1 Inhibitors. Chinese Journal of Chemistry 28(9):1587–1592.
 Selassie C, Verma RP: History of quantitative structure-activity relationships. In Burger's Medicinal Chemistry and Drug Discovery. Volume 1. sixth edition. Drug Discovery, John Wiley & Sons, Inc; 2003.
 Duchi J, Shalev-Shwartz S, Singer Y, Chandra T: Efficient projections onto the l1-ball for learning in high dimensions. ACM 2008, 272–279.
 Weaver S, Gleeson MP: The importance of the domain of applicability in QSAR modeling. J Mol Graph Model 2008, 26(8):1315–1326. 10.1016/j.jmgm.2008.01.002
 Tropsha A, Gramatica P, Gombar VK: The Importance of Being Earnest: Validation is the Absolute Essential for Successful Application and Interpretation of QSPR Models. QSAR & Combinatorial Science 2003, 22(1):69–77. 10.1002/qsar.200390007
 Pan SJ, Yang Q: A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering 2009.
 Darnag R, Schmitzer A, Belmiloud Y, Villemin D, Jarid A, Chait A, Seyagh M, Cherqaoui D: QSAR Studies of HEPT Derivatives Using Support Vector Machines. QSAR & Combinatorial Science 2009, 28(6–7):709–718. 10.1002/qsar.200810166
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.