SHIVA - a web application for drug resistance and tropism testing in HIV

Riemenschneider, Mona; Hummel, Thomas; Heider, Dominik

doi:10.1186/s12859-016-1179-2

Software
Open access
Published: 22 August 2016

SHIVA - a web application for drug resistance and tropism testing in HIV

Mona Riemenschneider^1,2,
Thomas Hummel^1,2 &
Dominik Heider^1,2,3

BMC Bioinformatics volume 17, Article number: 314 (2016) Cite this article

2176 Accesses
19 Citations
12 Altmetric
Metrics details

Abstract

Background

Drug resistance testing is mandatory in antiretroviral therapy in human immunodeficiency virus (HIV) infected patients for successful treatment. The emergence of resistances against antiretroviral agents remains the major obstacle in inhibition of viral replication and thus to control infection. Due to the high mutation rate the virus is able to adapt rapidly under drug pressure leading to the evolution of resistant variants and finally to therapy failure.

Results

We developed a web service for drug resistance prediction of commonly used drugs in antiretroviral therapy, i.e., protease inhibitors (PIs), reverse transcriptase inhibitors (NRTIs and NNRTIs), and integrase inhibitors (INIs), but also for the novel drug class of maturation inhibitors. Furthermore, co-receptor tropism (CCR5 or CXCR4) can be predicted as well, which is essential for treatment with entry inhibitors, such as Maraviroc. Currently, SHIVA provides 24 prediction models for several drug classes. SHIVA can be used with single RNA/DNA or amino acid sequences, but also with large amounts of next-generation sequencing data and allows prediction of a user specified selection of drugs simultaneously. Prediction results are provided as clinical reports which are sent via email to the user.

Conclusions

SHIVA represents a novel high performing alternative for hitherto developed drug resistance testing approaches able to process data derived from next-generation sequencing technologies. SHIVA is publicly available via a user-friendly web interface.

Background

Successful antiretroviral therapy (ART) in HIV infected patients strongly depends on the effective suppression of viral replication. Although a broad range of antiretroviral drugs have been approved by the FDA, long-lasting reduction of virus load is still challenging today. The high mutation rate of the virus under drug pressure remains the major concern in ART. In general, three drugs of different drug classes, e.g., protease inhibitors (PIs), reverse transcriptase inhibitors (NRTIs and NNRTIs), or entry inhibitors, are administered to the patient in a combined therapy. However, as an adaption process the virus is able to evade ART regimes finally leading to therapy failure. Therefore, predicting drug resistances or co-receptor usage, respectively, is an essential procedure in the efficient suppression of virus load and though in prolonging a patient’s life. Overall, six drug classes are at present available that target diverse stages within the replication cycle of HI viruses. Nucleotide reverse transcriptase inhibitors (NRTIs) and non-nucleoside reverse transcriptase inhibitors (NNRTIs) are able to block the activity of the reverse transcriptase therewith disrupting the building process of viral DNA, which finally leads to the inhibition of the virus’ life cycle. Protease inhibitors (PIs) suppress the function of the protease enzyme, which is responsible for cutting the gag polypeptide into functional proteins. Integrase inhibitors (INIs) prevent the insertion of viral DNA into the host cell by inhibiting the viral integrase. Another prevention of virus replication can be achieved by inhibiting the entry of viruses into host cells. In general, the HI virus binds to the CD4 receptor of CD4 expressing cells, and additionally, to one of two co-receptors, either CXCR4 or CCR5. The engagement is crucial to activate the fusion of virus and host cell [1]. Entry-inhibitors, such as Maraviroc [2], are able to block the co-receptor binding to CCR5 and therefore inhibit virus replication. However, before administering entry-inhibitors to the patient it is necessary to predict co-receptor usage of the viral population, as entry-inhibitors are only effective against CCR5-tropic viruses. A rather new antiretroviral drug, Bevirimat (BVM) [3], has been evaluated for HIV therapy, albeit not yet approved by the FDA for clinical application. BVM is a maturation inhibitor that inhibits the maturation of HIV particles to infectious virions by preventing the final cleavage of the precursor protein p25 to p24 and p2. However, the emergence of drug resistances is the major impediment in effective ART. Thus, sophisticated computational algorithms have been developed to predict drug resistances on viral sequences. For example, geno2pheno [4, 5], HIVdb [6], and WebPSSM [7] are the most popular tools in resistance testing and co-receptor tropism prediction. However, there is still the great need to further improve prediction algorithms for HIV drug resistance and tropism prediction. For example, Dybowski et al. [8] have shown the potential of random forest [9] models on HIV tropism prediction, outperforming prediction performances of geno2pheno and other methods.

We developed a web service for HIV drug resistance prediction incorporating models for resistance testing of PIs, NNRTIs, NRTIs, INIs, BVM, as well as co-receptor tropism prediction. The algorithm design is able to handle up to several million sequences therefore allowing queries with next-generation sequencing (NGS) data [10]. Prediction results are provided as clinical reports presenting the results in a comprehensible and clearly presented way and sent via email to the user. Additionally, a data file containing resistance information for each sequence is generated thus providing also detailed information for all sequences within a patient. The graphical user interface makes the application handy for researchers as well as for clinicians. The application SHIVA is available as a web interface with public access at http://shiva.heiderlab.de.

Implementation

The web application is based on the JAVA framework Vaadin (https://vaadin.com) and can be accessed via a web browser. Drug resistance models that are incorporated into and provided by SHIVA were implemented in R (http://www.r-project.org), thus requiring communication and data transfer between JAVA and R, which is handled using the Rserve package. The analytical report generation is conducted using the JasperReports library. For processing biological data, e.g., for translating RNA sequences into protein sequences, the BioJava library [11] is used.

The models that were incorporated into SHIVA so far provide predictions for six antiretroviral drug classes comprising the following 23 drugs: PIs (Amprenavir, Atazanavir, Indinavir, Lopinavir, Ritonavir, Saquinavir, Nelfinavir, Darunavir), NRTIs (Lamivudine, Abacavir, Zidovudine, Didanosine, Tenofovir, Stavudine, Emtricitabine), NNRTIs (Efavirenz, Nevirapine, Delavirdine, Rilpivirine), INIs (Dolutegravir, Elvitegravir, Raltegravir), and the maturation inhibitor Bevirimat. Furthermore, a prediction model for co-receptor tropism prediction has been incorporated. For most of the drugs (except Darunavir, Emtricitabine, and the INIs) resistance prediction is performed by random forests models developed in our group recently [12, 13]. These models are based on independent binary classification for each drug, predicting sequences as resistant or susceptible using a drug-specific cutoff (see [12, 13]). The random forests have been trained with 500 trees and have been evaluated with a leave-one-out cross-validation scheme.

Random forests [9] are ensemble learning methods for classification and regression, consisting of multiple decision trees, which predict the drug resistance for a given sequence independently of each other. The predictions are then combined via majority voting to get a final decision, i.e., resistant or susceptible.

For feature representation, sequences were translated to numerical values using the hydrophobicity scores according to Kyte and Doolittle [14]. Due to varieties in sequence lengths, an interpolation of input sequences was conducted with Interpol [15] to vectors of lengths 99 for PIs and 240 for NNRTIs and NRTIs, respectively. Feature length of p2 sequences for BVM prediction was interpolated to 20. For the prediction of co-receptor usage, we incorporated TCUP 2.0 [16] into SHIVA, which is based on a stacking approach of two random forest models. One random forest model is trained on the numerical representation of the V3 loop using hydrophobicity values, the other one is trained on the electrostatic potential of the V3 loop. Independent predictions are then combined via stacking to get the final prediction result for each sequence. Besides using data-driven approaches for drug resistance, we also employed knowledge-based approaches for the INIs, Emtricitabine, and Darunavir based on the 2015 edition of the IAS-USA drug resistance mutations list [17].

Results and discussion

The web application SHIVA (Fig. 1) has been developed to provide researchers and clinicians an easy-to-use interface to our recently developed drug resistance and tropism prediction models. At the moment, there are drug resistance models for 23 drugs (PIs, NRTIs, NNRTIs, INIs, and BVM) available. Moreover, it can be used to predict co-receptor tropism based on V3 loop sequence data. Data input must be provided as RNA/DNA or protein sequences in FASTA format. RNA/DNA sequences are translated into amino acid representations. Additionally, sequences are checked for input format and illegal characters. To provide the appropriate input format for the prediction models, the protein sequences are encoded with hydrophobicity values. For co-receptor usage prediction, the electrostatic potentials of the V3 loop sequences are calculated by TCUP 2.0 [16]. For each prediction query the false positive rate can be controlled by setting a cut-off for the specificity. The default value is 0.95 for drug resistance predictions and 0.98 for co-receptor determination. Figure 2 demonstrates the resistance testing workflow. The results of the predictions are provided as clinical reports in the web browser and can be optionally sent via email to the user. These reports contain the following information:

1.
user information, such as user ID, a patient’s ID (if provided), and date of query,
2.
information about input data, i.e., number of sequences, the type of sequence (RNA/DNA or protein), and the list of sequences (only when a small set of sequences has been queried),
3.
prediction results, i.e., the prediction results for queried drugs, the selected specificity cut-off and a visualization for easy interpretation of prediction results (see Fig. 3).

The clinical report shows the fraction of resistant sequences for a given patient, which can be used to detect resistant minority variants. Moreover, the report gives information about the drug model used and its general sensitivity for the selected specificity cut-off. For instance, as shown in Fig. 3, the sensitivity of the model predicting Ritonavir resistance is 94.02 % for the selected specificity of 95.0 %, and the fraction of resistant sequences for the given patient is 48.61 %.

A comparison between the prediction servers SHIVA, geno2pheno (resistance and coreceptor), HIVdb, and WebPSSM can be found in Table 1. We compared the different drug classes that can be predicted, the maximum number of sequences that can be uploaded, NGS compatability, run time (averaged over 10 runs), whether the server provides a clinical report, and whether the server provides detailed access to the predicted data. In contrast to geno2pheno, HIVdb, and WebPSSM, SHIVA is able to predict NGS data directly. In our recent publications, we already demonstrated the high accuracy of our drug resistance and tropism prediction models compared to other state-of-the-art models, e.g., geno2pheno [4, 5] or HIVdb [6]. Thus, we restricted the comparison of the web-services to features of usability and applicability.

Table 1 Comparison between different prediction servers

Full size table

geno2pheno _resistance and geno2pheno _coreceptor are only able to predict up to 8 and 50 sequences, respectively, while HIVdb and WebPSSM are restricted to 500 sequences. For co-receptor prediction based on NGS data generated with 454 pyrosequencing, geno2pheno₄₅₄ can be used as well, however the preprocessing of the data needs to be done offline. There are also differences in run times for the prediction of 8 protease and 50 V3 sequences, respectively. It turned out that HIVdb is the fastest tool, followed by SHIVA with 2.89 and 6.02 seconds for the prediction of 8 protease sequences, respectively. In contrast, geno2pheno _resistance needs 24.37 seconds. For the prediction of co-receptor tropism, SHIVA is slower than geno2pheno _coreceptor and WebPSSM, which is mainly due to the internal 3D-modeling process in TCUP 2.0 [16]. Except WebPSSM, all other servers provide a clinical report that can be used by the clinicans, however, the HIVdb report is not very intuitively and thus only of limited use. One major drawback of geno2pheno compared to the other servers is the lack of detailed data access, which is in particular important for large amounts of data.

Conclusion

SHIVA represents a novel high performing alternative for hitherto developed drug resistance testing approaches. SHIVA allows the processing of large amounts of data derived from high-throughput technologies [18]. Moreover, SHIVA is platform independent, easy to use and publicly available. In future, additional prediction models that are based on multi-label classification techniques and structural descriptors will be incorporated. Recent studies have demonstrated that such approaches have great potential to further improve drug resistance predictions [19, 20]. Moreover, we will incorporate GPU-based implementations [21] of our models in the near future to speed up the prediction of large data sets.

Availability and requirements

Project name: SHIVA
Project home page: http://shiva.heiderlab.de
Operating system(s): Platform independent
Programming language: Java, R
Other requirements: Javascript
License: GNU LGPL
Any restrictions to use by non-academics: no licence needed

Abbreviations

ART:: Antiretroviral therapy
BVM:: Bevirimat
DNA:: Desoxyribonucleic acid
GPU:: Graphics processor unit
HIV:: Human immunodeficiency virus
ID:: identifier
INI:: Integrase inhibitor
NGS:: Next-generation sequencing
NRTI:: Nucleotide reverse transcriptase inhibitor
NNRTI:: Non-nucleoside reverse transcriptase inhibitor
PI:: Protease inhibitor
RNA:: Ribonucleic acid
RTV:: Ritonavir

References

Deng H, Liu R, Ellmeier W, Choe S, Unutmaz D, Burkhart M, Marzio PD, Marmon S, Sutton RE, Hill CM, Davis CB, Peiper SC, Schall TJ, Littman DR, Landau NR. Identification of a major co-receptor for primary isolates of HIV-1. Nature. 1996; 381(6584):661–6.
Article CAS PubMed Google Scholar
Dorr P, Westby M, Dobbs S, Griffin P, Irvine B, Macartney M, Mori J, Rickett G, Smith-Burchnell C, Napier C, Webster R, Armour D, Price D, Stammen B, Wood A, Perros M. Maraviroc (UK-427,857), a potent, orally bioavailable, and selective small-molecule inhibitor of chemokine receptor CCR5 with broad-spectrum anti-human immunodeficiency virus type 1 activity. Antimicrob Agents Chemother. 2005; 49(11):4721–32.
Article CAS PubMed PubMed Central Google Scholar
Salzwedel K, Martin DE, Sakalian M. Maturation inhibitors: a new therapeutic class targets the virus structure. AIDS Rev. 2007; 9(3):162–72.
PubMed Google Scholar
Beerenwinkel N, Schmidt B, Walter H, Kaiser R, Lengauer T, Hoffmann D, Korn K, Selbig J. Diversity and complexity of HIV-1 drug resistance: A bioinformatics approach to predicting phenotype from genotype. Proc Nat Acad Sci USA. 2002; 99(12):85,8271–6.
Article Google Scholar
Lengauer T, Sander O, Sierra S, Thielen A, Kaiser R. Bioinformatics prediction of HIV coreceptor usage. Nat Biotechnol. 2007; 25(12):1407–10.
Article CAS PubMed Google Scholar
Liu TF, Shafer RW. Web resources for HIV type 1 genotypic-resistance test interpretation. Clin Infect Dis. 2006; 42(11):1608–18.
Article CAS PubMed PubMed Central Google Scholar
Jensen MA, Li FS, van ’t Wout AB, Nickle DC, Shriner D, He HX, McLaughlin S, Shankarappa R, Margolick JB, Mullins JI. Improved coreceptor usage prediction and genotypic monitoring of R5-to-X4 transition by motif analysis of human immunodeficiency virus type 1 env V3 loop sequences. J Virol. 2003; 77(24):13376–88.
Article CAS PubMed PubMed Central Google Scholar
Dybowski JN, Heider D, Hoffmann D. Prediction of co-receptor usage of HIV-1 from genotype. PLoS Comput Biol. 2010; 6(4):1000743.
Article Google Scholar
Breiman L. Random forests. Mach Learn. 2001; 45:5–32.
Article Google Scholar
Dybowski JN, Heider D, Hoffmann D. Structure of HIV-1 quasi-species as early indicator for switches of co-receptor tropism. AIDS Res Ther. 2010; 7:41.
Article CAS PubMed PubMed Central Google Scholar
Holland RC, Down TA, Pocock M, Prlic A, Huen D, James K, Foisy S, Drager A, Yates A, Heuer M, Schreiber MJ. BioJava: an open-source framework for bioinformatics. Bioinformatics. 2008; 24(18):2096–7.
Article CAS PubMed PubMed Central Google Scholar
Heider D, Verheyen J, Hoffmann D. Machine learning on normalized protein sequences. BMC Res Notes. 2011; 4:94.
Article CAS PubMed PubMed Central Google Scholar
Heider D, Verheyen J, Hoffmann D. Predicting Bevirimat resistance of HIV-1 from genotype. BMC Bioinformatics. 2010; 11:37.
Article PubMed PubMed Central Google Scholar
Kyte J, Doolittle R. A simple method for displaying the hydropathic character of a protein. J Mol Biol. 1982; 157:105–32.
Article CAS PubMed Google Scholar
Heider D, Hoffmann D. Interpol: An R package for preprocessing of protein sequences. BioData Min. 2011; 4:16.
Article CAS PubMed PubMed Central Google Scholar
Heider D, Dybowski JN, Wilms C, Hoffmann D. A simple structure-based model for the prediction of HIV-1 co-receptor tropism. BioData Min. 2014; 7:14.
Article PubMed PubMed Central Google Scholar
Wensing AM, Calvez V, Günthard HF, Johnson VA, Paredes R, Pillay D, Shafer RW, Richman DD. 2015 update of the drug resistance mutations in hiv-1. Top Antivir Med. 2015; 23(4):132–41.
PubMed Google Scholar
Ramos RT, Carneiro AR, Baumbach J, Azevedo V, Schneider MP, Silva A. Analysis of quality raw data of second generation sequencers with Quality Assessment Software. BMC Res Notes. 2011; 4:130.
Article PubMed PubMed Central Google Scholar
Dybowski JN, Riemenschneider M, Hauke S, Pyka M, Verheyen J, Hoffmann D, Heider D. Improved bevirimat resistance prediction by combination of structural and sequence-based classifiers. BioData Min. 2011; 4:26.
Article CAS PubMed PubMed Central Google Scholar
Heider D, Senge R, Cheng W, Hüllermeier E. Multilabel classification for exploiting cross-resistance information in HIV-1 drug resistance prediction. Bioinformatics. 2013; 29(16):1946–52.
Article CAS PubMed Google Scholar
Olejnik M, Steuwer M, Gorlatch S, Heider D. gCUP: rapid GPU-based HIV-1 co-receptor usage prediction for next-generation sequencing. Bioinformatics. 2014; 30(22):3272–3.
Article CAS PubMed Google Scholar

Download references

Funding

This work was supported by the German Research Foundation (DFG) and the Technische Universität München within the funding programme Open Access Publishing.

Authors’ contributions

Conceived and designed the experiments: MR, TH, DH. Performed the experiments: MR, TH. Interpreted results: MR, TH, DH. Wrote the paper: MR, DH. All authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Consent for publication

Not applicable.

Ethics approval and consent to participate

Not applicable.

Author information

Authors and Affiliations

Department of Bioinformatics, Straubing Center of Science, Petersgasse 18, Straubing, 94315, Germany
Mona Riemenschneider, Thomas Hummel & Dominik Heider
University of Applied Science Weihenstephan-Triesdorf, Am Hofgarten 4, Freising, 85354, Germany
Mona Riemenschneider, Thomas Hummel & Dominik Heider
Wissenschaftszentrum Weihenstephan, Technische Universität München, Alte Akademie 8, Freising, 85354, Germany
Dominik Heider

Authors

Mona Riemenschneider
View author publications
You can also search for this author in PubMed Google Scholar
Thomas Hummel
View author publications
You can also search for this author in PubMed Google Scholar
Dominik Heider
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Dominik Heider.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

Reprints and permissions

About this article

Cite this article

Riemenschneider, M., Hummel, T. & Heider, D. SHIVA - a web application for drug resistance and tropism testing in HIV. BMC Bioinformatics 17, 314 (2016). https://doi.org/10.1186/s12859-016-1179-2

Download citation

Received: 28 October 2015
Accepted: 11 August 2016
Published: 22 August 2016
DOI: https://doi.org/10.1186/s12859-016-1179-2

SHIVA - a web application for drug resistance and tropism testing in HIV