1
Apetroaei MM, Velescu BȘ, Nedea MI, Dinu-Pîrvu CE, Drăgănescu D, Fâcă AI, Udeanu DI, Arsene AL. The Phenomenon of Antiretroviral Drug Resistance in the Context of Human Immunodeficiency Virus Treatment: Dynamic and Ever Evolving Subject Matter. Biomedicines 2024; 12:915. [PMID: 38672269 PMCID: PMC11048092 DOI: 10.3390/biomedicines12040915] [Received: 03/30/2024] [Revised: 04/11/2024] [Accepted: 04/17/2024] [Indexed: 04/28/2024]
Abstract
Human immunodeficiency virus (HIV) is a significant global health issue that affects a substantial number of individuals across the globe, with a total of 39 million individuals living with HIV/AIDS. ART has resulted in a reduction in HIV-related mortality. Nevertheless, the issue of medication resistance is a significant obstacle in the management of HIV/AIDS. The unique genetic composition of HIV enables it to undergo rapid mutations and adapt, leading to the emergence of drug-resistant forms. The development of drug resistance can be attributed to various circumstances, including noncompliance with treatment regimens, insufficient dosage, interactions between drugs, viral mutations, preexposure prophylactics, and transmission from mother to child. It is therefore essential to comprehend the molecular components of HIV and the mechanisms of antiretroviral medications to devise efficacious treatment options for HIV/AIDS.
Affiliation(s)
- Miruna-Maria Apetroaei
  - Faculty of Pharmacy, Carol Davila University of Medicine and Pharmacy, 6 Traian Vuia Street, 020956 Bucharest, Romania
- Bruno Ștefan Velescu
  - Faculty of Pharmacy, Carol Davila University of Medicine and Pharmacy, 6 Traian Vuia Street, 020956 Bucharest, Romania
- Marina Ionela (Ilie) Nedea
  - Faculty of Pharmacy, Carol Davila University of Medicine and Pharmacy, 6 Traian Vuia Street, 020956 Bucharest, Romania
- Cristina Elena Dinu-Pîrvu
  - Faculty of Pharmacy, Carol Davila University of Medicine and Pharmacy, 6 Traian Vuia Street, 020956 Bucharest, Romania
- Doina Drăgănescu
  - Faculty of Pharmacy, Carol Davila University of Medicine and Pharmacy, 6 Traian Vuia Street, 020956 Bucharest, Romania
- Anca Ionela Fâcă
  - Faculty of Pharmacy, Carol Davila University of Medicine and Pharmacy, 6 Traian Vuia Street, 020956 Bucharest, Romania
  - Marius Nasta Institute of Pneumophthisiology, 90 Viilor Street, 050159 Bucharest, Romania
- Denisa Ioana Udeanu
  - Faculty of Pharmacy, Carol Davila University of Medicine and Pharmacy, 6 Traian Vuia Street, 020956 Bucharest, Romania
  - Marius Nasta Institute of Pneumophthisiology, 90 Viilor Street, 050159 Bucharest, Romania
- Andreea Letiția Arsene
  - Faculty of Pharmacy, Carol Davila University of Medicine and Pharmacy, 6 Traian Vuia Street, 020956 Bucharest, Romania
  - Marius Nasta Institute of Pneumophthisiology, 90 Viilor Street, 050159 Bucharest, Romania
2
Sivamalar S, Gomathi S, Boobalan J, Balakrishnan P, Pradeep A, Devaraj CA, Solomon SS, Nallusamy D, Nalini D, Sureka V, Saravanan S. Delayed identification of treatment failure causes high levels of acquired drug resistance and less future drug options among HIV-1-infected South Indians. Indian J Med Microbiol 2024; 47:100520. [PMID: 38052366 DOI: 10.1016/j.ijmmb.2023.100520] [Received: 01/05/2023] [Revised: 06/21/2023] [Accepted: 11/28/2023] [Indexed: 12/07/2023]
Abstract
PURPOSE To characterize HIV-1 drug resistance mutations (DRMs) among patients with immunological failure (IF) on NRTI-based first-line regimens, either thymidine analogue (TA: AZT and D4T) or non-thymidine analogue (NTA: TDF), and to predict viral drug susceptibility in order to inform optimal second-line treatment strategies. METHODS In this cross-sectional study, 300 HIV-1-infected patients failing first-line HAART were included. The HIV-1 pol gene spanning codons 20-240 of RT was genotyped and the mutation pattern was examined (IAS-USA 2014 and Stanford HIV drug resistance database v7.0). RESULTS The median age of the participants was 35 years (IQR 29-40), the CD4 T cell count of TDF failures was low at 172 cells/μL (IQR 80-252), and treatment duration was shorter among TDF failures (24 months vs. 61 months) (p < 0.0001). The majority of TDF failures were on an EFV-based first-line regimen (89 % vs. 45 %) (p < 0.0001). Resistance to TDF was found in about one-third (37 %) of TDF participants and one-fourth (23 %) of AZT participants; resistance to AZT was found in 17 % of TDF participants and 47 % of AZT participants; resistance to both AZT and TDF was significantly higher among AZT participants [21 % vs. 8 %, OR 3.057 (95 % CI 1.4-6.8), p < 0.0001]. CONCLUSION Delayed identification of treatment failure caused high levels of acquired drug resistance in our study. Thus, measures are needed to regularize virological monitoring with integrated resistance testing in low- and middle-income countries (LMICs) such as India; this will help to preserve the effectiveness of ARVs and support the goal of ending AIDS as a public health threat by 2030.
Affiliation(s)
- Sathasivam Sivamalar
  - Meenakshi Academy of Higher Education and Research (Deemed to be University), West K. K. Nagar, Chennai, 600 078, India; YR Gaitonde Centre for AIDS Research and Education, Voluntary Health Services, Hospital Campus, Taramani, Chennai, 600 113, India
- Selvamurthi Gomathi
  - YR Gaitonde Centre for AIDS Research and Education, Voluntary Health Services, Hospital Campus, Taramani, Chennai, 600 113, India
- Jayaseelan Boobalan
  - YR Gaitonde Centre for AIDS Research and Education, Voluntary Health Services, Hospital Campus, Taramani, Chennai, 600 113, India
- Pachamuthu Balakrishnan
  - Centre for Infectious Diseases, Saveetha Medical College & Hospitals [SMCH], Saveetha Institute of Medical and Technical Sciences [SIMATS], Saveetha University, Thandalam, Chennai, 602105, India
- Amrose Pradeep
  - YR Gaitonde Centre for AIDS Research and Education, Voluntary Health Services, Hospital Campus, Taramani, Chennai, 600 113, India
- Chithra A Devaraj
  - YR Gaitonde Centre for AIDS Research and Education, Voluntary Health Services, Hospital Campus, Taramani, Chennai, 600 113, India
- Sunil Suhas Solomon
  - YR Gaitonde Centre for AIDS Research and Education, Voluntary Health Services, Hospital Campus, Taramani, Chennai, 600 113, India; Johns Hopkins University School of Medicine, Baltimore, MD, USA
- Duraisamy Nallusamy
  - Meenakshi Academy of Higher Education and Research (Deemed to be University), West K. K. Nagar, Chennai, 600 078, India
- Devarajan Nalini
  - Meenakshi Academy of Higher Education and Research (Deemed to be University), West K. K. Nagar, Chennai, 600 078, India
- Varalakshmi Sureka
  - Meenakshi Academy of Higher Education and Research (Deemed to be University), West K. K. Nagar, Chennai, 600 078, India
- Shanmugam Saravanan
  - Centre for Infectious Diseases, Saveetha Medical College & Hospitals [SMCH], Saveetha Institute of Medical and Technical Sciences [SIMATS], Saveetha University, Thandalam, Chennai, 602105, India
3
Paremskaia AI, Rudik AV, Filimonov DA, Lagunin AA, Poroikov VV, Tarasova OA. Web Service for HIV Drug Resistance Prediction Based on Analysis of Amino Acid Substitutions in Main Drug Targets. Viruses 2023; 15:2245. [PMID: 38005921 PMCID: PMC10674809 DOI: 10.3390/v15112245] [Received: 10/02/2023] [Revised: 10/30/2023] [Accepted: 11/06/2023] [Indexed: 11/26/2023]
Abstract
Predicting viral drug resistance is a significant medical concern. The importance of this problem stimulates the continuous development of experimental and new computational approaches. The use of computational approaches allows researchers to increase therapy effectiveness and reduce the time and expenses involved when the prescribed antiretroviral therapy is ineffective in the treatment of infection caused by the human immunodeficiency virus type 1 (HIV-1). We propose two machine learning methods and the appropriate models for predicting HIV drug resistance related to amino acid substitutions in HIV targets: (i) k-mers utilizing the random forest and the support vector machine algorithms of the scikit-learn library, and (ii) multi-n-grams using the Bayesian approach implemented in MultiPASSR software. Both multi-n-grams and k-mers were computed based on the amino acid sequences of HIV enzymes: reverse transcriptase and protease. The performance of the models was estimated by five-fold cross-validation. The resulting classification models have a relatively high reliability (minimum accuracy for the drugs is 0.82, maximum: 0.94) and were used to create a web application, HVR (HIV drug Resistance), for the prediction of HIV drug resistance to protease inhibitors and nucleoside and non-nucleoside reverse transcriptase inhibitors based on the analysis of the amino acid sequences of the appropriate HIV proteins from clinical samples.
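As a rough illustration of the k-mer representation mentioned in the abstract above, the sketch below counts overlapping k-mers in an amino acid sequence using only the Python standard library. The function name `kmer_counts`, the choice k = 3, and the toy sequence are assumptions made for illustration; they are not taken from the paper, and the paper's actual feature construction and scikit-learn models are not reproduced here.

```python
from collections import Counter


def kmer_counts(sequence: str, k: int = 3) -> Counter:
    """Count overlapping k-mers (substrings of length k) in a sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    # Slide a window of width k across the sequence, one residue at a time.
    # Sequences shorter than k yield an empty Counter.
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


# Hypothetical toy fragment; not a real HIV protease or RT sequence.
toy = "PQITLWQITL"
counts = kmer_counts(toy, k=3)
```

Count vectors of this kind could then be passed to a classifier; which k values and which encoding the authors used is not stated in the abstract.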
Affiliation(s)
- Anastasiia Iu. Paremskaia
  - Department of Bioinformatics, Pirogov Russian National Research Medical University, Ostrovitianov Str. 1, Moscow 117997, Russia
  - Live Sciences Research Center, Moscow Institute of Physics and Technology, National Research University, Institutsky Lane 9, Dolgoprudny 141700, Russia
- Anastassia V. Rudik
  - Laboratory of Structure-Function Based Drug Design, Institute of Biomedical Chemistry, 10 bldg. 8, Pogodinskaya Str., Moscow 119121, Russia
- Dmitry A. Filimonov
  - Laboratory of Structure-Function Based Drug Design, Institute of Biomedical Chemistry, 10 bldg. 8, Pogodinskaya Str., Moscow 119121, Russia
- Alexey A. Lagunin
  - Department of Bioinformatics, Pirogov Russian National Research Medical University, Ostrovitianov Str. 1, Moscow 117997, Russia
  - Laboratory of Structure-Function Based Drug Design, Institute of Biomedical Chemistry, 10 bldg. 8, Pogodinskaya Str., Moscow 119121, Russia
- Vladimir V. Poroikov
  - Laboratory of Structure-Function Based Drug Design, Institute of Biomedical Chemistry, 10 bldg. 8, Pogodinskaya Str., Moscow 119121, Russia
- Olga A. Tarasova
  - Laboratory of Structure-Function Based Drug Design, Institute of Biomedical Chemistry, 10 bldg. 8, Pogodinskaya Str., Moscow 119121, Russia
4
Learning to increase the power of conditional randomization tests. Mach Learn 2023. [DOI: 10.1007/s10994-023-06302-3] [Indexed: 02/05/2023]
5
Fithian W, Lei L. Conditional calibration for false discovery rate control under dependence. Ann Stat 2022. [DOI: 10.1214/21-aos2137] [Indexed: 12/24/2022]
Affiliation(s)
- William Fithian
  - Department of Statistics, University of California, Berkeley
- Lihua Lei
  - Department of Statistics, Stanford University
6
Tao J, Li B, Xue L. An additive graphical model for discrete data. J Am Stat Assoc 2022. [DOI: 10.1080/01621459.2022.2119983] [Indexed: 10/17/2022]
Affiliation(s)
- Jun Tao
  - Department of Statistics, The Pennsylvania State University
- Bing Li
  - Department of Statistics, The Pennsylvania State University
- Lingzhou Xue
  - Department of Statistics, The Pennsylvania State University
7
Prevalence and Structure of HIV-1 Drug Resistance to Antiretrovirals in the Volga Federal District in 2008-2019. Viruses 2022; 14:1898. [PMID: 36146704 PMCID: PMC9503045 DOI: 10.3390/v14091898] [Received: 07/19/2022] [Revised: 08/16/2022] [Accepted: 08/22/2022] [Indexed: 11/16/2022]
Abstract
The increasing number of HIV-infected people who are receiving ART, including those with low adherence, is causing the spread of HIV drug resistance (DR). A total of 1396 plasma samples obtained from treatment-experienced patients from the Volga federal district (VFD), Russia, were examined to investigate HIV DR occurrence. The time periods 2008−2015 and 2016−2019 were compared. Fragmentary Sanger sequencing was employed to identify HIV resistance to reverse transcriptase inhibitors (RTIs) and protease inhibitors (PIs) using an ABI 3500XL genetic analyzer, a ViroSeq™ HIV-1 genotyping system (Alameda, CA, USA) and AmpliSense HIV-Resist-Seq reagent kits (Moscow, Russia). In 2016−2019, HIV DR was detected significantly more often than in 2008−2015 (p < 0.01). Mutations to RTIs retained leading positions in the structure of DR. Frequencies of resistance mutations to nucleoside and non-nucleoside RTIs (NRTIs and NNRTIs) in the spectra of detected mutations show no significant differences. Resistance to NRTIs after 2016 began to be registered more often as a part of multidrug resistance (MDR), as opposed to resistance to a single class of antiretrovirals. The frequency of DR mutations to PIs was low, both before and after 2016 (7.9% and 6.1% in the spectrum, respectively, p > 0.05). MDR registration rate became significantly higher from 2008 to 2019 (17.1% to 72.7% of patients, respectively, p < 0.01). M184V was the dominant replacement in all the years of study. A significant increase in the frequency of K65R replacement was revealed. The prevalence of integrase strand transfer inhibitor (INSTI) resistance mutations remains to be investigated.
8
Honeyman AS, Fegel TS, Peel HF, Masters NA, Vuono DC, Kleiber W, Rhoades CC, Spear JR. Statistical Learning and Uncommon Soil Microbiota Explain Biogeochemical Responses after Wildfire. Appl Environ Microbiol 2022; 88:e0034322. [PMID: 35703548 PMCID: PMC9275219 DOI: 10.1128/aem.00343-22] [Received: 02/23/2022] [Accepted: 05/16/2022] [Indexed: 11/20/2022]
Abstract
Wildfires are a perennial event globally, and the biogeochemical underpinnings of soil responses at relevant spatial and temporal scales are unclear. Soil biogeochemical processes regulate plant growth and nutrient losses that affect water quality, yet the response of soil after variable intensity fire is difficult to explain and predict. To address this issue, we examined two wildfires in Colorado, United States, across the first and second postfire years and leveraged statistical learning (SL) to predict and explain biogeochemical responses. We found that SL predicts biogeochemical responses in soil after wildfire with surprising accuracy. Of the 13 biogeochemical analytes analyzed in this study, 9 are best explained with a hybrid microbiome + biogeochemical SL model. Biogeochemical-only models best explain 3 features, and 1 feature is explained equally well with the hybrid and biogeochemical-only models. In some cases, microbiome-only SL models are also effective (such as predicting NH4+). Whenever a microbiome component is employed, selected features always involve uncommon soil microbiota (i.e., the "rare biosphere" [existing at <1% mean relative abundance]). Here, we demonstrate that SL paired with DNA sequence and biogeochemical data predicts environmental features in postfire soils, although this approach could likely be applied to any biogeochemical system. IMPORTANCE Soil biogeochemical processes are critical to plant growth and water quality and are substantially disturbed by wildfire. However, soil responses to fire are difficult to predict. To address this issue, we developed a large environmental data set that tracks postfire changes in soil and used statistical learning (SL) to build models that exploit complex data to make predictions about biogeochemical responses. Here, we show that SL depends upon uncommon microbiota in soil (the "rare biosphere") to make surprisingly accurate predictions about soil biogeochemical responses to wildfire. 
Using SL to explain variation in a natively chaotic environmental system is mechanism independent. Likely, the approach that we describe for combining SL with microbiome and biogeochemical parameters has practical applications across a range of issues in the environmental sciences where predicting responses would be useful.
Affiliation(s)
- Alexander S. Honeyman
  - Civil and Environmental Engineering, Colorado School of Mines, Golden, Colorado, USA
- Timothy S. Fegel
  - Rocky Mountain Research Station, USDA Forest Service, Fort Collins, Colorado, USA
- Henry F. Peel
  - Civil and Environmental Engineering, Colorado School of Mines, Golden, Colorado, USA
- Nicole A. Masters
  - Civil and Environmental Engineering, Colorado School of Mines, Golden, Colorado, USA
- David C. Vuono
  - Civil and Environmental Engineering, Colorado School of Mines, Golden, Colorado, USA
- William Kleiber
  - Applied Mathematics, University of Colorado, Boulder, Colorado, USA
- Charles C. Rhoades
  - Rocky Mountain Research Station, USDA Forest Service, Fort Collins, Colorado, USA
- John R. Spear
  - Civil and Environmental Engineering, Colorado School of Mines, Golden, Colorado, USA
  - Quantitative Biosciences and Engineering, Colorado School of Mines, Golden, Colorado, USA
9
Pikalyova K, Orlov A, Lin A, Tarasova O, Marcou G, Horvath D, Poroikov V, Varnek A. HIV-1 drug resistance profiling using amino acid sequence space cartography. Bioinformatics 2022; 38:2307-2314. [PMID: 35157024 DOI: 10.1093/bioinformatics/btac090] [Received: 08/13/2021] [Revised: 01/03/2022] [Accepted: 02/08/2022] [Indexed: 02/03/2023]
Abstract
MOTIVATION Human immunodeficiency virus (HIV) drug resistance is a global healthcare issue. The emergence of drug resistance influenced the efficacy of treatment regimens, thus stressing the importance of treatment adaptation. Computational methods predicting the drug resistance profile from genomic data of HIV isolates are advantageous for monitoring drug resistance in patients. However, existing computational methods for drug resistance prediction are either not suitable for emerging HIV strains with complex mutational patterns or lack interpretability, which is of paramount importance in clinical practice. The approach reported here overcomes these limitations and combines high accuracy of predictions and interpretability of the models. RESULTS In this work, a new methodology based on generative topographic mapping (GTM) for biological sequence space representation and quantitative genotype-phenotype relationships prediction purposes was introduced. The GTM-based resistance landscapes allowed us to predict the resistance of HIV strains based on sequencing and drug resistance data for three viral proteins [integrase (IN), protease (PR) and reverse transcriptase (RT)] from Stanford HIV drug resistance database. The average balanced accuracy for PR inhibitors was 0.89 ± 0.01, for IN inhibitors 0.85 ± 0.01, for non-nucleoside RT inhibitors 0.73 ± 0.01 and for nucleoside RT inhibitors 0.84 ± 0.01. We have demonstrated in several case studies that GTM-based resistance landscapes are useful for visualization and analysis of sequence space as well as for treatment optimization purposes. Here, GTMs were applied for the in-depth analysis of the relationships between mutation pattern and drug resistance using mutation landscapes. This allowed us to predict retrospectively the importance of the presence of particular mutations (e.g. V32I, L10F and L33F in HIV PR) for the resistance development. 
This study highlights some perspectives of GTM applications in clinical informatics and particularly in the field of sequence space exploration. AVAILABILITY AND IMPLEMENTATION https://github.com/karinapikalyova/ISIDASeq. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Karina Pikalyova
  - Laboratoire de Chémoinformatique, UMR 7140, Université de Strasbourg, Strasbourg 67000, France
- Alexey Orlov
  - Laboratoire de Chémoinformatique, UMR 7140, Université de Strasbourg, Strasbourg 67000, France
- Arkadii Lin
  - Laboratoire de Chémoinformatique, UMR 7140, Université de Strasbourg, Strasbourg 67000, France
- Olga Tarasova
  - Institute of Biomedical Chemistry, Moscow 119121, Russia
- Gilles Marcou
  - Laboratoire de Chémoinformatique, UMR 7140, Université de Strasbourg, Strasbourg 67000, France
- Dragos Horvath
  - Laboratoire de Chémoinformatique, UMR 7140, Université de Strasbourg, Strasbourg 67000, France
- Alexandre Varnek
  - Laboratoire de Chémoinformatique, UMR 7140, Université de Strasbourg, Strasbourg 67000, France
10
Affiliation(s)
- Buyu Lin
  - Department of Statistics, Harvard University
- Xin Xing
  - Department of Statistics, Virginia Tech
- Jun S. Liu
  - Department of Statistics, Harvard University
11
Gianti E, Percec S. Machine Learning at the Interface of Polymer Science and Biology: How Far Can We Go? Biomacromolecules 2022; 23:576-591. [PMID: 35133143 DOI: 10.1021/acs.biomac.1c01436] [Indexed: 12/12/2022]
Abstract
This Perspective outlines recent progress and future directions for using machine learning (ML), a data-driven method, to address critical questions in the design, synthesis, processing, and characterization of biomacromolecules. The achievement of these tasks requires the navigation of vast and complex chemical and biological spaces, difficult to accomplish with reasonable speed. Using modern algorithms and supercomputers, quantum physics methods are able to examine systems containing a few hundred interacting species and determine the probability of finding them in a particular region of phase space, thereby anticipating their properties. Likewise, modern approaches in chemistry and biomolecular simulation, supported by high performance computing, have culminated in producing data sets of escalating size and intrinsically high complexity. Hence, using ML to extract relevant information from these fields is of paramount importance to advance our understanding of chemical and biomolecular systems. At the heart of ML approaches lie statistical algorithms, which by evaluating a portion of a given data set, identify, learn, and manipulate the underlying rules that govern the whole data set. The assembly of a quality model to represent the data followed by the predictions and elimination of error sources are the key steps in ML. In addition to a growing infrastructure of ML tools to address complex problems, an increasing number of aspects related to our understanding of the fundamental properties of biomacromolecules are exposed to ML. These fields, including those residing at the interface of polymer science and biology (i.e., structure determination, de novo design, folding, and dynamics), strive to adopt and take advantage of the transformative power offered by approaches in the ML domain, which clearly has the potential of accelerating research in the field of biomacromolecules.
Affiliation(s)
- Eleonora Gianti
  - Institute for Computational Molecular Science (ICMS), Temple University, Philadelphia, Pennsylvania 19122, United States; Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
- Simona Percec
  - Department of Chemistry, Temple University, Philadelphia, Pennsylvania 19122, United States
12
Sarkar SK, Tang CY. Adjusting the Benjamini-Hochberg method for controlling the false discovery rate in knockoff-assisted variable selection. Biometrika 2021. [DOI: 10.1093/biomet/asab066] [Indexed: 11/13/2022]
Abstract
Summary
We consider the knockoff-based multiple testing setup of Barber & Candès (2015) for variable selection in multiple regression. The method of Benjamini & Hochberg (1995) and an adaptive version of it are adjusted to this setup, transforming them to valid p-value based, false discovery rate controlling methods that do not rely on specifying the correlation structure of the explanatory variables. Simulations and real data applications show that our proposed methods are powerful competitors of the false discovery rate controlling method in Barber & Candès (2015).
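For orientation, the classical Benjamini-Hochberg step-up procedure that the abstract above takes as its starting point can be sketched as follows. This is the standard unadjusted procedure only, not the knockoff-assisted adjustment proposed in the paper; the function name `benjamini_hochberg` and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(pvalues, alpha=0.05):
    """Return the sorted indices of hypotheses rejected by the BH step-up rule."""
    m = len(pvalues)
    # Indices of the hypotheses, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank r (1-based) such that p_(r) <= r * alpha / m.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * alpha / m:
            cutoff = rank
    # Reject every hypothesis ranked at or below the cutoff.
    return sorted(order[:cutoff])


# Hypothetical p-values, for illustration only.
example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
rejected = benjamini_hochberg(example, alpha=0.05)
```

Because the rule is step-up, a hypothesis can be rejected even when its own p-value exceeds its rank threshold, provided some larger-ranked p-value meets its threshold.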
Affiliation(s)
- Sanat K Sarkar
  - Department of Statistical Science, Temple University, 1810 Liacouras Walk, Philadelphia, Pennsylvania 19122-6083, U.S.A
- Cheng Yong Tang
  - Department of Statistical Science, Temple University, 1810 Liacouras Walk, Philadelphia, Pennsylvania 19122-6083, U.S.A
13
Got FEB, Recordon-Pinson P, Loubano-Voumbi G, Ebourombi D, Blondot ML, Metifiot M, Ondzotto G, Andreola ML. Absence of Resistance Mutations in the Integrase Coding Region among ART-Experienced Patients in the Republic of the Congo. Microorganisms 2021; 9:2355. [PMID: 34835480 PMCID: PMC8620905 DOI: 10.3390/microorganisms9112355] [Received: 10/14/2021] [Revised: 11/05/2021] [Accepted: 11/07/2021] [Indexed: 11/24/2022]
Abstract
BACKGROUND HIV infects around one hundred thousand patients in the Republic of the Congo. Approximately 25% of them receive an antiretroviral treatment; current first-line regimens include two NRTIs and one NNRTI, reverse transcriptase inhibitors. Recently, protease inhibitors (PIs) were also introduced as second-line therapy upon clinical signs of treatment failure. Due to the limited number of molecular characterizations and amount of drug resistance data available in the Republic of the Congo, this study aims to evaluate the prevalence of circulating resistance mutations within the pol region. METHODS HIV-positive, ART-experienced patients have been enrolled in four semi-urban localities in the Republic of the Congo. Plasma samples were collected, and viral RNA was extracted. The viral load for each patient was evaluated by RT-qPCR, following the general diagnostic procedures of the University Hospital of Bordeaux. Finally, drug resistance genotyping and phylogenetic analysis were conducted following Sanger sequencing of the pol region. RESULTS A high diversity of HIV-1 strains was observed with many recombinant forms. Drug resistance mutations in RT and PR genes were determined and correlated to HAART. Because integrase inhibitors are rarely included in treatments in the Republic of the Congo, the prevalence of integrase drug resistance mutations before treatment was also determined. Interestingly, very few mutations were observed. CONCLUSIONS We confirmed a high diversity of HIV-1 in the Republic of the Congo. Most patients presented an accumulation of mutations conferring resistance against NRTIs, NNRTIs and PIs. Nonetheless, the absence of integrase mutations associated with drug resistance suggests that the introduction of integrase inhibitors into therapy will be highly beneficial to patients in the Republic of the Congo.
Affiliation(s)
- Ferdinand Emaniel Brel Got
  - Faculté des Sciences de la Santé, Université Marien Ngouabi, Brazzaville BP69, Democratic Republic of the Congo
  - UMR 5234 Microbiologie Fondamentale et Pathogénicité, CNRS, Univ. Bordeaux, F-33000 Bordeaux, France
- Patricia Recordon-Pinson
  - UMR 5234 Microbiologie Fondamentale et Pathogénicité, CNRS, Univ. Bordeaux, F-33000 Bordeaux, France
  - Virology Laboratory, WHO HIV Center, CHU Bordeaux, F-33000 Bordeaux, France
- Ghislain Loubano-Voumbi
  - Faculté des Sciences de la Santé, Université Marien Ngouabi, Brazzaville BP69, Democratic Republic of the Congo
- Dagene Ebourombi
  - Faculté des Sciences de la Santé, Université Marien Ngouabi, Brazzaville BP69, Democratic Republic of the Congo
- Marie-Lise Blondot
  - UMR 5234 Microbiologie Fondamentale et Pathogénicité, CNRS, Univ. Bordeaux, F-33000 Bordeaux, France
- Mathieu Metifiot
  - UMR 5234 Microbiologie Fondamentale et Pathogénicité, CNRS, Univ. Bordeaux, F-33000 Bordeaux, France
- Gontran Ondzotto
  - Faculté des Sciences de la Santé, Université Marien Ngouabi, Brazzaville BP69, Democratic Republic of the Congo
- Marie-Line Andreola
  - UMR 5234 Microbiologie Fondamentale et Pathogénicité, CNRS, Univ. Bordeaux, F-33000 Bordeaux, France
14
Scriven YA, Mulinge MM, Saleri N, Luvai EA, Nyachieo A, Maina EN, Mwau M. Prevalence and factors associated with HIV-1 drug resistance mutations in treatment-experienced patients in Nairobi, Kenya: A cross-sectional study. Medicine (Baltimore) 2021; 100:e27460. [PMID: 34622871 PMCID: PMC8500620 DOI: 10.1097/md.0000000000027460] [Received: 06/14/2021] [Accepted: 09/20/2021] [Indexed: 01/05/2023]
Abstract
An estimated 1.5 million Kenyans are HIV-seropositive, with 1.1 million on antiretroviral therapy (ART), the majority of them unaware of their drug resistance status. In this study, we assessed the prevalence of drug resistance to nucleoside reverse transcriptase inhibitors (NRTIs), non-nucleoside reverse transcriptase inhibitors (NNRTIs), and protease inhibitors, and the variables associated with drug resistance in patients failing treatment in Nairobi, Kenya. This cross-sectional study utilized 128 HIV-positive plasma samples obtained from patients enrolled for routine viral monitoring in Nairobi clinics between 2015 and 2017. The primary outcome was human immunodeficiency virus type 1 (HIV-1) drug resistance mutation counts determined by Sanger sequencing of the polymerase (pol) gene followed by interpretation using Stanford's HIV Drug Resistance Database. Poisson regression was used to determine the effects of sex, viral load, age, HIV-subtype, treatment duration, and ART-regimen on the primary outcome. HIV-1 drug resistance mutations were found in 82.3% of the subjects, with 15.3% of subjects having triple-class ART resistance and 45.2% having dual-class resistance. NRTI primary mutations M184V/I and K65R/E/N were found in 28.8% and 8.9% of subjects, respectively, while NNRTI primary mutations K103N/S, G190A, and Y181C were found in 21.0%, 14.6%, and 10.9% of subjects. We found statistically significant evidence (P = .013) that the association between treatment duration and drug resistance mutations differed by sex. An increase of one natural-log transformed viral load unit was associated with an 11% increase in drug resistance mutation counts (incidence rate ratio [IRR] 1.11; 95% CI 1.06-1.16; P < .001) after adjusting for age, HIV-1 subtype, and the sex-treatment duration interaction. Subjects who had been on treatment for 31 to 60 months had 63% higher resistance mutation counts (IRR 1.63; 95% CI 1.12-2.43; P = .013) compared to the reference group (<30 months). Similarly, patients on ART for 61 to 90 months had 133% higher mutation counts than the reference group (IRR 2.33; 95% CI 1.59-3.49; P < .001). HIV-1 subtype, age, and ART-regimen were not associated with resistance mutation counts. Drug resistance mutations were found in alarmingly high numbers, and they were associated with viral load and treatment time. This finding emphasizes the importance of targeted resistance monitoring as a tool for addressing the problem.
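The percentages quoted in this abstract follow directly from the reported incidence rate ratios: in a Poisson regression, an IRR is the exponential of a coefficient, and (IRR - 1) x 100 gives the percent change in the expected count. The short sketch below only re-derives the reported figures; the function names are illustrative and not taken from the study, and the scaling to more than one predictor unit assumes the usual log-linear form of the model.

```python
import math


def irr_to_percent_change(irr: float) -> float:
    """Percent change in the expected count implied by an incidence rate ratio."""
    return (irr - 1.0) * 100.0


def irr_to_coefficient(irr: float) -> float:
    """Log-scale Poisson regression coefficient corresponding to an IRR."""
    return math.log(irr)


def scaled_irr(irr: float, units: float) -> float:
    """IRR for a change of `units` in the predictor (assumes a log-linear model)."""
    return irr ** units


# IRRs as reported in the abstract above.
reported = {
    "ln viral load (per unit)": 1.11,
    "31-60 months on ART": 1.63,
    "61-90 months on ART": 2.33,
}

for label, irr in reported.items():
    change = irr_to_percent_change(irr)
    print(f"{label}: IRR {irr:.2f} -> {change:.0f}% higher mutation counts")
```

Under the same log-linear assumption, `scaled_irr(1.11, 2)` gives the multiplicative change in expected mutation counts for a two-unit rise in ln viral load.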
Affiliation(s)
- Yvonne A Scriven
- Centre for Infectious and Parasitic Diseases Control Research, Kenya Medical Research Institute, Busia, Kenya
| | - Martin M Mulinge
- Department of Biochemistry, School of Medicine, University of Nairobi, Nairobi, Kenya
- Kenya AIDS Vaccine Initiative - Institute of Clinical Research, University of Nairobi, Nairobi, Kenya
| | - Norah Saleri
- Centre for Infectious and Parasitic Diseases Control Research, Kenya Medical Research Institute, Busia, Kenya
| | - Elizabeth A Luvai
- Centre for Infectious and Parasitic Diseases Control Research, Kenya Medical Research Institute, Busia, Kenya
| | - Atunga Nyachieo
- Department of Biochemistry, School of Medicine, University of Nairobi, Nairobi, Kenya
| | - Esther N Maina
- Department of Biochemistry, School of Medicine, University of Nairobi, Nairobi, Kenya
| | - Matilu Mwau
- Centre for Infectious and Parasitic Diseases Control Research, Kenya Medical Research Institute, Busia, Kenya
|
15
|
Panigrahi S, Taylor J, Weinstein A. Integrative methods for post-selection inference under convex constraints. Ann Stat 2021. [DOI: 10.1214/21-aos2057] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/19/2022]
Affiliation(s)
| | | | - Asaf Weinstein
- Department of Statistics, Hebrew University of Jerusalem
|
16
|
Qiu J, Tian X, Liu J, Qin Y, Zhu J, Xu D, Qiu T. Revealing the Mutation Patterns of Drug-Resistant Reverse Transcriptase Variants of Human Immunodeficiency Virus through Proteochemometric Modeling. Biomolecules 2021; 11:biom11091302. [PMID: 34572515 PMCID: PMC8467226 DOI: 10.3390/biom11091302] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/15/2021] [Revised: 08/31/2021] [Accepted: 09/01/2021] [Indexed: 11/16/2022] Open
Abstract
Drug-resistant cases of human immunodeficiency virus (HIV) nucleoside reverse transcriptase inhibitors (NRTI) are constantly accumulating due to the frequent mutations of the reverse transcriptase (RT). Predicting the potential drug resistance of HIV-1 NRTIs could provide instructions for the proper clinical use of available drugs. In this study, a novel proteochemometric (PCM) model was constructed to predict the drug resistance between six NRTIs against different variants of RT. Forty-seven dominant mutation sites were screened using the whole protein of HIV-1 RT. Thereafter, the physicochemical properties of the dominant mutation sites can be derived to generate the protein descriptors of RT. Furthermore, by combining the molecular descriptors of NRTIs, PCM modeling can be constructed to predict the inhibition ability between RT variants and NRTIs. The results indicated that our PCM model could achieve a mean AUC value of 0.946 and a mean accuracy of 0.873 on the external validation set. Finally, based on PCM modeling, the importance of features was calculated to reveal the dominant amino acid distribution and mutation patterns on RT, to reflect the characteristics of drug-resistant sequences.
Affiliation(s)
- Jingxuan Qiu
- School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; (J.Q.); (X.T.); (J.L.); (Y.Q.); (J.Z.); (D.X.)
| | - Xinxin Tian
- School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; (J.Q.); (X.T.); (J.L.); (Y.Q.); (J.Z.); (D.X.)
| | - Jiangru Liu
- School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; (J.Q.); (X.T.); (J.L.); (Y.Q.); (J.Z.); (D.X.)
| | - Yulong Qin
- School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; (J.Q.); (X.T.); (J.L.); (Y.Q.); (J.Z.); (D.X.)
| | - Junjie Zhu
- School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; (J.Q.); (X.T.); (J.L.); (Y.Q.); (J.Z.); (D.X.)
| | - Dongpo Xu
- School of Medical Instrument and Food Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; (J.Q.); (X.T.); (J.L.); (Y.Q.); (J.Z.); (D.X.)
| | - Tianyi Qiu
- Shanghai Public Health Clinical Center, Fudan University, Shanghai 200032, China
- Correspondence:
|
17
|
Fang F, Zhao J, Ahmed SE, Qu A. A weak‐signal‐assisted procedure for variable selection and statistical inference with an informative subsample. Biometrics 2021. [DOI: 10.1111/biom.13346] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/27/2022]
Affiliation(s)
- Fang Fang
- Key Laboratory of Advanced Theory and Application in Statistics and Data Science - MOE, School of Statistics, East China Normal University, Shanghai, China
| | - Jiwei Zhao
- Department of Biostatistics and Medical Informatics, University of Wisconsin, Madison, Wisconsin
| | - S. Ejaz Ahmed
- Faculty of Mathematics and Science, Brock University, St. Catharines, Ontario, Canada
| | - Annie Qu
- Department of Statistics, University of California, Irvine, California
|
18
|
Yendewa GA, Lakoh S, Yendewa SA, Bangura K, Tabernilla A, Patiño L, Jiba DF, Vandy AO, Massaquoi SP, Osório NS, Deen GF, Sahr F, Salata RA, Poveda E. Characterizing HIV-1 Genetic Subtypes and Drug Resistance Mutations among Children, Adolescents and Pregnant Women in Sierra Leone. Genes (Basel) 2021; 12:1314. [PMID: 34573296 PMCID: PMC8469552 DOI: 10.3390/genes12091314] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/23/2021] [Revised: 08/15/2021] [Accepted: 08/24/2021] [Indexed: 11/17/2022] Open
Abstract
Human immunodeficiency virus (HIV) drug resistance (HIVDR) is widespread in sub-Saharan Africa. Children and pregnant women are particularly vulnerable, and laboratory testing capacity remains limited. We, therefore, used a cross-sectional design and convenience sampling to characterize HIV subtypes and resistance-associated mutations (RAMs) in these groups in Sierra Leone. In total, 96 children (age 2-9 years, 100% ART-experienced), 47 adolescents (age 10-18 years, 100% ART-experienced), and 54 pregnant women (>18 years, 72% ART-experienced) were enrolled. Median treatment durations were 36, 84, and 3 months, respectively, while the sequencing success rates were 45%, 70%, and 59%, respectively, among children, adolescents, and pregnant women. Overall, the predominant HIV-1 subtype was CRF02_AG (87.9%, 95/108), with minority variants constituting 12%. Among children and adolescents, the most common RAMs were M184V (76.6%, n = 49/64), K103N (45.3%, n = 29/64), Y181C/V/I (28.1%, n = 18/64), T215F/Y (25.0%, n = 16/64), and V108I (18.8%, n = 12/64). Among pregnant women, the most frequent RAMs were K103N (20.6%, n = 7/34), M184V (11.8%, n = 4/34), Y181C/V/I (5.9%, n = 2/34), P225H (8.8%, n = 3/34), and K219N/E/Q/R (5.9%, n = 2/34). Protease and integrase inhibitor-RAMs were relatively few or absent. Based on the genotype susceptibility score distributions, 73%, 88%, and 14% of children, adolescents, and pregnant women, respectively, were not susceptible to all three drug components of the WHO preferred first-line regimens per 2018 guidelines. These findings suggest that routine HIVDR surveillance and access to better ART choices may improve treatment outcomes in Sierra Leone.
Affiliation(s)
- George A. Yendewa
- Department of Medicine, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA;
- Division of Infectious Diseases and HIV Medicine, University Hospitals Cleveland Medical Center, Cleveland, OH 44106, USA
- Johns Hopkins Bloomberg School of Public Health, Baltimore, MD 21205, USA
| | - Sulaiman Lakoh
- Department of Medicine, College of Medicine and Allied Health Sciences, University of Sierra Leone, Freetown, Sierra Leone; (S.L.); (S.A.Y.); (G.F.D.); (F.S.)
- Ministry of Health and Sanitation, University of Sierra Leone Teaching Hospitals Complex, Freetown, Sierra Leone; (K.B.); (D.F.J.); (A.O.V.); (S.P.M.)
| | - Sahr A. Yendewa
- Department of Medicine, College of Medicine and Allied Health Sciences, University of Sierra Leone, Freetown, Sierra Leone; (S.L.); (S.A.Y.); (G.F.D.); (F.S.)
- Ministry of Health and Sanitation, University of Sierra Leone Teaching Hospitals Complex, Freetown, Sierra Leone; (K.B.); (D.F.J.); (A.O.V.); (S.P.M.)
| | - Khadijah Bangura
- Ministry of Health and Sanitation, University of Sierra Leone Teaching Hospitals Complex, Freetown, Sierra Leone; (K.B.); (D.F.J.); (A.O.V.); (S.P.M.)
| | - Andrés Tabernilla
- Group of Virology and Pathogenesis, Galicia Sur Health Research Institute (IIS Galicia Sur), Complexo Hospitalario Universitario de Vigo, SERGAS-UVigo, 36213 Vigo, Spain; (A.T.); (L.P.); (E.P.)
| | - Lucia Patiño
- Group of Virology and Pathogenesis, Galicia Sur Health Research Institute (IIS Galicia Sur), Complexo Hospitalario Universitario de Vigo, SERGAS-UVigo, 36213 Vigo, Spain; (A.T.); (L.P.); (E.P.)
| | - Darlinda F. Jiba
- Ministry of Health and Sanitation, University of Sierra Leone Teaching Hospitals Complex, Freetown, Sierra Leone; (K.B.); (D.F.J.); (A.O.V.); (S.P.M.)
| | - Alren O. Vandy
- Ministry of Health and Sanitation, University of Sierra Leone Teaching Hospitals Complex, Freetown, Sierra Leone; (K.B.); (D.F.J.); (A.O.V.); (S.P.M.)
| | - Samuel P. Massaquoi
- Ministry of Health and Sanitation, University of Sierra Leone Teaching Hospitals Complex, Freetown, Sierra Leone; (K.B.); (D.F.J.); (A.O.V.); (S.P.M.)
| | - Nuno S. Osório
- Life and Health Sciences Research Institute (ICVS), School of Medicine, University of Minho, 4710-057 Braga, Portugal;
| | - Gibrilla F. Deen
- Department of Medicine, College of Medicine and Allied Health Sciences, University of Sierra Leone, Freetown, Sierra Leone; (S.L.); (S.A.Y.); (G.F.D.); (F.S.)
- Ministry of Health and Sanitation, University of Sierra Leone Teaching Hospitals Complex, Freetown, Sierra Leone; (K.B.); (D.F.J.); (A.O.V.); (S.P.M.)
| | - Foday Sahr
- Department of Medicine, College of Medicine and Allied Health Sciences, University of Sierra Leone, Freetown, Sierra Leone; (S.L.); (S.A.Y.); (G.F.D.); (F.S.)
| | - Robert A. Salata
- Department of Medicine, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA;
- Division of Infectious Diseases and HIV Medicine, University Hospitals Cleveland Medical Center, Cleveland, OH 44106, USA
| | - Eva Poveda
- Group of Virology and Pathogenesis, Galicia Sur Health Research Institute (IIS Galicia Sur), Complexo Hospitalario Universitario de Vigo, SERGAS-UVigo, 36213 Vigo, Spain; (A.T.); (L.P.); (E.P.)
|
19
|
Cai Q, Yuan R, He J, Li M, Guo Y. Predicting HIV drug resistance using weighted machine learning method at target protein sequence-level. Mol Divers 2021; 25:1541-1551. [PMID: 34241771 DOI: 10.1007/s11030-021-10262-y] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 05/11/2021] [Accepted: 06/19/2021] [Indexed: 11/29/2022]
Abstract
Acquired immune deficiency syndrome (AIDS) is a fatal disease caused by human immunodeficiency virus (HIV). Although 23 different drugs are available, the treatment of AIDS remains challenging because the virus mutates very quickly, which can lead to drug resistance. Therefore, predicting drug resistance before treatment is crucial for individual treatments. Here, based on HIV target protein sequence information, we analyzed resistance to 21 drugs caused by mutated residues using machine learning (ML) methods. To transform target sequences into numeric vectors, seven physicochemical properties were used, which can well represent the interacting characteristics of target proteins. Then, principal component analysis (PCA) was adopted to reduce the feature dimensionality. Random forest (RF) and support vector machine (SVM) models based on three different kernel functions, including linear, polynomial and radial basis function (RBF), were all employed. By comparison, we found that the RBF-based SVM method gives performance comparable to the RF model. Further, we added weight information to the RBF-based SVM method using four different weight evaluation methods: RF, eXtreme Gradient Boosting (XGB), CfsSubsetEval and ReliefFAttributeEval. Results show that the RF-weighted RBF-based SVM yields the best performance: 13 out of 21 drug models provide correlation coefficients (R²) over 0.8, and 3 of them are higher than 0.9. Finally, position-specific importance analysis indicates that most of the mutated residues with high RF weight scores are closely related to drug resistance, as revealed in previous reports. Overall, we can expect that this method can be a supplementary tool for predicting HIV drug resistance for newly discovered mutations.
Here, based on HIV target protein sequence information, we analyzed 21-drug resistance caused by mutated residues using machine learning (ML) methods by fusing the weight information of different mutation positions.
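The encoding step described in this abstract (turning a protein sequence into a numeric vector through physicochemical properties) can be illustrated with a minimal sketch. The abstract does not name its seven properties, so the example below uses a single property, the Kyte-Doolittle hydropathy scale, purely as a stand-in. The four-residue fragment, the local position numbering, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values, used here as ONE illustrative property.
# The seven properties used in the cited study are not specified in the abstract.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def apply_mutation(sequence: str, mutation: str) -> str:
    """Apply a substitution written as <wild-type><1-based position><new>, e.g. 'M2V'."""
    wild, position, new = mutation[0], int(mutation[1:-1]), mutation[-1]
    if sequence[position - 1] != wild:
        raise ValueError(f"expected {wild} at position {position}")
    return sequence[: position - 1] + new + sequence[position:]


def encode(sequence: str) -> list[float]:
    """Map each residue to its property value, giving one feature per position."""
    return [HYDROPATHY[residue] for residue in sequence]


# Toy fragment with local numbering (hypothetical, not a full target sequence).
wild_type = "YMDD"
mutant = apply_mutation(wild_type, "M2V")

print(encode(wild_type))
print(encode(mutant))
```

With several properties, the per-position values would be concatenated into a longer vector before any dimensionality reduction or model fitting.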
Affiliation(s)
- Qihang Cai
- College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, China
| | - Rongao Yuan
- College of Computer Science, Sichuan University, Chengdu, 610064, China
| | - Jian He
- College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, China
| | - Menglong Li
- College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, China
| | - Yanzhi Guo
- College of Chemistry, Sichuan University, Chengdu, 610064, Sichuan, China.
|
20
|
Liu Y, Ročková V. Variable Selection Via Thompson Sampling. J Am Stat Assoc 2021. [DOI: 10.1080/01621459.2021.1928514] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
Affiliation(s)
- Yi Liu
- Department of Statistics, University of Chicago, Chicago, IL
|
21
|
Liu Y, Ročková V, Wang Y. Variable selection with ABC Bayesian forests. J R Stat Soc Series B Stat Methodol 2021. [DOI: 10.1111/rssb.12423] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Affiliation(s)
- Yi Liu
- Department of Statistics, University of Chicago, Chicago, USA
| | | | - Yuexi Wang
- Booth School of Business, University of Chicago, Chicago, USA
|
22
|
Identification of an Antiretroviral Small Molecule That Appears To Be a Host-Targeting Inhibitor of HIV-1 Assembly. J Virol 2021; 95:JVI.00883-20. [PMID: 33148797 PMCID: PMC7925099 DOI: 10.1128/jvi.00883-20] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/07/2020] [Accepted: 10/25/2020] [Indexed: 12/16/2022] Open
Abstract
Given the projected increase in multidrug-resistant HIV-1, there is an urgent need for development of antiretrovirals that act on virus life cycle stages not targeted by drugs currently in use. Host-targeting compounds are of particular interest because they can offer a high barrier to resistance. Here, we report identification of two related small molecules that inhibit HIV-1 late events, a part of the HIV-1 life cycle for which potent and specific inhibitors are lacking. This chemotype was discovered using cell-free protein synthesis and assembly systems that recapitulate intracellular host-catalyzed viral capsid assembly pathways. These compounds inhibit replication of HIV-1 in human T cell lines and peripheral blood mononuclear cells, and are effective against a primary isolate. They reduce virus production, likely by inhibiting a posttranslational step in HIV-1 Gag assembly. Notably, the compound colocalizes with HIV-1 Gag in situ; however, unexpectedly, selection experiments failed to identify compound-specific resistance mutations in gag or pol, even though known resistance mutations developed upon parallel nelfinavir selection. Thus, we hypothesized that instead of binding to Gag directly, these compounds localize to assembly intermediates, the intracellular multiprotein complexes containing Gag and host factors that form during immature HIV-1 capsid assembly. Indeed, imaging of infected cells shows compound colocalized with two host enzymes found in assembly intermediates, ABCE1 and DDX6, but not two host proteins found in other complexes. While the exact target and mechanism of action of this chemotype remain to be determined, our findings suggest that these compounds represent first-in-class, host-targeting inhibitors of intracellular events in HIV-1 assembly. IMPORTANCE The success of antiretroviral treatment for HIV-1 is at risk of being undermined by the growing problem of drug resistance.
Thus, there is a need to identify antiretrovirals that act on viral life cycle stages not targeted by drugs in use, such as the events of HIV-1 Gag assembly. To address this gap, we developed a compound screen that recapitulates the intracellular events of HIV-1 assembly, including virus-host interactions that promote assembly. This effort led to the identification of a new chemotype that inhibits HIV-1 replication at nanomolar concentrations, likely by acting on assembly. This compound colocalized with Gag and two host enzymes that facilitate capsid assembly. However, resistance selection did not result in compound-specific mutations in gag, suggesting that the chemotype does not directly target Gag. We hypothesize that this chemotype represents a first-in-class inhibitor of virus production that acts by targeting a virus-host complex important for HIV-1 Gag assembly.
|
23
|
Hu L, Hu P, Luo X, Yuan X, You ZH. Incorporating the Coevolving Information of Substrates in Predicting HIV-1 Protease Cleavage Sites. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:2017-2028. [PMID: 31056514 DOI: 10.1109/tcbb.2019.2914208] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 06/09/2023]
Abstract
Human immunodeficiency virus 1 (HIV-1) protease (PR) plays a crucial role in the maturation of the virus. The study of substrate specificity of HIV-1 PR as a new endeavor strives to increase our ability to understand how HIV-1 PR recognizes its various cleavage sites. To predict HIV-1 PR cleavage sites, most of the existing approaches have been developed solely based on the homogeneity of substrate sequence information with supervised classification techniques. Although efficient, these approaches are found to be restricted to the ability of explaining their results and probably provide few insights into the mechanisms by which HIV-1 PR cleaves the substrates in a site-specific manner. In this work, a coevolutionary pattern-based prediction model for HIV-1 PR cleavage sites, namely EvoCleave, is proposed by integrating the coevolving information obtained from substrate sequences with a linear SVM classifier. The experiment results showed that EvoCleave yielded a very promising performance in terms of ROC analysis and f-measure. We also prospectively assessed the biological significance of coevolutionary patterns by applying them to study three fundamental issues of HIV-1 PR cleavage site. The analysis results demonstrated that the coevolutionary patterns offered valuable insights into the understanding of substrate specificity of HIV-1 PR.
|
24
|
Xing L, Lesperance ML, Zhang X. Simultaneous prediction of multiple outcomes using revised stacking algorithms. Bioinformatics 2020; 36:65-72. [PMID: 31263871 DOI: 10.1093/bioinformatics/btz531] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 01/29/2019] [Revised: 05/30/2019] [Accepted: 06/28/2019] [Indexed: 01/04/2023] Open
Abstract
MOTIVATION HIV is difficult to treat because its virus mutates at a high rate and mutated viruses easily develop resistance to existing drugs. If the relationships between mutations and drug resistances can be determined from historical data, patients can be provided personalized treatment according to their own mutation information. The HIV Drug Resistance Database was built to investigate the relationships. Our goal is to build a model using data in this database, which simultaneously predicts the resistance of multiple drugs using mutation information from sequences of viruses for any new patient. RESULTS We propose two variations of a stacking algorithm which borrow information among multiple prediction tasks to improve multivariate prediction performance. The most attractive feature of our proposed methods is the flexibility with which complex multivariate prediction models can be constructed using any univariate prediction models. Using cross-validation studies, we show that our proposed methods outperform other popular multivariate prediction methods. AVAILABILITY AND IMPLEMENTATION An R package is being developed. In the meantime, R code can be requested by email. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Li Xing
- Department of Mathematics and Statistics, University of Saskatchewan, Saskatoon, SK S7N 5E6, Canada
| | - Mary L Lesperance
- Department of Mathematics and Statistics, University of Victoria, Victoria, BC V8W 2Y2, Canada
| | - Xuekui Zhang
- Department of Mathematics and Statistics, University of Victoria, Victoria, BC V8W 2Y2, Canada
|
25
|
Kneller DW, Agniswamy J, Harrison RW, Weber IT. Highly drug-resistant HIV-1 protease reveals decreased intra-subunit interactions due to clusters of mutations. FEBS J 2020; 287:3235-3254. [PMID: 31920003 PMCID: PMC7343616 DOI: 10.1111/febs.15207] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/11/2019] [Revised: 11/16/2019] [Accepted: 01/08/2020] [Indexed: 01/07/2023]
Abstract
Drug-resistance is a serious problem for treatment of the HIV/AIDS pandemic. Potent clinical inhibitors of HIV-1 protease show several orders of magnitude worse inhibition of highly drug-resistant variants. Hence, the structure and enzyme activities were analyzed for an HIV-1 protease (PR; EC 3.4.23.16) mutant with 22 mutations (PRS5B) from a clinical isolate that was selected by machine learning to represent high-level drug-resistance. PRS5B has 22 mutations including only one (I84V) in the inhibitor binding site; however, clinical inhibitors had poor inhibition of PRS5B activity with kinetic inhibition (Ki) values of 4-1000 nM or 18- to 8000-fold worse than for wild-type PR. High-resolution crystal structures of PRS5B complexes with the best inhibitors, amprenavir (APV) and darunavir (DRV) (Ki ~ 4 nM), revealed only minor changes in protease-inhibitor interactions. Instead, two distinct clusters of mutations in distal regions induce coordinated conformational changes that decrease favorable internal interactions across the entire protein subunit. The largest structural rearrangements are described and compared to other characterized resistant mutants. In the protease hinge region, the N83D mutation eliminates a hydrogen bond connecting the hinge and core of the protease and increases disorder compared to highly resistant mutants PR with 17 mutations and PR with 20 mutations with similar hinge mutations. In a distal β-sheet, mutations G73T and A71V coordinate with accessory mutations to bring about shifts that propagate throughout the subunit. Molecular dynamics simulations of ligand-free dimers show differences consistent with loss of interactions in mutant compared to wild-type PR. Clusters of mutations exhibit both coordinated and antagonistic effects, suggesting PRS5B may represent an intermediate stage in the evolution of more highly resistant variants.
DATABASES: Structural data are available in Protein Data Bank under the accession codes 6P9A and 6P9B for PRS5B/DRV and PRS5B/APV, respectively.
Affiliation(s)
- Daniel W. Kneller
- Department of Biology, Georgia State University, Atlanta, Georgia 30303, United States of America
| | - Johnson Agniswamy
- Department of Biology, Georgia State University, Atlanta, Georgia 30303, United States of America
| | - Robert W. Harrison
- Department of Computer Science, Georgia State University, Atlanta, Georgia 30303, United States of America
| | - Irene T. Weber
- Department of Biology, Georgia State University, Atlanta, Georgia 30303, United States of America; Department of Chemistry, Georgia State University, Atlanta, Georgia 30303, United States of America; Author of correspondence:
|
26
|
Alves NG, Mata AI, Luís JP, Brito RMM, Simões CJV. An Innovative Sequence-to-Structure-Based Approach to Drug Resistance Interpretation and Prediction: The Use of Molecular Interaction Fields to Detect HIV-1 Protease Binding-Site Dissimilarities. Front Chem 2020; 8:243. [PMID: 32411655 PMCID: PMC7202381 DOI: 10.3389/fchem.2020.00243] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 12/06/2019] [Accepted: 03/13/2020] [Indexed: 12/15/2022] Open
Abstract
In silico methodologies have opened new avenues of research to understanding and predicting drug resistance, a pressing health issue that keeps rising at alarming pace. Sequence-based interpretation systems are routinely applied in clinical context in an attempt to predict mutation-based drug resistance and thus aid the choice of the most adequate antibiotic and antiviral therapy. An important limitation of approaches based on genotypic data exclusively is that mutations are not considered in the context of the three-dimensional (3D) structure of the target. Structure-based in silico methodologies are inherently more suitable to interpreting and predicting the impact of mutations on target-drug interactions, at the cost of higher computational and time demands when compared with sequence-based approaches. Herein, we present a fast, computationally inexpensive, sequence-to-structure-based approach to drug resistance prediction, which makes use of 3D protein structures encoded by input target sequences to draw binding-site comparisons with susceptible templates. Rather than performing atom-by-atom comparisons between input target and template structures, our workflow generates and compares Molecular Interaction Fields (MIFs) that map the areas of energetically favorable interactions between several chemical probe types and the target binding site. Quantitative, pairwise dissimilarity measurements between the target and the template binding sites are thus produced. The method is particularly suited to understanding changes to the 3D structure and the physicochemical environment introduced by mutations into the target binding site. Furthermore, the workflow relies exclusively on freeware, making it accessible to anyone. Using four datasets of known HIV-1 protease sequences as a case-study, we show that our approach is capable of correctly classifying resistant and susceptible sequences given as input. 
Guided by ROC curve analyses, we fine-tuned a dissimilarity threshold of classification that results in remarkable discriminatory performance (accuracy ≈ ROC AUC ≈ 0.99), illustrating the high potential of sequence-to-structure-, MIF-based approaches in the context of drug resistance prediction. We discuss the complementarity of the proposed methodology to existing prediction algorithms based on genotypic data. The present work represents a new step toward a more comprehensive and structurally-informed interpretation of the impact of genetic variability on the response to HIV-1 therapies.
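To make the idea of an ROC-guided dissimilarity threshold concrete, the sketch below scans candidate cut-offs over a set of dissimilarity scores and keeps the one that best separates resistant from susceptible sequences. The selection criterion (Youden's J, the true-positive rate minus the false-positive rate) is an assumption chosen for illustration, since the abstract does not state which criterion the authors used. The scores and labels are made up.

```python
def roc_points(scores, labels):
    """Return (threshold, TPR, FPR) for every distinct score used as a cut-off.

    A sequence is called resistant when its dissimilarity score >= threshold.
    Labels: 1 = resistant, 0 = susceptible.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    points = []
    for threshold in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        false_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((threshold, true_pos / positives, false_pos / negatives))
    return points


def best_threshold(scores, labels):
    """Pick the cut-off maximising Youden's J = TPR - FPR (an assumed criterion)."""
    return max(roc_points(scores, labels), key=lambda p: p[1] - p[2])[0]


# Hypothetical dissimilarity scores against a susceptible template.
scores = [0.05, 0.10, 0.12, 0.30, 0.45, 0.50, 0.62, 0.80]
labels = [0, 0, 0, 0, 1, 0, 1, 1]

threshold = best_threshold(scores, labels)
predictions = [1 if s >= threshold else 0 for s in scores]
print(threshold, predictions)
```

On this toy data the chosen cut-off catches every resistant sequence at the cost of one false positive; other criteria, such as fixing a maximum false-positive rate, would trade these off differently.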
Affiliation(s)
- Nuno G Alves
- Department of Chemistry, Coimbra Chemistry Centre, University of Coimbra, Coimbra, Portugal
| | - Ana I Mata
- Department of Chemistry, Coimbra Chemistry Centre, University of Coimbra, Coimbra, Portugal
| | - João P Luís
- Department of Chemistry, Coimbra Chemistry Centre, University of Coimbra, Coimbra, Portugal
| | - Rui M M Brito
- Department of Chemistry, Coimbra Chemistry Centre, University of Coimbra, Coimbra, Portugal; BSIM Therapeutics, Instituto Pedro Nunes, Coimbra, Portugal
| | - Carlos J V Simões
- Department of Chemistry, Coimbra Chemistry Centre, University of Coimbra, Coimbra, Portugal; BSIM Therapeutics, Instituto Pedro Nunes, Coimbra, Portugal
|
27
|
Affiliation(s)
- Ying Liu
- Mental Health Data Science, Department of Psychiatry, Columbia University Irving Medical Center, New York, NY 10032, USA
| | - Cheng Zheng
- Joseph J. Zilber School of Public Health, University of Wisconsin‐Milwaukee, Milwaukee, WI 53211, USA
|
28
|
Delgado RA, Chen Z, Middleton RH. Stepwise Tikhonov Regularisation: Application to the Prediction of HIV-1 Drug Resistance. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2020; 17:292-301. [PMID: 29994131 DOI: 10.1109/tcbb.2018.2849369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/08/2023]
Abstract
This paper focuses on constructing genotypic predictors for antiretroviral drug susceptibility of HIV. To this end, a method to recover the largest elements of an unknown vector in a least squares problem is developed. The proposed method introduces two novel ideas. The first idea is a novel forward stepwise selection procedure based on the magnitude of the estimates of the candidate variables. To implement this newly introduced procedure, we revise Tikhonov regularisation from a sparse representations' perspective. This analysis leads us to the second novel idea in the paper, which is the development of a new method to recover the largest elements of the unknown vector in the least squares problem. The method implements a sequence of Tikhonov regularisation problems which aim to recover the largest of the remaining elements of the unknown vector. Additionally, we derive sufficient conditions that ensure the recovery of the largest elements of the unknown vector. We perform numerical studies using simulated data and data from the Stanford HIV resistance database. The performance of the proposed method is compared against a state-of-the-art method.
|
29
|
Affiliation(s)
- Nan Bi: Department of Statistics, Stanford University
- Lucy Xia: Department of Statistics, Stanford University
|
30
|
Sharma A, Rani R. Drug sensitivity prediction framework using ensemble and multi-task learning. Int J Mach Learn Cyb 2019. [DOI: 10.1007/s13042-019-01034-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 11/28/2022]
|
31
|
Brand L, Yang X, Liu K, Elbeleidy S, Wang H, Zhang H, Nie F. Learning Robust Multilabel Sample Specific Distances for Identifying HIV-1 Drug Resistance. J Comput Biol 2019; 27:655-672. [PMID: 31725323 DOI: 10.1089/cmb.2019.0329] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022] Open
Abstract
AIDS is a syndrome caused by the HIV. During the progression of AIDS, a patient's immune system is weakened, which increases the patient's susceptibility to infections and diseases. Although antiretroviral drugs can effectively suppress HIV, the virus mutates very quickly and can become resistant to treatment. In addition, the virus can also become resistant to other treatments not currently being used through mutations, which is known in the clinical research community as cross-resistance. Since a single HIV strain can be resistant to multiple drugs, this problem is naturally represented as a multilabel classification problem. Given this multilabel relationship, traditional single-label classification methods often fail to effectively identify the drug resistances that may develop after a particular virus mutation. In this work, we propose a novel multilabel Robust Sample Specific Distance (RSSD) method to identify multiclass HIV drug resistance. Our method is novel in that it can illustrate the relative strength of the drug resistance of a reverse transcriptase (RT) sequence against a given drug nucleoside analog and learn the distance metrics for all the drug resistances. To learn the proposed RSSDs, we formulate a learning objective that maximizes the ratio of the summations of a number of ℓ1-norm distances, which is difficult to solve in general. To solve this optimization problem, we derive an efficient, nongreedy iterative algorithm with rigorously proved convergence. Our new method has been verified on a public HIV type 1 drug resistance data set with over 600 RT sequences and five nucleoside analogs. We compared our method against several state-of-the-art multilabel classification methods, and the experimental results have demonstrated the effectiveness of our proposed method.
Affiliation(s)
- Lodewijk Brand: Department of Computer Science, Colorado School of Mines, Golden, Colorado
- Xue Yang: Department of Computer Science, Colorado School of Mines, Golden, Colorado
- Kai Liu: Department of Computer Science, Colorado School of Mines, Golden, Colorado
- Saad Elbeleidy: Department of Computer Science, Colorado School of Mines, Golden, Colorado
- Hua Wang: Department of Computer Science, Colorado School of Mines, Golden, Colorado
- Hao Zhang: Department of Computer Science, Colorado School of Mines, Golden, Colorado
- Feiping Nie: School of Computer Science and Center for OPTical IMagery Analysis and Learning (OPTIMAL), Northwestern Polytechnical University, Xi'an, P.R. China
|
32
|
Affiliation(s)
- Yaniv Romano: Department of Statistics, Stanford University, Stanford, CA
- Matteo Sesia: Department of Statistics, Stanford University, Stanford, CA
|
33
|
Ramon E, Belanche-Muñoz L, Pérez-Enciso M. HIV drug resistance prediction with weighted categorical kernel functions. BMC Bioinformatics 2019; 20:410. [PMID: 31362714 PMCID: PMC6668108 DOI: 10.1186/s12859-019-2991-2] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 02/28/2019] [Accepted: 07/11/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Antiretroviral drugs are a very effective therapy against HIV infection. However, the high mutation rate of HIV permits the emergence of variants that can be resistant to the drug treatment. Predicting drug resistance to previously unobserved variants is therefore very important for an optimum medical treatment. In this paper, we propose the use of weighted categorical kernel functions to predict drug resistance from virus sequence data. These kernel functions are very simple to implement and are able to take into account HIV data particularities, such as allele mixtures, and to weigh the different importance of each protein residue, as it is known that not all positions contribute equally to the resistance. RESULTS We analyzed 21 drugs of four classes: protease inhibitors (PI), integrase inhibitors (INI), nucleoside reverse transcriptase inhibitors (NRTI) and non-nucleoside reverse transcriptase inhibitors (NNRTI). We compared two categorical kernel functions, Overlap and Jaccard, against two well-known noncategorical kernel functions (Linear and RBF) and Random Forest (RF). Weighted versions of these kernels were also considered, where the weights were obtained from the RF decrease in node impurity. The Jaccard kernel was the best method, either in its weighted or unweighted form, for 20 out of the 21 drugs. CONCLUSIONS Results show that kernels that take into account both the categorical nature of the data and the presence of mixtures consistently result in the best prediction model. The advantage of including weights depended on the protein targeted by the drug. In the case of reverse transcriptase, weights based in the relative importance of each position clearly increased the prediction performance, while the improvement in the protease was much smaller. This seems to be related to the distribution of weights, as measured by the Gini index. 
All methods described, together with documentation and examples, are freely available at https://bitbucket.org/elies_ramon/catkern.
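The two categorical kernels named in the abstract can be illustrated with a minimal Python sketch. It assumes each sequence is stored as one set of observed alleles per position, so that mixtures are sets with more than one element, and that per-position weights are normalised to sum to one. The function names, the data layout and the exact functional form are illustrative assumptions; the authors' implementation in the linked repository may differ, for example by applying a further nonlinear transform.

```python
from typing import List, Optional, Set

# Hypothetical representation: one non-empty set of alleles per protein position.
Sequence = List[Set[str]]


def _normalised(weights: Optional[List[float]], n: int) -> List[float]:
    """Return weights summing to 1; uniform weights if none are supplied."""
    if weights is None:
        return [1.0 / n] * n
    total = sum(weights)
    return [w / total for w in weights]


def overlap_kernel(x: Sequence, y: Sequence,
                   weights: Optional[List[float]] = None) -> float:
    """Weighted fraction of positions whose allele sets are identical."""
    w = _normalised(weights, len(x))
    return sum(wi for wi, a, b in zip(w, x, y) if a == b)


def jaccard_kernel(x: Sequence, y: Sequence,
                   weights: Optional[List[float]] = None) -> float:
    """Weighted mean of the per-position Jaccard index |A & B| / |A | B|."""
    w = _normalised(weights, len(x))
    return sum(wi * len(a & b) / len(a | b) for wi, a, b in zip(w, x, y))


# Toy example: position 2 of the mutant is an M/I mixture, position 3 differs.
wild = [{"L"}, {"M"}, {"K"}, {"V"}]
mutant = [{"L"}, {"M", "I"}, {"R"}, {"V"}]
print(overlap_kernel(wild, mutant))  # mixture counts as a full mismatch
print(jaccard_kernel(wild, mutant))  # mixture earns partial credit
```

The toy pair shows why a set-based kernel suits mixture data: the Overlap kernel treats the M/I mixture as entirely different from M, whereas the Jaccard kernel credits the shared allele.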
Affiliation(s)
- Elies Ramon: Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB Consortium, Campus UAB, 08193 Bellaterra, Barcelona, Spain
- Lluís Belanche-Muñoz: Computer Science Department, Technical University of Catalonia, Carrer de Jordi Girona 1-3, 08034, Barcelona, Spain
- Miguel Pérez-Enciso: Centre for Research in Agricultural Genomics (CRAG), CSIC-IRTA-UAB-UB Consortium, Campus UAB, 08193 Bellaterra, Barcelona, Spain; Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig de Lluís Companys 23, 08010, Barcelona, Spain
|
34
|
Deep learning on chaos game representation for proteins. Bioinformatics 2019; 36:272-279. [DOI: 10.1093/bioinformatics/btz493] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 03/13/2019] [Revised: 05/29/2019] [Accepted: 06/14/2019] [Indexed: 11/14/2022] Open
Abstract
Motivation: Classification of protein sequences is one big task in bioinformatics and has many applications. Different machine learning methods exist and are applied on these problems, such as support vector machines (SVM), random forests (RF) and neural networks (NN). All of these methods have in common that protein sequences have to be made machine-readable and comparable in the first step, for which different encodings exist. These encodings are typically based on physical or chemical properties of the sequence. However, due to the outstanding performance of deep neural networks (DNN) on image recognition, we used frequency matrix chaos game representation (FCGR) for encoding of protein sequences into images. In this study, we compare the performance of SVMs, RFs and DNNs, trained on FCGR encoded protein sequences. While the original chaos game representation (CGR) has been used mainly for genome sequence encoding and classification, we modified it to work also for protein sequences, resulting in n-flakes representation, an image with several icosagons. Results: We could show that all applied machine learning techniques (RF, SVM and DNN) show promising results compared to the state-of-the-art methods on our benchmark datasets, with DNNs outperforming the other methods and that FCGR is a promising new encoding method for protein sequences. Availability and implementation: https://cran.r-project.org/. Supplementary information: Supplementary data are available at Bioinformatics online.
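The encoding idea in this abstract can be sketched in a few lines of Python. The sketch places the 20 standard amino acids on the vertices of a regular icosagon, plays the chaos game by moving a point part of the way towards the vertex of each successive residue, and counts visits in a square grid to obtain a frequency matrix. The residue ordering, the grid resolution and the contraction ratio of 0.86 are illustrative assumptions and are not taken from the paper.

```python
import math

# Assumed ordering of the 20 standard amino acids around the icosagon.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def fcgr(sequence, resolution=32, ratio=0.86):
    """Return a resolution x resolution matrix of chaos-game visit counts.

    `ratio` is the fraction of the distance travelled towards the target
    vertex at each step (an assumed value). Residues outside AMINO_ACIDS
    raise KeyError.
    """
    n = len(AMINO_ACIDS)
    vertices = {
        aa: (math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
        for i, aa in enumerate(AMINO_ACIDS)
    }
    grid = [[0] * resolution for _ in range(resolution)]
    x = y = 0.0  # start at the centre of the icosagon
    for aa in sequence:
        vx, vy = vertices[aa]
        x += ratio * (vx - x)
        y += ratio * (vy - y)
        # Map coordinates from [-1, 1] to a grid cell, clamping the upper edge.
        col = min(int((x + 1) / 2 * resolution), resolution - 1)
        row = min(int((y + 1) / 2 * resolution), resolution - 1)
        grid[row][col] += 1
    return grid


image = fcgr("PQITLWQRPLV", resolution=16)
print(sum(map(sum, image)))  # one count per residue
```

The resulting matrix has a fixed size regardless of sequence length, which is what allows image-oriented models to be trained on it.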
|
35
|
|
36
|
Soret P, Avalos M, Wittkop L, Commenges D, Thiébaut R. Lasso regularization for left-censored Gaussian outcome and high-dimensional predictors. BMC Med Res Methodol 2018; 18:159. [PMID: 30514234 PMCID: PMC6280495 DOI: 10.1186/s12874-018-0609-4] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 10/04/2017] [Accepted: 11/02/2018] [Indexed: 12/14/2022] Open
Abstract
Background Biological assays for the quantification of markers may suffer from a lack of sensitivity and thus from an analytical detection limit. This is the case of human immunodeficiency virus (HIV) viral load. Below this threshold the exact value is unknown and values are consequently left-censored. Statistical methods have been proposed to deal with left-censoring but few are adapted in the context of high-dimensional data. Methods We propose to reverse the Buckley-James least squares algorithm to handle left-censored data enhanced with a Lasso regularization to accommodate high-dimensional predictors. We present a Lasso-regularized Buckley-James least squares method with both non-parametric imputation using Kaplan-Meier and parametric imputation based on the Gaussian distribution, which is typically assumed for HIV viral load data after logarithmic transformation. Cross-validation for parameter-tuning is based on an appropriate loss function that takes into account the different contributions of censored and uncensored observations. We specify how these techniques can be easily implemented using available R packages. The Lasso-regularized Buckley-James least square method was compared to simple imputation strategies to predict the response to antiretroviral therapy measured by HIV viral load according to the HIV genotypic mutations. We used a dataset composed of several clinical trials and cohorts from the Forum for Collaborative HIV Research (HIV Med. 2008;7:27-40). The proposed methods were also assessed on simulated data mimicking the observed data. Results Approaches accounting for left-censoring outperformed simple imputation methods in a high-dimensional setting. The Gaussian Buckley-James method with cross-validation based on the appropriate loss function showed the lowest prediction error on simulated data and, using real data, the most valid results according to the current literature on HIV mutations. 
Conclusions The proposed approach deals with high-dimensional predictors and left-censored outcomes and has shown its interest for predicting HIV viral load according to HIV mutations.
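The parametric imputation step mentioned in this abstract rests on a standard result: if Y is Gaussian with mean mu and standard deviation sigma, the expected value of Y given that it lies below a detection limit c is mu - sigma * phi(z) / Phi(z), with z = (c - mu) / sigma. The Python sketch below applies that formula to replace censored values. It shows a single imputation pass under assumed, fixed fitted means and sigma, not the full iterative Lasso-regularized Buckley-James procedure, and all names are hypothetical.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def censored_mean(mu, sigma, limit):
    """E[Y | Y < limit] for Y ~ N(mu, sigma^2)."""
    z = (limit - mu) / sigma
    return mu - sigma * normal_pdf(z) / normal_cdf(z)


def impute_left_censored(values, censored, fitted, sigma, limit):
    """Replace censored outcomes by their conditional expectation.

    `fitted` holds the current model prediction for each observation
    (an assumption of this sketch; in an iterative scheme it would be
    updated after each regression fit).
    """
    return [
        censored_mean(mu, sigma, limit) if is_censored else v
        for v, is_censored, mu in zip(values, censored, fitted)
    ]


# Toy log10 viral loads with a detection limit of 1.7 (about 50 copies/mL).
observed = [3.2, 1.7, 4.1, 1.7]
flags = [False, True, False, True]
predictions = [3.0, 2.0, 4.0, 1.5]
print(impute_left_censored(observed, flags, predictions, sigma=0.8, limit=1.7))
```

Each imputed value falls below the detection limit and depends on the model's prediction for that observation, which is what distinguishes this approach from replacing every censored value with the same constant.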
Affiliation(s)
- Perrine Soret: Univ. Bordeaux, Inserm, Bordeaux Population Health Research Center, UMR 1219, Bordeaux, F-33000, France; Inria SISTM Team, Talence, F-33405, France; Vaccine Research Institute (VRI), Créteil, F-94000, France
- Marta Avalos: Univ. Bordeaux, Inserm, Bordeaux Population Health Research Center, UMR 1219, Bordeaux, F-33000, France; Inria SISTM Team, Talence, F-33405, France
- Linda Wittkop: Univ. Bordeaux, Inserm, Bordeaux Population Health Research Center, UMR 1219, Bordeaux, F-33000, France; Inria SISTM Team, Talence, F-33405, France; CHU Bordeaux, Department of Public Health, Bordeaux, F-33000, France
- Daniel Commenges: Univ. Bordeaux, Inserm, Bordeaux Population Health Research Center, UMR 1219, Bordeaux, F-33000, France; Inria SISTM Team, Talence, F-33405, France
- Rodolphe Thiébaut: Univ. Bordeaux, Inserm, Bordeaux Population Health Research Center, UMR 1219, Bordeaux, F-33000, France; Inria SISTM Team, Talence, F-33405, France; Vaccine Research Institute (VRI), Créteil, F-94000, France; CHU Bordeaux, Department of Public Health, Bordeaux, F-33000, France
|
37
|
Hu W, Laber EB, Barker C, Stefanski LA. Assessing Tuning Parameter Selection Variability in Penalized Regression. Technometrics 2018; 61:154-164. [PMID: 31534281 PMCID: PMC6750234 DOI: 10.1080/00401706.2018.1513380] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/01/2017] [Revised: 07/01/2018] [Indexed: 10/28/2022]
Abstract
Penalized regression methods that perform simultaneous model selection and estimation are ubiquitous in statistical modeling. The use of such methods is often unavoidable as manual inspection of all possible models quickly becomes intractable when there are more than a handful of predictors. However, automated methods usually fail to incorporate domain-knowledge, exploratory analyses, or other factors that might guide a more interactive model-building approach. A hybrid approach is to use penalized regression to identify a set of candidate models and then to use interactive model-building to examine this candidate set more closely. To identify a set of candidate models, we derive point and interval estimators of the probability that each model along a solution path will minimize a given model selection criterion, for example, Akaike information criterion, Bayesian information criterion (AIC, BIC), etc., conditional on the observed solution path. Then models with a high probability of selection are considered for further examination. Thus, the proposed methodology attempts to strike a balance between algorithmic modeling approaches that are computationally efficient but fail to incorporate expert knowledge, and interactive modeling approaches that are labor intensive but informed by experience, intuition, and domain knowledge. Supplementary materials for this article are available online.
Affiliation(s)
- Wenhao Hu: Department of Statistics, NC State University, Raleigh, NC
- Eric B. Laber: Department of Statistics, NC State University, Raleigh, NC
|
38
|
Tarasova O, Biziukova N, Filimonov D, Poroikov V. A Computational Approach for the Prediction of HIV Resistance Based on Amino Acid and Nucleotide Descriptors. Molecules 2018; 23:E2751. [PMID: 30355996 PMCID: PMC6278491 DOI: 10.3390/molecules23112751] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 09/12/2018] [Revised: 10/07/2018] [Accepted: 10/16/2018] [Indexed: 12/25/2022] Open
Abstract
The high variability of the human immunodeficiency virus (HIV) is an important cause of HIV resistance to reverse transcriptase and protease inhibitors. There are many variants of HIV type 1 (HIV-1) that can be used to model sequence-resistance relationships. Machine learning methods are widely and successfully used in new drug discovery. An emerging body of data regarding the interactions of small drug-like molecules with their protein targets provides the possibility of building models on "structure-property" relationships and analyzing the performance of various machine-learning techniques. In our research, we analyze several different types of descriptors in order to predict the resistance of HIV reverse transcriptase and protease to the marketed antiretroviral drugs using the Random Forest approach. First, we represented amino acid sequences as a set of short peptide fragments, which included several amino acid residues. Second, we represented nucleotide sequences as a set of fragments, which included several nucleotides. We compared these two approaches using open data from the Stanford HIV Drug Resistance Database. We have determined the factors that modulate the performance of prediction: in particular, we observed that prediction performance depended more on the particular drug than on the type of descriptor used.
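The fragment-based descriptors in this abstract can be pictured with a short Python sketch that counts overlapping fixed-length fragments in a sequence. The fragment length of 3, the use of overlapping windows and the function name are assumptions made for illustration; the abstract only says that fragments contain "several" residues or nucleotides.

```python
from collections import Counter


def fragment_counts(sequence, length=3):
    """Count overlapping fragments of the given length in a sequence.

    Works for amino acid or nucleotide strings alike. The counts could
    serve as a feature vector for a classifier such as Random Forest.
    """
    return Counter(
        sequence[i:i + length] for i in range(len(sequence) - length + 1)
    )


peptide_features = fragment_counts("PQITLWQ", length=3)
nucleotide_features = fragment_counts("CCTCAGATC", length=3)
print(peptide_features)
print(nucleotide_features)
```

Because every sequence maps to counts over the same fragment vocabulary, sequences of different lengths become comparable feature vectors.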
Affiliation(s)
- Olga Tarasova: Institute of Biomedical Chemistry, Moscow 119121, Russia
|
39
|
Dashwood T, Tan DHS. PrEParing for the unexpected: mechanisms and management of HIV pre-exposure prophylaxis failure. Future Virol 2018. [DOI: 10.2217/fvl-2018-0084] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/21/2022]
Abstract
Pre-exposure prophylaxis (PrEP) for HIV is a proven and effective tool for preventing HIV. However, there are instances where individuals taking PrEP have contracted HIV infection. Most of these cases are due to nonadherence to the drug, while other cases of apparent PrEP failure are due to unrecognized HIV infection at baseline. Importantly, there are also now at least three well-documented cases of PrEP failing despite adequate adherence; these are cases of PrEP ‘breakthrough’. This article outlines the potential mechanisms of PrEP failure, as well as how to identify and manage these patients. Finally, we provide a perspective on the future of PrEP as a key tool in preventing HIV worldwide.
Affiliation(s)
- Thomas Dashwood: Department of Medicine, University of Toronto, Toronto, ON M5G 2C4, Canada
- Darrell HS Tan: Department of Medicine, University of Toronto, Toronto, ON M5G 2C4, Canada; Division of Infectious Diseases, St Michael's Hospital, Toronto, ON M5B 1W8, Canada; Centre for Urban Health Solutions, St Michael's Hospital, Toronto, ON M5B 1W8, Canada; Institute for Health Policy, Management & Evaluation, University of Toronto, Toronto, ON M5T 3M6, Canada
|
40
|
A Surveillance on Protease Inhibitor Resistance-Associated Mutations Among Iranian HIV-1 Patients. Archives of Clinical Infectious Diseases 2018. [DOI: 10.5812/archcid.69153] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/16/2022]
|
41
|
Sharma A, Rani R. KSRMF: Kernelized similarity based regularized matrix factorization framework for predicting anti-cancer drug responses. Journal of Intelligent & Fuzzy Systems 2018. [DOI: 10.3233/jifs-169713] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 11/15/2022]
Affiliation(s)
- Aman Sharma: Department of Computer Science and Engineering, Thapar University, Patiala, Punjab, India
- Rinkle Rani: Department of Computer Science and Engineering, Thapar University, Patiala, Punjab, India
|
42
|
Su WJ. When is the first spurious variable selected by sequential regression procedures? Biometrika 2018. [DOI: 10.1093/biomet/asy032] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/12/2022] Open
Affiliation(s)
- Weijie J Su: Department of Statistics, University of Pennsylvania, 472 John M. Huntsman Hall, 3730 Walnut Street, Philadelphia, Pennsylvania 19104, U.S.A.
|
43
|
Khalid Z, Sezerman OU. Prediction of HIV Drug Resistance by Combining Sequence and Structural Properties. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:966-973. [PMID: 27992346 DOI: 10.1109/tcbb.2016.2638821] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Indexed: 06/06/2023]
Abstract
Drug resistance is a major obstacle faced by therapists in treating HIV-infected patients. The reason behind this phenomenon is either protein mutation or changes in gene expression level that induce resistance to drug treatments. These mutations affect the drug binding activity, hence resulting in failure of treatment. Therefore, it is necessary to conduct resistance testing in order to carry out effective HIV therapy. This study combines both sequence and structural features for predicting HIV resistance by applying SVM and Random Forests classifiers. The model was tested on the mutants of HIV-1 protease and reverse transcriptase. Among the features used in our method, total contact energies among multiple mutations have a strong impact in predicting resistance, as they are crucial in understanding the interactions of HIV mutants. The combination of sequence-structure features offers higher accuracy with support vector machines than with the Random Forests classifier. Both single mutations and the acquisition of multiple mutations are important in predicting HIV resistance to certain drug treatments. We have demonstrated the practicality of these features; hence, they can be used in the future to predict resistance for other complex diseases.
|
44
|
Tarasova O, Poroikov V. HIV Resistance Prediction to Reverse Transcriptase Inhibitors: Focus on Open Data. Molecules 2018; 23:E956. [PMID: 29671808 PMCID: PMC6017644 DOI: 10.3390/molecules23040956] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 03/21/2018] [Revised: 04/16/2018] [Accepted: 04/17/2018] [Indexed: 12/16/2022] Open
Abstract
Research and development of new antiretroviral agents are in great demand due to issues with the safety and efficacy of antiretroviral drugs. HIV reverse transcriptase (RT) is an important target for HIV treatment. RT inhibitors targeting early stages of the virus-host interaction are of great interest for researchers. There are many clinical and biochemical data on the relationships between the occurrence of single point mutations and their combinations in the pol gene of HIV and the resistance of particular HIV variants to nucleoside and non-nucleoside reverse transcriptase inhibitors. The experimental data stored in databases of HIV sequences can be used to develop methods that predict HIV resistance from amino acid or nucleotide sequences. Data on the resistance of HIV sequences can further be used for (1) development of new antiretroviral agents with high potential for HIV inhibition and elimination and (2) optimization of antiretroviral therapy. In our communication, we focus on the data on RT sequences and HIV resistance that are available on the Internet. The experimental methods applied to produce the data on HIV-1 resistance, and the known data on their concordance, are also discussed.
Affiliation(s)
- Olga Tarasova: Institute of Biomedical Chemistry, 10 building 8, Pogodinskaya st., Moscow 119121, Russia
- Vladimir Poroikov: Institute of Biomedical Chemistry, 10 building 8, Pogodinskaya st., Moscow 119121, Russia
|
45
|
Bien J, Gaynanova I, Lederer J, Müller CL. Non-Convex Global Minimization and False Discovery Rate Control for the TREX. J Comput Graph Stat 2018. [DOI: 10.1080/10618600.2017.1341414] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 10/19/2022]
Affiliation(s)
- Jacob Bien: Department of Data Sciences and Operations, University of Southern California, Los Angeles, CA
- Irina Gaynanova: Department of Statistics, Texas A&M University, College Station, TX
- Johannes Lederer: Departments of Statistics and Biostatistics, University of Washington, Seattle, WA
|
46
|
Vasavi C, Tamizhselvi R, Munusami P. Drug Resistance Mechanism of L10F, L10F/N88S and L90M mutations in CRF01_AE HIV-1 protease: Molecular dynamics simulations and binding free energy calculations. J Mol Graph Model 2017. [DOI: 10.1016/j.jmgm.2017.06.007] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 10/19/2022]
|
47
|
Flynn CJ, Hurvich CM, Simonoff JS. On the Sensitivity of the Lasso to the Number of Predictor Variables. Stat Sci 2017. [DOI: 10.1214/16-sts586] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Indexed: 11/19/2022]
|
48
|
Tarasova O, Filimonov D, Poroikov V. PASS-based approach to predict HIV-1 reverse transcriptase resistance. J Bioinform Comput Biol 2016; 15:1650040. [PMID: 28033735 DOI: 10.1142/s0219720016500402] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Indexed: 01/05/2023]
Abstract
HIV reverse transcriptase (RT) inhibitors targeting the early stages of virus-host interactions are of great interest to scientists. Acquired HIV RT resistance happens due to mutations in a particular region of the pol gene encoding the HIV RT amino acid sequence. We propose an application of the previously developed PASS algorithm for prediction of amino acid substitutions potentially involved in the resistance of HIV-1 based on open data. In our work, we used more than 3200 HIV-1 RT variants from the publicly available Stanford HIV RT and protease sequence database already tested for 10 anti-HIV drugs including both nucleoside and non-nucleoside RT inhibitors. We used a particular amino acid residue and its position to describe primary structure-resistance relationships. The average balanced accuracy of the prediction obtained in 20-fold cross-validation for the Phenosense dataset was about 88% and for the Antivirogram dataset was about 79%. Thus, the PASS-based algorithm may be used for prediction of the amino acid substitutions associated with the resistance of HIV-1 based on open data. The computational approach for the prediction of HIV-1 associated resistance can be useful for the selection of RT inhibitors for the treatment of HIV infected patients in the clinical practice. Prediction of the HIV-1 RT associated resistance can be useful for the development of new anti-HIV drugs active against the resistant variants of RT. Therefore, we propose that this study can be potentially useful for anti-HIV drug development.
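The balanced accuracy reported in this abstract is the mean of sensitivity and specificity, which makes it less sensitive to class imbalance than plain accuracy. A minimal Python sketch follows, assuming binary labels with 1 for resistant and 0 for susceptible. The function name and the toy labels are illustrative and are not data from the study.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = resistant)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # fraction of resistant variants detected
    specificity = tn / (tn + fp)  # fraction of susceptible variants detected
    return (sensitivity + specificity) / 2


# Toy example with four resistant and two susceptible variants.
truth = [1, 1, 1, 1, 0, 0]
predicted = [1, 1, 1, 0, 0, 1]
print(balanced_accuracy(truth, predicted))
```

In the toy example the sensitivity is 3/4 and the specificity is 1/2, so the balanced accuracy is lower than the plain accuracy of 4/6, reflecting the weaker performance on the minority class.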
Affiliation(s)
- Olga Tarasova: Department for Bioinformatics, Institute of Biomedical Chemistry, 10 building 8, Pogodinskaya street, 119121, Moscow, Russia
- Dmitry Filimonov: Department for Bioinformatics, Institute of Biomedical Chemistry, 10 building 8, Pogodinskaya street, 119121, Moscow, Russia
- Vladimir Poroikov: Department for Bioinformatics, Institute of Biomedical Chemistry, 10 building 8, Pogodinskaya street, 119121, Moscow, Russia
|
49
|
Riemenschneider M, Senge R, Neumann U, Hüllermeier E, Heider D. Exploiting HIV-1 protease and reverse transcriptase cross-resistance information for improved drug resistance prediction by means of multi-label classification. BioData Min 2016; 9:10. [PMID: 26933450 PMCID: PMC4772363 DOI: 10.1186/s13040-016-0089-1] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 11/30/2015] [Accepted: 02/20/2016] [Indexed: 12/15/2022] Open
Abstract
BACKGROUND Antiretroviral therapy is essential for human immunodeficiency virus (HIV) infected patients to inhibit viral replication and therewith to slow progression of disease and prolong a patient's life. However, the high mutation rate of HIV can lead to a fast adaptation of the virus under drug pressure and thereby to the evolution of resistant variants. In turn, these variants will lead to the failure of antiretroviral treatment. Moreover, these mutations cannot only lead to resistance against single drugs, but also to cross-resistance, i.e., resistance against drugs that have not yet been applied. METHODS 662 protease sequences and 715 reverse transcriptase sequences with complete resistance profiles were analyzed using machine learning techniques, namely binary relevance classifiers, classifier chains, and ensembles of classifier chains. RESULTS In our study, we applied multi-label classification models incorporating cross-resistance information to predict drug resistance for two of the major drug classes used in antiretroviral therapy for HIV-1, namely protease inhibitors (PIs) and non-nucleoside reverse transcriptase inhibitors (NNRTIs). By means of multi-label learning, namely classifier chains (CCs) and ensembles of classifier chains (ECCs), we were able to improve overall prediction accuracy for all drugs compared to hitherto applied binary classification models. CONCLUSIONS The development of fast and precise models to predict drug resistance in HIV-1 is highly important to enable a highly effective personalized therapy. Cross-resistance information can be exploited to improve prediction accuracy of computational drug resistance models.
Affiliation(s)
- Mona Riemenschneider: Department of Bioinformatics, Straubing Center of Science, Petersgasse 18, Straubing, 94315 Germany; University of Applied Science Weihenstephan-Triesdorf, Am Hofgarten 4, Freising, 85354 Germany
- Robin Senge: Department of Computer Science, University of Paderborn, Pohlweg 47, Paderborn, 33098 Germany
- Ursula Neumann: Department of Bioinformatics, Straubing Center of Science, Petersgasse 18, Straubing, 94315 Germany; Wissenschaftszentrum Weihenstephan, Technische Universität München, Alte Akademie 8, Freising, 85354 Germany; University of Applied Science Weihenstephan-Triesdorf, Am Hofgarten 4, Freising, 85354 Germany
- Eyke Hüllermeier: Department of Computer Science, University of Paderborn, Pohlweg 47, Paderborn, 33098 Germany
- Dominik Heider: Department of Bioinformatics, Straubing Center of Science, Petersgasse 18, Straubing, 94315 Germany; Wissenschaftszentrum Weihenstephan, Technische Universität München, Alte Akademie 8, Freising, 85354 Germany; University of Applied Science Weihenstephan-Triesdorf, Am Hofgarten 4, Freising, 85354 Germany
|
50
|
Lopinavir Resistance Classification with Imbalanced Data Using Probabilistic Neural Networks. J Med Syst 2016; 40:69. [PMID: 26733278 DOI: 10.1007/s10916-015-0428-7] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 07/07/2015] [Accepted: 12/23/2015] [Indexed: 10/22/2022]
Abstract
Resistance to antiretroviral drugs has been a major obstacle for long-lasting treatment of HIV-infected patients. The development of models to predict drug resistance is recognized as useful for helping the decision of the best therapy for each HIV+ individual. The aim of this study was to develop classifiers for predicting resistance to the HIV protease inhibitor lopinavir using a probabilistic neural network (PNN). The data were provided by the Molecular Virology Laboratory of the Health Sciences Center, Federal University of Rio de Janeiro (CCS-UFRJ/Brazil). Using bootstrap and stepwise techniques, ten features were selected by logistic regression (LR) to be used as inputs to the network. Bootstrap and cross-validation were used to define the smoothing parameter of the PNN networks. Four balanced models were designed and evaluated using a separate test set. The accuracies of the classifiers with the test set ranged from 0.89 to 0.94, and the area under the receiver operating characteristic (ROC) curve (AUC) ranged from 0.96 to 0.97. The sensitivity ranged from 0.94 to 1.00, and the specificity was between 0.88 and 0.92. Four classifiers showed performances very close to three existing expert-based interpretation systems, the HIVdb, the Rega and the ANRS algorithms, and to a k-Nearest Neighbor.
|