1
Xiong Y, Wang Y, Wang Y, Li C, Yusong P, Wu J, Wang Y, Gu L, Butch CJ. Improving drug discovery with a hybrid deep generative model using reinforcement learning trained on a Bayesian docking approximation. J Comput Aided Mol Des 2023; 37:507-517. [PMID: 37550462 DOI: 10.1007/s10822-023-00523-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Accepted: 07/17/2023] [Indexed: 08/09/2023]
Abstract
Generative approaches to molecular design have been an area of intense study in recent years as a method to generate new pharmaceuticals with desired properties. Often, though, these efforts are constrained by limited experimental activity data, resulting in either models that generate molecules with poor performance or models that are overfit and produce close analogs of known molecules. In this paper, we reduce this data dependency for the generation of new chemotypes by incorporating docking scores of known and de novo molecules to expand the applicability domain of the reward function and diversify the compounds generated during reinforcement learning. Our approach employs a deep generative model initially trained using a combination of limited known drug activity and an approximate docking score provided by a second, machine-learned Bayes regression model, with final evaluation of high-scoring compounds by a full docking simulation. This strategy results in molecules with docking scores improved by 10-20% compared to molecules of similar size, while being 130× faster than a docking-only approach on a typical GPU workstation. We also show that the increased docking scores correlate with (1) docking poses whose interactions resemble those of known inhibitors and (2) higher MM-GBSA binding energies comparable to those of known DDR1 inhibitors, demonstrating that the Bayesian model contains sufficient information for the network to learn to efficiently interact with the binding pocket during reinforcement learning. This outcome shows that the combination of the learned latent molecular representation with the feature-based docking regression is sufficient for reinforcement learning to infer the relationship between the molecules and the receptor binding site, which suggests that our method can be a powerful tool for the discovery of new chemotypes with potential therapeutic applications.
Affiliation(s)
- Youjin Xiong
  - Department of Biomedical Engineering, Nanjing University, Nanjing, 210093, China
- Yiqing Wang
  - Icekredit Incorporated, Shanghai, 200120, China
- Yisheng Wang
  - Department of Biomedical Engineering, Nanjing University, Nanjing, 210093, China
- Chenmei Li
  - Department of Biomedical Engineering, Nanjing University, Nanjing, 210093, China
- Peng Yusong
  - Department of Biomedical Engineering, Nanjing University, Nanjing, 210093, China
- Junyu Wu
  - Icekredit Incorporated, Shanghai, 200120, China
- Yiqing Wang
  - Department of Biomedical Engineering, Nanjing University, Nanjing, 210093, China
- Lingyun Gu
  - Department of Information Systems Technology and Design, Singapore University of Technology and Design, Singapore, Singapore
- Christopher J Butch
  - Department of Biomedical Engineering, Nanjing University, Nanjing, 210093, China
2
Han R, Yoon H, Kim G, Lee H, Lee Y. Revolutionizing Medicinal Chemistry: The Application of Artificial Intelligence (AI) in Early Drug Discovery. Pharmaceuticals (Basel) 2023; 16:1259. [PMID: 37765069 PMCID: PMC10537003 DOI: 10.3390/ph16091259] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 07/27/2023] [Revised: 08/24/2023] [Accepted: 09/04/2023] [Indexed: 09/29/2023] Open
Abstract
Artificial intelligence (AI) has permeated various sectors, including the pharmaceutical industry and research, where it has been utilized to efficiently identify new chemical entities with desirable properties. The application of AI algorithms to drug discovery presents both remarkable opportunities and challenges. This review article focuses on the transformative role of AI in medicinal chemistry. We delve into the applications of machine learning and deep learning techniques in drug screening and design, discussing their potential to expedite the early drug discovery process. In particular, we provide a comprehensive overview of the use of AI algorithms in predicting protein structures, drug-target interactions, and molecular properties such as drug toxicity. While AI has accelerated the drug discovery process, data quality issues and technological constraints remain challenges. Nonetheless, new relationships and methods have been unveiled, demonstrating AI's expanding potential in predicting and understanding drug interactions and properties. For its full potential to be realized, interdisciplinary collaboration is essential. This review underscores AI's growing influence on the future trajectory of medicinal chemistry and stresses the importance of ongoing synergies between computational and domain experts.
Affiliation(s)
- Yoonji Lee
  - College of Pharmacy, Chung-Ang University, Seoul 06974, Republic of Korea
3
Novak J, Pathak P, Grishina MA, Potemkin VA. The design of compounds with desirable properties - The anti-HIV case study. J Comput Chem 2023; 44:1016-1030. [PMID: 36533526 DOI: 10.1002/jcc.27061] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/25/2022] [Revised: 11/14/2022] [Accepted: 12/04/2022] [Indexed: 12/23/2022]
Abstract
Efficacy and safety are among the most desirable characteristics of an ideal drug. The tremendous increase in computing power and the entry of artificial intelligence into the field of computational drug design are accelerating the process of identifying, developing, and optimizing potential drugs. Here, we present a novel approach to designing new molecules with desired properties. We combined various neural networks and linear regression algorithms to build models for cytotoxicity and anti-HIV activity based on Continual Molecular Interior analysis (CoMIn) and Cinderella's Shoe (CiS) derived molecular descriptors. After validating the reliability of the models, a genetic algorithm was coupled with the Des-Pot Grid algorithm to generate new molecules from a predefined pool of molecular fragments and predict their bioactivity and cytotoxicity. This combination led to the proposal of 16 hit molecules with high anti-HIV activity and low cytotoxicity. The anti-SARS-CoV-2 activity of the hits was also predicted.
Affiliation(s)
- Jurica Novak
  - Department of Biotechnology, University of Rijeka, Rijeka, Croatia
  - Center for Artificial Intelligence and Cybersecurity, University of Rijeka, Rijeka, Croatia
  - Scientific and Educational Center "Biomedical Technologies", Higher Medical & Biological School, South Ural State University, Chelyabinsk, Russia
- Prateek Pathak
  - Laboratory of Computational Modelling of Drugs, Higher Medical & Biological School, South Ural State University, Chelyabinsk, Russia
- Maria A Grishina
  - Laboratory of Computational Modelling of Drugs, Higher Medical & Biological School, South Ural State University, Chelyabinsk, Russia
- Vladimir A Potemkin
  - Laboratory of Computational Modelling of Drugs, Higher Medical & Biological School, South Ural State University, Chelyabinsk, Russia
4
Tarasova O, Poroikov V. Machine Learning in Discovery of New Antivirals and Optimization of Viral Infections Therapy. Curr Med Chem 2021; 28:7840-7861. [PMID: 33949929 DOI: 10.2174/0929867328666210504114351] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 11/10/2020] [Revised: 02/13/2021] [Accepted: 02/24/2021] [Indexed: 11/22/2022]
Abstract
Nowadays, computational approaches play an important role in the design of new drug-like compounds and optimization of pharmacotherapeutic treatment of diseases. The emerging growth of viral infections, including those caused by the Human Immunodeficiency Virus (HIV), Ebola virus, recently detected coronavirus, and some others, leads to many newly infected people with a high risk of death or severe complications. A huge amount of chemical, biological, clinical data is at the disposal of the researchers. Therefore, there are many opportunities to find the relationships between the particular features of chemical data and the antiviral activity of biologically active compounds based on machine learning approaches. Biological and clinical data can also be used for building models to predict relationships between viral genotype and drug resistance, which might help determine the clinical outcome of treatment. In the current study, we consider machine-learning approaches in the antiviral research carried out during the past decade. We overview in detail the application of machine-learning methods for the design of new potential antiviral agents and vaccines, drug resistance prediction, and analysis of virus-host interactions. Our review also covers the perspectives of using the machine-learning approaches for antiviral research, including Dengue, Ebola viruses, Influenza A, Human Immunodeficiency Virus, coronaviruses, and some others.
Affiliation(s)
- Olga Tarasova
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russian Federation
- Vladimir Poroikov
  - Department of Bioinformatics, Institute of Biomedical Chemistry, Moscow, Russian Federation
5
Egieyeh S, Malan SF, Christoffels A. Cheminformatics techniques in antimalarial drug discovery and development from natural products 2: Molecular scaffold and machine learning approaches. Physical Sciences Reviews 2021. [DOI: 10.1515/psr-2019-0029] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
A large number of natural products, especially those used in ethnomedicine of malaria, have shown varying in-vitro antiplasmodial activities. Cheminformatics involves the organization, integration, curation, standardization, simulation, mining and transformation of pharmacology data (compounds and bioactivity) into knowledge that can drive rational and viable drug development decisions. This chapter will review the application of two cheminformatics techniques (including molecular scaffold analysis and bioactivity predictive modeling via Machine learning) to natural products with in-vitro and in-vivo antiplasmodial activities in order to facilitate their development into antimalarial drug candidates and design of new potential antimalarial compounds.
Affiliation(s)
- Samuel Egieyeh
  - School of Pharmacy, University of the Western Cape Faculty of Natural Science, Belville, South Africa
  - South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape Faculty of Natural Science, Belville, South Africa
- Sarel F. Malan
  - School of Pharmacy, University of the Western Cape Faculty of Natural Science, Belville, South Africa
- Alan Christoffels
  - South African Medical Research Council Bioinformatics Unit, South African National Bioinformatics Institute, University of the Western Cape Faculty of Natural Science, Belville, South Africa
6
Deng L, Zhong W, Zhao L, He X, Lian Z, Jiang S, Chen CYC. Artificial Intelligence-Based Application to Explore Inhibitors of Neurodegenerative Diseases. Front Neurorobot 2020; 14:617327. [PMID: 33414713 PMCID: PMC7783404 DOI: 10.3389/fnbot.2020.617327] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 10/14/2020] [Accepted: 11/30/2020] [Indexed: 12/23/2022] Open
Abstract
Neuroinflammation is a common factor in neurodegenerative diseases, and it has been demonstrated that galectin-3 activates microglia and astrocytes, leading to inflammation. This means that inhibition of galectin-3 may become a new strategy for the treatment of neurodegenerative diseases. With this motivation, the objective of this study is to explore a new integrated approach for finding lead compounds that inhibit galectin-3 by combining universal artificial intelligence algorithms with traditional drug screening methods. Using molecular docking, potential compounds with high binding affinity were screened from a Chinese medicine database. Multiple artificial intelligence algorithms were applied to validate the docking results and further screen compounds. Among all the predictive methods involved, the deep learning-based algorithm made 500 modeling attempts, and the squared correlation coefficient of the best trained model on the test sets was 0.9. The XGBoost model reached a squared correlation coefficient of 0.97 and a mean squared error of only 0.01. We then switched to the ZINC database and performed the same experiment; the results showed that the compounds in the former database had stronger affinity. Finally, we further verified through molecular dynamics simulation that the complex composed of the candidate ligand and the target protein showed stable binding within 100 ns of simulation time. In summary, using artificial intelligence-based methods, we found that the active ingredients 1,2-Dimethylbenzene and Typhic acid, contained in Crataegus pinnatifida and Typha angustata, might be effective inhibitors for the treatment of neurodegenerative diseases. The high prediction accuracy of the models shows that the approach has practical value on small-sample data sets such as those in drug screening.
Affiliation(s)
- Leping Deng
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, China
- Weihe Zhong
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, China
- Lu Zhao
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, China
  - Department of Clinical Laboratory, The Sixth Affiliated Hospital, Sun Yat-sen University, Guangzhou, China
- Xuedong He
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, China
- Zongkai Lian
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, China
- Shancheng Jiang
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, China
- Calvin Yu-Chian Chen
  - Artificial Intelligence Medical Center, School of Intelligent Systems Engineering, Sun Yat-sen University, Shenzhen, China
  - Department of Medical Research, China Medical University Hospital, Taiwan, China
  - Department of Bioinformatics and Medical Engineering, Asia University, Taiwan, China
7
Pribut N, Kaiser TM, Wilson RJ, Jecs E, Dentmon ZW, Pelly SC, Sharma S, Bartsch PW, Burger PB, Hwang SS, Le T, Sourimant J, Yoon JJ, Plemper RK, Liotta DC. Accelerated Discovery of Potent Fusion Inhibitors for Respiratory Syncytial Virus. ACS Infect Dis 2020; 6:922-929. [PMID: 32275393 DOI: 10.1021/acsinfecdis.9b00524] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/26/2022]
Abstract
A series of five benzimidazole-based compounds were identified using a machine learning algorithm as potential inhibitors of the respiratory syncytial virus (RSV) fusion protein. These compounds were synthesized, and compound 2 in particular exhibited excellent in vitro potency with an EC50 value of 5 nM. This new scaffold was then further refined leading to the identification of compound 44, which exhibited a 10-fold improvement in activity with an EC50 value of 0.5 nM.
Affiliation(s)
- Nicole Pribut
  - Department of Chemistry, Emory University, 1515 Dickey Drive, Atlanta, Georgia 30322, United States
- Thomas M. Kaiser
  - Department of Chemistry, Emory University, 1515 Dickey Drive, Atlanta, Georgia 30322, United States
- Robert J. Wilson
  - Department of Chemistry, Emory University, 1515 Dickey Drive, Atlanta, Georgia 30322, United States
- Edgars Jecs
  - Department of Chemistry, Emory University, 1515 Dickey Drive, Atlanta, Georgia 30322, United States
- Zackery W. Dentmon
  - Department of Chemistry, Emory University, 1515 Dickey Drive, Atlanta, Georgia 30322, United States
- Stephen C. Pelly
  - Department of Chemistry, Emory University, 1515 Dickey Drive, Atlanta, Georgia 30322, United States
- Savita Sharma
  - Department of Chemistry, Emory University, 1515 Dickey Drive, Atlanta, Georgia 30322, United States
- Perry W. Bartsch
  - Department of Chemistry, Emory University, 1515 Dickey Drive, Atlanta, Georgia 30322, United States
- Pieter B. Burger
  - Department of Chemistry, Emory University, 1515 Dickey Drive, Atlanta, Georgia 30322, United States
- Soyon S. Hwang
  - Department of Chemistry, Emory University, 1515 Dickey Drive, Atlanta, Georgia 30322, United States
- Thalia Le
  - Department of Chemistry, Emory University, 1515 Dickey Drive, Atlanta, Georgia 30322, United States
- Julien Sourimant
  - Institute for Biomedical Sciences, Georgia State University, Atlanta, Georgia 30303, United States
- Jeong-Joong Yoon
  - Institute for Biomedical Sciences, Georgia State University, Atlanta, Georgia 30303, United States
- Richard K. Plemper
  - Institute for Biomedical Sciences, Georgia State University, Atlanta, Georgia 30303, United States
- Dennis C. Liotta
  - Department of Chemistry, Emory University, 1515 Dickey Drive, Atlanta, Georgia 30322, United States
8
Kaiser TM, Dentmon ZW, Dalloul CE, Sharma SK, Liotta DC. Accelerated Discovery of Novel Ponatinib Analogs with Improved Properties for the Treatment of Parkinson's Disease. ACS Med Chem Lett 2020; 11:491-496. [PMID: 32292555 DOI: 10.1021/acsmedchemlett.9b00612] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/16/2019] [Accepted: 03/12/2020] [Indexed: 11/30/2022] Open
Abstract
Parkinson's disease (PD) is a debilitating and common neurodegenerative disease. New insights implicating c-Abl activation as a driving force in PD have opened a new drug development avenue for PD treatment beyond the symptomatic relief by L-DOPA. BCR-Abl inhibitors, which include nilotinib and ponatinib, have been found to inhibit this process, and nilotinib has shown improvement in outcomes in a 12-patient, nonrandomized trial. However, nilotinib is a potent inhibitor of hERG, a cardiac K+ channel whose inhibition increases risk of sudden death. We used our machine learning approach to predict novel molecules that would inhibit c-Abl while also having minimal liability against hERG. Of our six novel compounds tested, we identified two that had c-Abl potencies comparable to nilotinib, but with significantly improved profiles regarding the hERG channel. Our best compound exhibited a hERG IC50 of 12.1 μM (compared to nilotinib with an IC50 of 0.45 μM and ponatinib with IC50 of 0.767 μM). This work is a step forward for a machine learning enabled, multiparameter optimization of a chemical space and represents a significant advance in the development of novel Parkinson's therapies.
Affiliation(s)
- Thomas M. Kaiser
  - St Peter’s College, University of Oxford, New Inn Hall Street, Oxford, U.K. OX1 2DL
- Zackery W. Dentmon
  - Department of Chemistry, Emory University, 1521 Dickey Drive, Atlanta, Georgia 30322, United States
- Christopher E. Dalloul
  - Department of Chemistry, Emory University, 1521 Dickey Drive, Atlanta, Georgia 30322, United States
- Savita K. Sharma
  - Department of Chemistry, Emory University, 1521 Dickey Drive, Atlanta, Georgia 30322, United States
- Dennis C. Liotta
  - Department of Chemistry, Emory University, 1521 Dickey Drive, Atlanta, Georgia 30322, United States
9
Error Tolerance of Machine Learning Algorithms across Contemporary Biological Targets. Molecules 2019; 24:2115. [PMID: 31167452 PMCID: PMC6601015 DOI: 10.3390/molecules24112115] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 02/15/2019] [Revised: 05/31/2019] [Accepted: 06/01/2019] [Indexed: 12/16/2022] Open
Abstract
Machine learning continues to make significant advances in the prediction of desired properties concerning drug development. Problematically, the efficacy of machine learning in these arenas is reliant upon highly accurate and abundant data. These two limitations, high accuracy and abundance, are often taken together; however, understanding the dataset accuracy limitation of contemporary machine learning algorithms may yield insight into whether non-bench experimental sources of data may be used to generate useful machine learning models where there is a paucity of experimental data. We took highly accurate data across six kinase types, one GPCR, one polymerase, a human protease, and HIV protease, and intentionally introduced error at varying population proportions in the datasets for each target. With the generated error in the data, we explored how the retrospective accuracy of a Naïve Bayes Network, a Random Forest Model, and a Probabilistic Neural Network model decayed as a function of error. Additionally, we explored the ability of a training dataset with an error profile resembling that produced by the Free Energy Perturbation method (FEP+) to generate machine learning models with useful retrospective capabilities. The categorical error tolerance was quite high for the Naïve Bayes Network algorithm, which required an average of 39% error in the training set to lose predictivity on the test set. The Random Forest also tolerated a significant degree of categorical error introduced into the training set, with an average of 29% error required to lose predictivity. However, we found that the Probabilistic Neural Network algorithm did not tolerate as much categorical error, requiring an average of 20% error to lose predictivity. Finally, we found that a Naïve Bayes Network and a Random Forest could both use datasets with an error profile resembling that of FEP+. This work demonstrates that computational methods of known error distribution like FEP+ may be useful in generating machine learning models not based on extensive and expensive in vitro-generated datasets.
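The error-injection step described in this abstract can be illustrated with a minimal Python sketch. It assumes binary active/inactive labels and uniform random label flipping; the abstract does not specify the exact procedure, and the names `inject_label_noise` and `agreement` are hypothetical helpers introduced only for illustration.

```python
import random


def inject_label_noise(labels, error_fraction, seed=0):
    """Return a copy of binary labels (0/1) with a fixed fraction flipped.

    Assumption: error is introduced by flipping a randomly chosen subset
    of labels; the cited study may have used a different scheme.
    """
    if not 0.0 <= error_fraction <= 1.0:
        raise ValueError("error_fraction must be in [0, 1]")
    rng = random.Random(seed)
    n_flip = round(error_fraction * len(labels))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [1 - y if i in flip_idx else y for i, y in enumerate(labels)]


def agreement(a, b):
    """Fraction of positions at which two label lists agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


# Toy data: 100 alternating inactive/active labels (not real assay data).
clean = [0, 1] * 50
for frac in (0.0, 0.20, 0.29, 0.39):
    noisy = inject_label_noise(clean, frac)
    print(frac, agreement(clean, noisy))
```

In a fuller experiment, each noisy label set would be used to train a classifier, and accuracy on a clean test set would be tracked as the error fraction grows.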
10
Zorn KM, Lane TR, Russo DP, Clark AM, Makarov V, Ekins S. Multiple Machine Learning Comparisons of HIV Cell-based and Reverse Transcriptase Data Sets. Mol Pharm 2019; 16:1620-1632. [PMID: 30779585 DOI: 10.1021/acs.molpharmaceut.8b01297] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Indexed: 02/06/2023]
Abstract
The human immunodeficiency virus (HIV) causes over a million deaths every year and has a huge economic impact in many countries. The first class of drugs approved were nucleoside reverse transcriptase inhibitors. A newer generation of reverse transcriptase inhibitors has become susceptible to drug-resistant strains of HIV, and hence, alternatives are urgently needed. We have recently pioneered the use of Bayesian machine learning to generate models with public data to identify new compounds for testing against different disease targets. The current study has used the NIAID ChemDB HIV, Opportunistic Infection and Tuberculosis Therapeutics Database for machine learning studies. We curated and cleaned data from HIV-1 wild-type cell-based and reverse transcriptase (RT) DNA polymerase inhibition assays. Compounds from this database with ≤1 μM HIV-1 RT DNA polymerase activity inhibition and cell-based HIV-1 inhibition are correlated (Pearson r = 0.44, n = 1137, p < 0.0001). Models were trained using multiple machine learning approaches (Bernoulli Naive Bayes, AdaBoost Decision Tree, Random Forest, support vector classification, k-Nearest Neighbors, and deep neural networks, as well as consensus approaches), and their predictive abilities were then compared. Our comparison of different machine learning methods demonstrated that support vector classification, deep learning, and a consensus were generally comparable and not significantly different from each other using 5-fold cross validation and using 24 training and test set combinations. This study demonstrates findings in line with our previous studies for various targets: training and testing with multiple data sets does not demonstrate a significant difference between support vector machines and deep neural networks.
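The 5-fold cross validation mentioned in this abstract can be sketched in plain Python as follows. This is a generic illustration of k-fold index splitting, not the authors' pipeline; the function name `k_fold_indices`, the shuffling seed, and the sample count are assumptions made for the example.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation.

    Each sample index appears in exactly one test fold; the remaining
    folds together form the training set for that split.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical data set of 23 compounds, split into 5 folds.
for train, test in k_fold_indices(23, k=5):
    print(len(train), len(test))
```

Every model being compared would be trained on `train` and scored on `test` for each split, so that all methods see the same partitions.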
Affiliation(s)
- Kimberley M Zorn
  - Collaborations Pharmaceuticals, Inc., Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Thomas R Lane
  - Collaborations Pharmaceuticals, Inc., Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Daniel P Russo
  - Collaborations Pharmaceuticals, Inc., Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
  - The Rutgers Center for Computational and Integrative Biology, Camden, New Jersey 08102, United States
- Alex M Clark
  - Molecular Materials Informatics, Inc., 2234 Duvernay Street, Montreal, Quebec H3J2Y3, Canada
- Vadim Makarov
  - Bach Institute of Biochemistry, Research Center of Biotechnology of the Russian Academy of Sciences, Leninsky Prospekt 33-2, Moscow 119071, Russia
- Sean Ekins
  - Collaborations Pharmaceuticals, Inc., Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
11
Tarasova O, Biziukova N, Filimonov D, Poroikov V. A Computational Approach for the Prediction of HIV Resistance Based on Amino Acid and Nucleotide Descriptors. Molecules 2018; 23:E2751. [PMID: 30355996 PMCID: PMC6278491 DOI: 10.3390/molecules23112751] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 09/12/2018] [Revised: 10/07/2018] [Accepted: 10/16/2018] [Indexed: 12/25/2022] Open
Abstract
The high variability of the human immunodeficiency virus (HIV) is an important cause of HIV resistance to reverse transcriptase and protease inhibitors. There are many variants of HIV type 1 (HIV-1) that can be used to model sequence-resistance relationships. Machine learning methods are widely and successfully used in new drug discovery. An emerging body of data regarding the interactions of small drug-like molecules with their protein targets provides the possibility of building models of "structure-property" relationships and analyzing the performance of various machine-learning techniques. In our research, we analyze several different types of descriptors in order to predict the resistance of HIV reverse transcriptase and protease to the marketed antiretroviral drugs using the Random Forest approach. First, we represented amino acid sequences as a set of short peptide fragments, each comprising several amino acid residues. Second, we represented nucleotide sequences as a set of fragments, each comprising several nucleotides. We compared these two approaches using open data from the Stanford HIV Drug Resistance Database. We have determined the factors that modulate the performance of prediction: in particular, we observed that the prediction performance was more sensitive to the particular drug than to the type of descriptor used.
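The fragment-based sequence representation described in this abstract can be sketched in Python as below. The fragment length of three residues and the use of overlapping windows are assumptions, since the abstract says only that fragments include "several" residues; `sequence_fragments` and `fragment_vector` are hypothetical names, and the sequences are short toy strings rather than database entries.

```python
def sequence_fragments(sequence, k=3):
    """Return the set of overlapping length-k fragments of a sequence."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def fragment_vector(sequence, vocabulary, k=3):
    """Encode a sequence as 0/1 presence flags over a fixed fragment list."""
    present = sequence_fragments(sequence, k)
    return [1 if frag in present else 0 for frag in vocabulary]


# Two toy amino acid strings differing by one substitution (I -> V).
reference = "PQITLW"
variant = "PQVTLW"

# Shared, sorted vocabulary so both vectors have the same columns.
vocabulary = sorted(sequence_fragments(reference) | sequence_fragments(variant))
print(vocabulary)
print(fragment_vector(reference, vocabulary))
print(fragment_vector(variant, vocabulary))
```

The same functions apply unchanged to nucleotide strings, and the resulting presence vectors are the kind of fixed-length input a Random Forest classifier can consume.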
Affiliation(s)
- Olga Tarasova
  - Institute of Biomedical Chemistry, Moscow 119121, Russia