1. Gangwal A, Ansari A, Ahmad I, Azad AK, Kumarasamy V, Subramaniyan V, Wong LS. Generative artificial intelligence in drug discovery: basic framework, recent advances, challenges, and opportunities. Front Pharmacol 2024; 15:1331062. [PMID: 38384298 PMCID: PMC10879372 DOI: 10.3389/fphar.2024.1331062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Accepted: 01/17/2024] [Indexed: 02/23/2024] Open
Abstract
There are two main ways to discover or design small drug molecules. The first involves fine-tuning existing molecules or commercially successful drugs through quantitative structure-activity relationships and virtual screening. The second approach involves generating new molecules through de novo drug design or inverse quantitative structure-activity relationship. Both methods aim to get a drug molecule with the best pharmacokinetic and pharmacodynamic profiles. However, bringing a new drug to market is an expensive and time-consuming endeavor, with the average cost being estimated at around $2.5 billion. One of the biggest challenges is screening the vast number of potential drug candidates to find one that is both safe and effective. The development of artificial intelligence in recent years has been phenomenal, ushering in a revolution in many fields. The field of pharmaceutical sciences has also significantly benefited from multiple applications of artificial intelligence, especially drug discovery projects. Artificial intelligence models are finding use in molecular property prediction, molecule generation, virtual screening, synthesis planning, repurposing, among others. Lately, generative artificial intelligence has gained popularity across domains for its ability to generate entirely new data, such as images, sentences, audios, videos, novel chemical molecules, etc. Generative artificial intelligence has also delivered promising results in drug discovery and development. This review article delves into the fundamentals and framework of various generative artificial intelligence models in the context of drug discovery via de novo drug design approach. Various basic and advanced models have been discussed, along with their recent applications. 
The review also explores recent examples and advances in the generative artificial intelligence approach, as well as the challenges and ongoing efforts to fully harness the potential of generative artificial intelligence in generating novel drug molecules in a faster and more affordable manner. Some clinical-level assets generated from generative artificial intelligence have also been discussed in this review to show the ever-increasing application of artificial intelligence in drug discovery through commercial partnerships.
Affiliation(s)
- Amit Gangwal
  - Department of Natural Product Chemistry, Shri Vile Parle Kelavani Mandal’s Institute of Pharmacy, Dhule, Maharashtra, India
- Azim Ansari
  - Computer Aided Drug Design Center, Shri Vile Parle Kelavani Mandal’s Institute of Pharmacy, Dhule, Maharashtra, India
- Iqrar Ahmad
  - Department of Pharmaceutical Chemistry, Prof. Ravindra Nikam College of Pharmacy, Dhule, India
- Abul Kalam Azad
  - Faculty of Pharmacy, University College of MAIWP International, Batu Caves, Malaysia
- Vinoth Kumarasamy
  - Department of Parasitology and Medical Entomology, Faculty of Medicine, Universiti Kebangsaan Malaysia, Cheras, Malaysia
- Vetriselvan Subramaniyan
  - Pharmacology Unit, Jeffrey Cheah School of Medicine and Health Sciences, Monash University Malaysia, Selangor, Malaysia
  - School of Bioengineering and Biosciences, Lovely Professional University, Phagwara, Punjab, India
- Ling Shing Wong
  - Faculty of Health and Life Sciences, INTI International University, Nilai, Malaysia
2. Ilnicka A, Schneider G. Designing molecules with autoencoder networks. Nat Comput Sci 2023; 3:922-933. [PMID: 38177601 DOI: 10.1038/s43588-023-00548-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Accepted: 10/03/2023] [Indexed: 01/06/2024]
Abstract
Autoencoders are versatile tools in molecular informatics. These unsupervised neural networks serve diverse tasks such as data-driven molecular representation and constructive molecular design. This Review explores their algorithmic foundations and applications in drug discovery, highlighting the most active areas of development and the contributions autoencoder networks have made in advancing this field. We also explore the challenges and prospects concerning the utilization of autoencoders and the various adaptations of this neural network architecture in molecular design.
Affiliation(s)
- Agnieszka Ilnicka
  - Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Gisbert Schneider
  - Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
3. Salas-Estrada L, Provasi D, Qiu X, Kaniskan HÜ, Huang XP, DiBerto JF, Lamim Ribeiro JM, Jin J, Roth BL, Filizola M. De Novo Design of κ-Opioid Receptor Antagonists Using a Generative Deep-Learning Framework. J Chem Inf Model 2023; 63:5056-5065. [PMID: 37555591 PMCID: PMC10466374 DOI: 10.1021/acs.jcim.3c00651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Indexed: 08/10/2023]
Abstract
Likely effective pharmacological interventions for the treatment of opioid addiction include attempts to attenuate brain reward deficits during periods of abstinence. Pharmacological blockade of the κ-opioid receptor (KOR) has been shown to abolish brain reward deficits in rodents during withdrawal, as well as to reduce the escalation of opioid use in rats with extended access to opioids. Although KOR antagonists represent promising candidates for the treatment of opioid addiction, very few potent selective KOR antagonists are known to date and most of them exhibit significant safety concerns. Here, we used a generative deep-learning framework for the de novo design of chemotypes with putative KOR antagonistic activity. Molecules generated by models trained with this framework were prioritized for chemical synthesis based on their predicted optimal interactions with the receptor. Our models and proposed training protocol were experimentally validated by binding and functional assays.
Affiliation(s)
- Leslie Salas-Estrada
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Davide Provasi
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Xing Qiu
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Husnu Ümit Kaniskan
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Xi-Ping Huang
  - National Institute of Mental Health, Psychoactive Drug Screening Program, Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, North Carolina 27599, United States
- Jeffrey F. DiBerto
  - National Institute of Mental Health, Psychoactive Drug Screening Program, Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, North Carolina 27599, United States
- João Marcelo Lamim Ribeiro
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Jian Jin
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
  - Mount Sinai Center for Therapeutics Discovery, Departments of Oncological Sciences and Neuroscience, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Bryan L. Roth
  - National Institute of Mental Health, Psychoactive Drug Screening Program, Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, North Carolina 27599, United States
  - Division of Chemical Biology and Medicinal Chemistry, University of North Carolina at Chapel Hill Eshelman School of Pharmacy, Chapel Hill, North Carolina 27599, United States
- Marta Filizola
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
4. Salas-Estrada L, Provasi D, Qui X, Kaniskan HÜ, Huang XP, DiBerto JF, Ribeiro JML, Jin J, Roth BL, Filizola M. De Novo Design of κ-Opioid Receptor Antagonists Using a Generative Deep Learning Framework. bioRxiv 2023:2023.04.25.537995. [PMID: 37162828 PMCID: PMC10168226 DOI: 10.1101/2023.04.25.537995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Likely effective pharmacological interventions for the treatment of opioid addiction include attempts to attenuate brain reward deficits during periods of abstinence. Pharmacological blockade of the κ-opioid receptor (KOR) has been shown to abolish brain reward deficits in rodents during withdrawal, as well as to reduce the escalation of opioid use in rats with extended access to opioids. Although KOR antagonists represent promising candidates for the treatment of opioid addiction, very few potent selective KOR antagonists are known to date and most of them exhibit significant safety concerns. Here, we used a generative deep learning framework for the de novo design of chemotypes with putative KOR antagonistic activity. Molecules generated by models trained with this framework were prioritized for chemical synthesis based on their predicted optimal interactions with the receptor. Our models and proposed training protocol were experimentally validated by binding and functional assays.
5. Tamaian R, Porozov Y, Shityakov S. Exhaustive in silico design and screening of novel antipsychotic compounds with improved pharmacodynamics and blood-brain barrier permeation properties. J Biomol Struct Dyn 2023; 41:14849-14870. [PMID: 36927517 DOI: 10.1080/07391102.2023.2184179] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2022] [Accepted: 02/18/2023] [Indexed: 03/18/2023]
Abstract
Antipsychotic drugs or neuroleptics are widely used in the treatment of psychosis as a manifestation of schizophrenia and bipolar disorder. However, their effectiveness largely depends on blood-brain barrier (BBB) permeation (pharmacokinetics) and drug-receptor pharmacodynamics. Therefore, in this study, we developed and implemented an in silico pipeline to design novel lead compounds (n = 260) from standard drug scaffolds, with PK/PD properties improved relative to those scaffolds. The best candidates (n = 3) were then evaluated by molecular docking against serotonin and dopamine receptors. Finally, a haloperidol (HAL) derivative (1-(4-fluorophenyl)-4-(4-hydroxy-4-{4-[(2-phenyl-1,3-thiazol-4-yl)methyl]phenyl}piperidin-1-yl)butan-1-one) was identified as a "magic shotgun" lead compound with better affinity to the 5-HT2A, 5-HT1D, D2, D3, and 5-HT1B receptors than the control molecule. Additionally, this hit substance was predicted to possess similar BBB permeation properties and much lower toxicological profiles in comparison to HAL. Overall, the proposed rational drug design platform for novel antipsychotic drugs based on BBB permeation and receptor binding might be an invaluable asset for a medicinal chemist or translational pharmacologist. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Radu Tamaian
  - ICSI Analytics, National Research and Development Institute for Cryogenics and Isotopic Technologies - ICSI Rm. Vâlcea, Râmnicu Vâlcea, Romania
- Yuri Porozov
  - Center of Bio- and Chemoinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia
- Sergey Shityakov
  - Laboratory of Chemoinformatics, Infochemistry Scientific Center, ITMO University, Saint-Petersburg, Russia
6. Li Y, Zhang L, Wang Y, Zou J, Yang R, Luo X, Wu C, Yang W, Tian C, Xu H, Wang F, Yang X, Li L, Yang S. Generative deep learning enables the discovery of a potent and selective RIPK1 inhibitor. Nat Commun 2022; 13:6891. [PMID: 36371441 PMCID: PMC9653409 DOI: 10.1038/s41467-022-34692-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 04/12/2022] [Accepted: 11/03/2022] [Indexed: 11/13/2022] Open
Abstract
The retrieval of hit/lead compounds with novel scaffolds during early drug development is an important but challenging task. Various generative models have been proposed to create drug-like molecules. However, the capacity of these generative models to design wet-lab-validated and target-specific molecules with novel scaffolds has hardly been verified. We herein propose a generative deep learning (GDL) model, a distribution-learning conditional recurrent neural network (cRNN), to generate tailor-made virtual compound libraries for given biological targets. The GDL model is then applied to RIPK1. Virtual screening against the generated tailor-made compound library and subsequent bioactivity evaluation lead to the discovery of a potent and selective RIPK1 inhibitor with a previously unreported scaffold, RI-962. This compound displays potent in vitro activity in protecting cells from necroptosis, and good in vivo efficacy in two inflammatory models. Collectively, the findings prove the capacity of our GDL model in generating hit/lead compounds with unreported scaffolds, highlighting a great potential of deep learning in drug discovery.
Affiliation(s)
- Yueshan Li
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
- Liting Zhang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
- Yifei Wang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
- Jun Zou
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
- Ruicheng Yang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
- Xinling Luo
  - Key Laboratory of Drug Targeting and Drug Delivery System of Ministry of Education, West China School of Pharmacy, Sichuan University, 610041 Chengdu, Sichuan, China
- Chengyong Wu
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
- Wei Yang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
- Chenyu Tian
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
- Haixing Xu
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
- Falu Wang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
- Xin Yang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
- Linli Li
  - Key Laboratory of Drug Targeting and Drug Delivery System of Ministry of Education, West China School of Pharmacy, Sichuan University, 610041 Chengdu, Sichuan, China
- Shengyong Yang
  - State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China
7. Kumar R, Sharma A, Alexiou A, Ashraf GM. Artificial Intelligence in De novo Drug Design: Are We Still There? Curr Top Med Chem 2022; 22:2483-2492. [PMID: 36263480 DOI: 10.2174/1568026623666221017143244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2022] [Revised: 09/06/2022] [Accepted: 09/15/2022] [Indexed: 01/20/2023]
Abstract
BACKGROUND: The artificial intelligence (AI)-assisted design of drug candidates with novel structures and desired properties has received significant attention in the recent past, as have related areas of forward prediction that aim to discover chemical matter worth synthesizing and investigating experimentally. OBJECTIVES: The purpose behind developing AI-driven models is to explore the broader chemical space and suggest new drug candidate scaffolds with promising therapeutic value. Moreover, it is anticipated that such AI-based models may not only significantly reduce the cost and time but also decrease the attrition rate of drug candidates that fail to reach the desirable endpoints at the final stages of drug development. In an attempt to develop AI-based models for de novo drug design, numerous methods have been proposed by various study groups by applying machine learning and deep learning algorithms to chemical datasets. However, there are many challenges in obtaining accurate predictions, and real breakthroughs in de novo drug design are still scarce. METHODS: In this review, we explore the recent trends in developing AI-based models for de novo drug design to assess the current status, challenges, and opportunities in the field. CONCLUSION: The consistently improved AI algorithms and the abundance of curated training chemical data indicate that AI-based de novo drug design should perform better than the current models. Improvements in performance are warranted to obtain better outcomes in the form of potential drug candidates that perform well under in vivo conditions, especially in the case of more complex diseases.
Affiliation(s)
- Rajnish Kumar
  - Amity Institute of Biotechnology, Amity University Uttar Pradesh Lucknow Campus, Uttar Pradesh, India
- Anju Sharma
  - Department of Applied Science, Indian Institute of Information Technology, Allahabad, Uttar Pradesh, India
- Athanasios Alexiou
  - Novel Global Community Educational Foundation, Hebersham, 2770 NSW, Australia
  - AFNP Med Austria, 1010 Wien, Austria
- Ghulam Md Ashraf
  - Pre-Clinical Research Unit (PCRU), King Fahd Medical Research Center, King Abdulaziz University, Jeddah, Saudi Arabia
  - Department of Medical Laboratory Technology, Faculty of Applied Medical Sciences, King Abdulaziz University, Jeddah, Saudi Arabia
8. Polanski J. Unsupervised Learning in Drug Design from Self-Organization to Deep Chemistry. Int J Mol Sci 2022; 23:2797. [PMID: 35269939 PMCID: PMC8910896 DOI: 10.3390/ijms23052797] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/03/2022] [Revised: 02/27/2022] [Accepted: 02/27/2022] [Indexed: 12/10/2022] Open
Abstract
The availability of computers has brought novel prospects in drug design. Neural networks (NN) were an early tool that cheminformatics tested for converting data into drugs. However, the initial interest faded for almost two decades. The recent success of Deep Learning (DL) has inspired a renaissance of neural networks for their potential application in deep chemistry. DL targets direct data analysis without any human intervention. Although back-propagation NN is the main algorithm in the DL that is currently being used, unsupervised learning can be even more efficient. We review self-organizing maps (SOM) in mapping molecular representations from the 1990s to the current deep chemistry. We discovered the enormous efficiency of SOM not only for features that could be expected by humans, but also for those that are not trivial to human chemists. We reviewed the DL projects in the current literature, especially unsupervised architectures. DL appears to be efficient in pattern recognition (Deep Face) or chess (Deep Blue). However, an efficient deep chemistry is still a matter for the future. This is because the availability of measured property data in chemistry is still limited.
Affiliation(s)
- Jaroslaw Polanski
  - Institute of Chemistry, Faculty of Science and Technology, University of Silesia, Szkolna 9, 40-006 Katowice, Poland
9. Oliveira AF, Da Silva JLF, Quiles MG. Molecular Property Prediction and Molecular Design Using a Supervised Grammar Variational Autoencoder. J Chem Inf Model 2022; 62:817-828. [PMID: 35174705 DOI: 10.1021/acs.jcim.1c01573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
Some of the most common applications of machine learning (ML) algorithms dealing with small molecules usually fall within two distinct domains, namely, the prediction of molecular properties and the design of novel molecules with some desirable property. Here we unite these applications under a single molecular representation and ML algorithm by modifying the grammar variational autoencoder (GVAE) model with the incorporation of property information into its training procedure, thus creating a supervised GVAE (SGVAE). Results indicate that the biased latent space generated by this approach can successfully be used to predict the molecular properties of the input molecules, produce novel and unique molecules with some desired property and also estimate the properties of randomly sampled molecules. We illustrate these possibilities by sampling novel molecules from the latent space with specific values of the lowest unoccupied molecular orbital (LUMO) energy after training the model using the QM9 data set. Furthermore, the trained model is also used to predict the properties of a hold-out set and the resulting mean absolute error (MAE) shows values close to chemical accuracy for the dipole moment and atomization energies, even outperforming ML models designed exclusively to predict molecular properties using SMILES as the molecular representation. Therefore, these results show that the proposed approach is a viable way to provide generative ML models with molecular property information in a way that the generation of novel molecules is likely to achieve better results, with the benefit that these new molecules can also have their molecular properties accurately predicted.
Affiliation(s)
- André F Oliveira
  - Associate Laboratory for Computing and Applied Mathematics, National Institute for Space Research, P.O. Box 515, 12227-010, São José dos Campos, SP, Brazil
- Juarez L F Da Silva
  - São Carlos Institute of Chemistry, University of São Paulo, P.O. Box 780, 13560-970, São Carlos, SP, Brazil
- Marcos G Quiles
  - Institute of Science and Technology, Federal University of São Paulo, 12247-014, São José dos Campos, SP, Brazil
10. Deep Learning Applied to Ligand-Based De Novo Drug Design. Methods Mol Biol 2021; 2390:273-299. [PMID: 34731474 DOI: 10.1007/978-1-0716-1787-8_12] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 10/19/2022]
Abstract
In recent years, the application of deep generative models to suggest virtual compounds has become a new and powerful tool in drug discovery projects. The idea behind this review is to offer an updated view on de novo design approaches based on artificial intelligence (AI) algorithms, with a particular focus on ligand-based methods. We start this review with a brief overview of the most relevant de novo design approaches developed before the use of AI techniques. We then describe the neural network architectures most commonly employed today in ligand-based de novo design, together with an up-to-date list of more than 100 deep generative models found in the literature (2017-2020). In order to show how deep generative approaches are applied in the drug discovery context, we report all the currently available studies in which generated compounds have been synthesized and their biological activity tested. Finally, we discuss what we envisage as beneficial future directions for further application of deep generative models in de novo drug design.
11. Buin A, Chiang HY, Gadsden SA, Alderson FA. Permutationally Invariant Deep Learning Approach to Molecular Fingerprinting with Application to Compound Mixtures. J Chem Inf Model 2021; 61:631-640. [PMID: 33539087 DOI: 10.1021/acs.jcim.0c01097] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Recent advancements in deep learning have led to widespread applications of its algorithms to synthetic planning and reaction predictions in the field of chemistry. One major area, known as supervised learning, is being explored for predicting certain properties such as reaction yields and types. Many chemical descriptors known as fingerprints are being explored as potential candidates for reaction properties prediction. However, there are few studies that describe the permutational invariance of chemical fingerprints, which are concatenated at some stage before being fed to deep learning architecture. In this work, we show that by utilizing permutational invariance, we consistently see improved results in terms of accuracy relative to previously published studies. Furthermore, we are able to accurately predict hydrogen peroxide loss with our own dataset, which consists of more than 20 ingredients in each chemical formulation.
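The idea of permutational invariance mentioned in this abstract can be illustrated with a small sketch. This is not the authors' model: it is a minimal, assumed example in plain Python in which each ingredient of a mixture is a toy 4-bit fingerprint with made-up values. Concatenating the fingerprints yields a different feature vector for every ordering of the ingredients, whereas element-wise sum pooling (one common permutation-invariant aggregation, chosen here only for illustration) yields the same vector regardless of order. The names `concat_features` and `sum_pool_features` are hypothetical.

```python
from itertools import permutations


def concat_features(fingerprints):
    """Order-dependent featurization: join fingerprints end to end."""
    return [bit for fp in fingerprints for bit in fp]


def sum_pool_features(fingerprints):
    """Order-independent featurization: add fingerprints element-wise."""
    return [sum(bits) for bits in zip(*fingerprints)]


# Hypothetical mixture of three ingredients, each a toy 4-bit fingerprint.
mixture = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]

# Featurize every possible ordering of the same mixture.
concat_variants = {
    tuple(concat_features(list(order))) for order in permutations(mixture)
}
pooled_variants = {
    tuple(sum_pool_features(list(order))) for order in permutations(mixture)
}

print(len(concat_variants))  # number of distinct concatenated vectors
print(len(pooled_variants))  # number of distinct pooled vectors
```

With concatenation, a downstream model would have to learn that all orderings describe the same formulation; with a pooled representation that burden disappears, at the cost of discarding which bit came from which ingredient.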
Affiliation(s)
- Andrei Buin
  - College of Engineering and Physical Sciences, University of Guelph, Guelph, Ontario N1G 2W1, Canada
- Hung Yi Chiang
  - College of Engineering and Physical Sciences, University of Guelph, Guelph, Ontario N1G 2W1, Canada
- S Andrew Gadsden
  - College of Engineering and Physical Sciences, University of Guelph, Guelph, Ontario N1G 2W1, Canada
- Faraz A Alderson
  - College of Engineering and Physical Sciences, University of Guelph, Guelph, Ontario N1G 2W1, Canada
12. Transformer neural network for protein-specific de novo drug generation as a machine translation problem. Sci Rep 2021; 11:321. [PMID: 33432013 PMCID: PMC7801439 DOI: 10.1038/s41598-020-79682-4] [Citation(s) in RCA: 78] [Impact Index Per Article: 26.0] [Received: 12/02/2019] [Accepted: 12/09/2020] [Indexed: 12/22/2022] Open
Abstract
Drug discovery for a protein target is a very laborious, long and costly process. Machine learning approaches and, in particular, deep generative networks can substantially reduce development time and costs. However, the majority of methods imply prior knowledge of protein binders, their physicochemical characteristics or the three-dimensional structure of the protein. The method proposed in this work generates novel molecules with predicted ability to bind a target protein by relying on its amino acid sequence only. We consider target-specific de novo drug design as a translational problem between the amino acid “language” and simplified molecular input line entry system representation of the molecule. To tackle this problem, we apply Transformer neural network architecture, a state-of-the-art approach in sequence transduction tasks. Transformer is based on a self-attention technique, which allows the capture of long-range dependencies between items in sequence. The model generates realistic diverse compounds with structural novelty. The computed physicochemical properties and common metrics used in drug discovery fall within the plausible drug-like range of values.
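The self-attention mechanism this abstract refers to can be sketched in a few lines. The following is a simplified illustration in plain Python, not the model from the paper: it uses scaled dot-product attention with identity projections (queries, keys and values are all the raw token vectors), whereas a real Transformer learns separate projection matrices and uses multiple heads. The token vectors are made-up values, and the function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Scaled dot-product self-attention with Q = K = V = tokens.

    Returns (outputs, weights), where weights[i][j] is how strongly
    position i attends to position j.
    """
    dim = len(tokens[0])
    outputs, weights = [], []
    for query in tokens:
        # Similarity of this position to every position in the sequence.
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in tokens
        ]
        row = softmax(scores)
        weights.append(row)
        # Each output is a weighted average of all value vectors.
        outputs.append(
            [sum(w * value[i] for w, value in zip(row, tokens)) for i in range(dim)]
        )
    return outputs, weights


# Toy sequence: the first and last tokens share the same embedding.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
outputs, weights = self_attention(tokens)
print(weights[0])  # attention from position 0 to positions 0, 1, 2
```

Because every position is compared with every other position directly, the first token attends to the matching last token as strongly as to itself, no matter how far apart they are; this is the sense in which attention captures long-range dependencies.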
13. Kim H, Kim E, Lee I, Bae B, Park M, Nam H. Artificial Intelligence in Drug Discovery: A Comprehensive Review of Data-driven and Machine Learning Approaches. Biotechnol Bioproc E 2021; 25:895-930. [PMID: 33437151 PMCID: PMC7790479 DOI: 10.1007/s12257-020-0049-y] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 02/13/2020] [Revised: 05/27/2020] [Accepted: 06/03/2020] [Indexed: 02/07/2023]
Abstract
As expenditure on drug development increases exponentially, the overall drug discovery process requires a sustainable revolution. Since artificial intelligence (AI) is leading the fourth industrial revolution, AI can be considered as a viable solution for unstable drug research and development. Generally, AI is applied to fields with sufficient data such as computer vision and natural language processing, but there are many efforts to revolutionize the existing drug discovery process by applying AI. This review provides a comprehensive, organized summary of the recent research trends in the AI-guided drug discovery process, including target identification, hit identification, ADMET prediction, lead optimization, and drug repositioning. The main data sources in each field are also summarized in this review. In addition, an in-depth analysis of the remaining challenges and limitations is provided, along with proposals for promising future directions in each of the aforementioned areas.
Affiliation(s)
- Hyunho Kim
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Eunyoung Kim
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Ingoo Lee
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Bongsung Bae
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Minsu Park
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
- Hojung Nam
  - School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology (GIST), Gwangju, 61005 Korea
14. Hudson IL. Data Integration Using Advances in Machine Learning in Drug Discovery and Molecular Biology. Methods Mol Biol 2021; 2190:167-184. [PMID: 32804365 DOI: 10.1007/978-1-0716-0826-5_7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 12/24/2022]
Abstract
While the term artificial intelligence and the concept of deep learning are not new, recent advances in high-performance computing, the availability of large annotated data sets required for training, and novel frameworks for implementing deep neural networks have led to an unprecedented acceleration of the field of molecular (network) biology and pharmacogenomics. The need to align biological data to innovative machine learning has stimulated developments in both data integration (fusion) and knowledge representation, in the form of heterogeneous, multiplex, and biological networks or graphs. In this chapter we briefly introduce several popular neural network architectures used in deep learning, namely, the fully connected deep neural network, recurrent neural network, convolutional neural network, and the autoencoder. Deep learning predictors, classifiers, and generators utilized in modern feature extraction may well assist interpretability and thus imbue AI tools with increased explication, potentially adding insights and advancements in novel chemistry and biology discovery. The capability of learning representations from structures directly without using any predefined structure descriptor is an important feature distinguishing deep learning from other machine learning methods and makes the traditional feature selection and reduction procedures unnecessary. In this chapter we briefly show how these technologies are applied for data integration (fusion) and analysis in drug discovery research covering these areas: (1) application of convolutional neural networks to predict ligand-protein interactions; (2) application of deep learning in compound property and activity prediction; (3) de novo design through deep learning.
We also: (1) discuss some aspects of future development of deep learning in drug discovery/chemistry; (2) provide references to published information; (3) provide recently advocated recommendations on using artificial intelligence and deep learning in -omics research and drug discovery.
Affiliation(s)
- Irene Lena Hudson
- Mathematical Sciences, School of Science, RMIT University, Melbourne, VIC, Australia.
|
15
|
Chen JH, Tseng YJ. Different molecular enumeration influences in deep learning: an example using aqueous solubility. Brief Bioinform 2020; 22:5851267. [PMID: 32501508 DOI: 10.1093/bib/bbaa092] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 03/11/2020] [Revised: 04/27/2020] [Accepted: 04/27/2020] [Indexed: 12/24/2022] Open
Abstract
Aqueous solubility is the key property driving many chemical and biological phenomena and impacts experimental and computational attempts to assess those phenomena. Accurate prediction of solubility is essential and challenging, even with modern computational algorithms. Fingerprint-based, feature-based and molecular graph-based representations have all been used with different deep learning methods for aqueous solubility prediction. It has been clearly demonstrated that different molecular representations impact the model prediction and explainability. In this work, we reviewed different representations and also focused on using graph and line notations for modeling. In general, one canonical chemical structure is used to represent one molecule when computing its properties. We carefully examined the commonly used simplified molecular-input line-entry specification (SMILES) notation representing a single molecule and proposed to use the full enumerations in SMILES to achieve better accuracy. A convolutional neural network (CNN) was used. The full enumeration of SMILES can improve the representation of a molecule by describing it from all possible angles. This CNN model can be very robust when dealing with large datasets since no additional explicit chemistry knowledge is necessary to predict the solubility. Also, traditionally it is hard to use a neural network to explain the contribution of chemical substructures to a single property. We demonstrated the use of attention in the decoding network to detect the part of a molecule that is relevant to solubility, which can be used to explain the contribution from the CNN.
|
16
|
Nahmias D, Cohen A, Nissim N, Elovici Y. Deep feature transfer learning for trusted and automated malware signature generation in private cloud environments. Neural Netw 2020; 124:243-257. [DOI: 10.1016/j.neunet.2020.01.003] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 08/16/2019] [Revised: 12/01/2019] [Accepted: 01/07/2020] [Indexed: 10/25/2022]
|
17
|
de Souza Neto LR, Moreira-Filho JT, Neves BJ, Maidana RLBR, Guimarães ACR, Furnham N, Andrade CH, Silva FP. In silico Strategies to Support Fragment-to-Lead Optimization in Drug Discovery. Front Chem 2020; 8:93. [PMID: 32133344 PMCID: PMC7040036 DOI: 10.3389/fchem.2020.00093] [Citation(s) in RCA: 101] [Impact Index Per Article: 25.3] [Received: 10/25/2019] [Accepted: 01/30/2020] [Indexed: 12/16/2022] Open
Abstract
Fragment-based drug (or lead) discovery (FBDD or FBLD) has developed in the last two decades to become a successful key technology in the pharmaceutical industry for early stage drug discovery and development. The FBDD strategy consists of screening low molecular weight compounds against macromolecular targets (usually proteins) of clinical relevance. These small molecular fragments can bind at one or more sites on the target and act as starting points for the development of lead compounds. In developing the fragments, attractive features that can translate into compounds with favorable physical, pharmacokinetic, and toxicity (ADMET: absorption, distribution, metabolism, excretion, and toxicity) properties can be integrated. Structure-enabled fragment screening campaigns use a combination of screening by a range of biophysical techniques, such as differential scanning fluorimetry, surface plasmon resonance, and thermophoresis, followed by structural characterization of fragment binding using NMR or X-ray crystallography. Structural characterization is also used in subsequent analysis for growing fragments of selected screening hits. The latest iteration of the FBDD workflow employs a high-throughput methodology of massively parallel screening by X-ray crystallography of individually soaked fragments. In this review we will outline the FBDD strategies and explore a variety of in silico approaches to support the follow-up fragment-to-lead optimization through growing, linking, or merging. These fragment expansion strategies include hot spot analysis, druggability prediction, SAR (structure-activity relationships) by catalog methods, application of machine learning/deep learning models for virtual screening and several de novo design methods for proposing synthesizable new compounds. Finally, we will highlight recent case studies in fragment-based drug discovery where in silico methods have successfully contributed to the development of lead compounds.
Affiliation(s)
- Lauro Ribeiro de Souza Neto
- LaBECFar – Laboratório de Bioquímica Experimental e Computacional de Fármacos, Instituto Oswaldo Cruz, Fundação Oswaldo Cruz, Rio de Janeiro, Brazil
| | - José Teófilo Moreira-Filho
- LabMol – Laboratory for Molecular Modeling and Drug Design, Faculdade de Farmácia, Universidade Federal de Goiás, Goiânia, Brazil
| | - Bruno Junior Neves
- LabMol – Laboratory for Molecular Modeling and Drug Design, Faculdade de Farmácia, Universidade Federal de Goiás, Goiânia, Brazil
- Laboratory of Cheminformatics, Centro Universitário de Anápolis – UniEVANGÉLICA, Anápolis, Brazil
| | - Rocío Lucía Beatriz Riveros Maidana
- LaBECFar – Laboratório de Bioquímica Experimental e Computacional de Fármacos, Instituto Oswaldo Cruz, Fundação Oswaldo Cruz, Rio de Janeiro, Brazil
- Laboratório de Genômica Funcional e Bioinformática, Instituto Oswaldo Cruz, Fundação Oswaldo Cruz, Rio de Janeiro, Brazil
| | - Ana Carolina Ramos Guimarães
- Laboratório de Genômica Funcional e Bioinformática, Instituto Oswaldo Cruz, Fundação Oswaldo Cruz, Rio de Janeiro, Brazil
| | - Nicholas Furnham
- Department of Infection Biology, London School of Hygiene and Tropical Medicine, London, United Kingdom
| | - Carolina Horta Andrade
- LabMol – Laboratory for Molecular Modeling and Drug Design, Faculdade de Farmácia, Universidade Federal de Goiás, Goiânia, Brazil
| | - Floriano Paes Silva
- LaBECFar – Laboratório de Bioquímica Experimental e Computacional de Fármacos, Instituto Oswaldo Cruz, Fundação Oswaldo Cruz, Rio de Janeiro, Brazil
|
18
|
Wang Y, Huang L, Jiang S, Wang Y, Zou J, Fu H, Yang S. Capsule Networks Showed Excellent Performance in the Classification of hERG Blockers/Nonblockers. Front Pharmacol 2020; 10:1631. [PMID: 32063849 PMCID: PMC6997788 DOI: 10.3389/fphar.2019.01631] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 09/27/2019] [Accepted: 12/13/2019] [Indexed: 02/05/2023] Open
Abstract
Capsule networks (CapsNets), a new class of deep neural network architectures proposed recently by Hinton et al., have shown great performance in many fields, particularly in image recognition and natural language processing. However, CapsNets have not yet been applied to drug discovery-related studies. As a first attempt, we adopted CapsNets in this investigation to develop classification models of hERG blockers/nonblockers; drugs with hERG blockade activity are thought to have a potential risk of cardiotoxicity. Two capsule network architectures were established: convolution-capsule network (Conv-CapsNet) and restricted Boltzmann machine-capsule networks (RBM-CapsNet), in which convolution and a restricted Boltzmann machine (RBM) were used as feature extractors, respectively. Two prediction models of hERG blockers/nonblockers were then developed by Conv-CapsNet and RBM-CapsNet with Doddareddy's training set composed of 2,389 compounds. The established models showed excellent performance in an independent test set comprising 255 compounds, with prediction accuracies of 91.8 and 92.2% for Conv-CapsNet and RBM-CapsNet models, respectively. Various comparisons were also made between our models and those developed by other machine learning methods including deep belief network (DBN), convolutional neural network (CNN), multilayer perceptron (MLP), support vector machine (SVM), k-nearest neighbors (kNN), logistic regression (LR), and LightGBM, and with different training sets. All the results showed that the models by Conv-CapsNet and RBM-CapsNet are among the best classification models. Overall, the excellent performance of capsule networks achieved in this investigation highlights their potential in drug discovery-related studies.
Affiliation(s)
- Yiwei Wang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, China
- College of Preclinical Medicine, Southwest Medical University, Luzhou, China
| | - Lei Huang
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
- Basic Teaching Department, Sichuan College of Architectural Technology, Deyang, China
| | - Siwen Jiang
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
| | - Yifei Wang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, China
| | - Jun Zou
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, China
| | - Hongguang Fu
- School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China
| | - Shengyong Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, China
|
19
|
Griffiths RR, Hernández-Lobato JM. Constrained Bayesian optimization for automatic chemical design using variational autoencoders. Chem Sci 2020; 11:577-586. [PMID: 32190274 PMCID: PMC7067240 DOI: 10.1039/c9sc04026a] [Citation(s) in RCA: 86] [Impact Index Per Article: 21.5] [Received: 08/12/2019] [Accepted: 11/15/2019] [Indexed: 12/15/2022] Open
Abstract
Automatic Chemical Design is a framework for generating novel molecules with optimized properties. The original scheme, featuring Bayesian optimization over the latent space of a variational autoencoder, suffers from the pathology that it tends to produce invalid molecular structures. First, we demonstrate empirically that this pathology arises when the Bayesian optimization scheme queries latent space points far away from the data on which the variational autoencoder has been trained. Secondly, by reformulating the search procedure as a constrained Bayesian optimization problem, we show that the effects of this pathology can be mitigated, yielding marked improvements in the validity of the generated molecules. We posit that constrained Bayesian optimization is a good approach for solving this kind of training set mismatch in many generative tasks involving Bayesian optimization over the latent space of a variational autoencoder.
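The idea of constraining the search can be illustrated with a small sketch. It assumes one common formulation, in which an expected-improvement acquisition value is multiplied by the predicted probability that a latent point decodes to a valid molecule. The abstract does not state which acquisition function the authors used, so the function names, the candidate points, and all numbers below are hypothetical.

```python
import math


def normal_pdf(z):
    """Standard normal probability density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_improvement(mu, sigma, best):
    """Expected improvement (maximization) under a Gaussian posterior N(mu, sigma^2)."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)
    z = (mu - best) / sigma
    return (mu - best) * normal_cdf(z) + sigma * normal_pdf(z)


def constrained_ei(mu, sigma, best, p_valid):
    """Assumed constrained acquisition: EI weighted by the probability of validity."""
    return expected_improvement(mu, sigma, best) * p_valid


# Hypothetical latent points: one close to the training data, one far from it.
candidates = [
    {"name": "near_data", "mu": 1.0, "sigma": 0.5, "p_valid": 0.9},
    {"name": "far_from_data", "mu": 1.5, "sigma": 1.0, "p_valid": 0.1},
]
best_so_far = 1.0

pick_unconstrained = max(
    candidates,
    key=lambda c: expected_improvement(c["mu"], c["sigma"], best_so_far),
)["name"]
pick_constrained = max(
    candidates,
    key=lambda c: constrained_ei(c["mu"], c["sigma"], best_so_far, c["p_valid"]),
)["name"]

print(pick_unconstrained, pick_constrained)
```

With these made-up numbers the unconstrained search prefers the point far from the data, while the constrained version prefers the point that is likely to decode to a valid structure, which mirrors the behavior the abstract describes.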
Affiliation(s)
- Ryan-Rhys Griffiths
- Cavendish Laboratory, Department of Physics, University of Cambridge, UK.
| | - José Miguel Hernández-Lobato
- Department of Engineering, University of Cambridge, UK.
- Alan Turing Institute, London, UK
- Microsoft Research, Cambridge, UK
|
20
|
Wang YW, Huang L, Jiang SW, Li K, Zou J, Yang SY. CapsCarcino: A novel sparse data deep learning tool for predicting carcinogens. Food Chem Toxicol 2020; 135:110921. [PMID: 31669597 DOI: 10.1016/j.fct.2019.110921] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 06/24/2019] [Revised: 09/21/2019] [Accepted: 10/23/2019] [Indexed: 12/11/2022]
Affiliation(s)
- Yi-Wei Wang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, PR China; College of Preclinical Medicine, Southwest Medical University, Luzhou, Sichuan, 646000, PR China
| | - Lei Huang
- School of Computer Science & Engineer, University of Electronic Science and Technology of China, Chengdu, Sichuan, 611731, PR China; Basic Teaching Department, Sichuan College of Architectural Technology, Deyang, Sichuan, 61800, PR China
| | - Si-Wen Jiang
- School of Computer Science & Engineer, University of Electronic Science and Technology of China, Chengdu, Sichuan, 611731, PR China
| | - Kan Li
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, PR China
| | - Jun Zou
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, PR China.
| | - Sheng-Yong Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan, 610041, PR China.
|
21
|
Cova TFGG, Pais AACC. Deep Learning for Deep Chemistry: Optimizing the Prediction of Chemical Patterns. Front Chem 2019; 7:809. [PMID: 32039134 PMCID: PMC6988795 DOI: 10.3389/fchem.2019.00809] [Citation(s) in RCA: 60] [Impact Index Per Article: 12.0] [Received: 07/16/2019] [Accepted: 11/11/2019] [Indexed: 12/14/2022] Open
Abstract
Computational Chemistry is currently a synergistic assembly between ab initio calculations, simulation, machine learning (ML) and optimization strategies for describing, solving and predicting chemical data and related phenomena. These include accelerated literature searches, analysis and prediction of physical and quantum chemical properties, transition states, chemical structures, chemical reactions, and also new catalysts and drug candidates. The generalization of scalability to larger chemical problems, rather than specialization, is now the main principle for transforming chemical tasks in multiple fronts, for which systematic and cost-effective solutions have benefited from ML approaches, including those based on deep learning (e.g. quantum chemistry, molecular screening, synthetic route design, catalysis, drug discovery). The latter class of ML algorithms is capable of combining raw input into layers of intermediate features, enabling bench-to-bytes designs with the potential to transform several chemical domains. In this review, the most exciting developments concerning the use of ML in a range of different chemical scenarios are described. A range of different chemical problems and respective rationalization, that have hitherto been inaccessible due to the lack of suitable analysis tools, is thus detailed, evidencing the breadth of potential applications of these emerging multidimensional approaches. Focus is given to the models, algorithms and methods proposed to facilitate research on compound design and synthesis, materials design, prediction of binding, molecular activity, and soft matter behavior. 
The information produced by pairing Chemistry and ML, through data-driven analyses, neural network predictions and monitoring of chemical systems, allows (i) prompting the ability to understand the complexity of chemical data, (ii) streamlining and designing experiments, (iii) discovering new molecular targets and materials, and also (iv) planning or rethinking forthcoming chemical challenges. In fact, optimization engulfs all these tasks directly.
Affiliation(s)
- Tânia F. G. G. Cova
- Coimbra Chemistry Centre, CQC, Department of Chemistry, Faculty of Sciences and Technology, University of Coimbra, Coimbra, Portugal
| | - Alberto A. C. C. Pais
- Coimbra Chemistry Centre, CQC, Department of Chemistry, Faculty of Sciences and Technology, University of Coimbra, Coimbra, Portugal
|
22
|
Gantzer P, Creton B, Nieto-Draghi C. Inverse-QSPR for de novo Design: A Review. Mol Inform 2019; 39:e1900087. [PMID: 31682079 DOI: 10.1002/minf.201900087] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/17/2019] [Accepted: 11/04/2019] [Indexed: 11/09/2022]
Abstract
The use of computer tools to solve chemistry-related problems has given rise to a large and increasing number of publications in recent decades. This new field of science is now well recognized and labelled Chemoinformatics. Among all chemoinformatics techniques, the use of statistics-based approaches for property prediction has been the subject of numerous studies reflecting both new developments and many applications. The predictive models so obtained, which relate a property to molecular features (descriptors), are gathered under the acronym QSPR, for Quantitative Structure Property Relationships. Apart from the obvious use of such models to predict property values for new compounds, their use to virtually synthesize new molecules (de novo design) is currently a high-interest subject. Inverse-QSPR (i-QSPR) methods have hence been developed to accelerate the discovery of new materials that meet a set of specifications. In the proposed manuscript, we review existing i-QSPR methodologies published in the open literature so as to highlight the developments, applications, improvements and limitations of each.
Affiliation(s)
- Philippe Gantzer
- IFP Energies nouvelles, 1 et 4 avenue de Bois-Préau, 92852, Rueil-Malmaison, France
| | - Benoit Creton
- IFP Energies nouvelles, 1 et 4 avenue de Bois-Préau, 92852, Rueil-Malmaison, France
| | - Carlos Nieto-Draghi
- IFP Energies nouvelles, 1 et 4 avenue de Bois-Préau, 92852, Rueil-Malmaison, France
|
23
|
Bian Y, Wang J, Jun JJ, Xie XQ. Deep Convolutional Generative Adversarial Network (dcGAN) Models for Screening and Design of Small Molecules Targeting Cannabinoid Receptors. Mol Pharm 2019; 16:4451-4460. [PMID: 31589460 DOI: 10.1021/acs.molpharmaceut.9b00500] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Indexed: 12/22/2022]
Abstract
A deep convolutional generative adversarial network (dcGAN) model was developed in this study to screen and design target-specific novel compounds for cannabinoid receptors. In the adversarial process of training, two models, the discriminator D and the generator G, are iteratively trained. D is trained to discover the hidden patterns among the input data so that it can accurately discriminate between authentic compounds and the "fake" compounds generated by G; G is trained to generate "fake" compounds to fool the well-trained D by optimizing the weights for matrix multiplication of data sampling. In order to determine the appropriate architecture and the input data structure for the involved convolutional neural networks (CNNs), the combinations of various network architectures and molecular fingerprints were explored. Well-developed CNN models including LeNet-5, AlexNet, ZFNet, and VGGNet were investigated. Four types of fingerprints, including MACCS, ECFP6, AtomPair, and AtomPair Count, were calculated to describe the small molecules with diverse structural characteristics. A remaining limitation of generating fingerprints as output is that they cannot be converted directly into concrete molecular structures, while the generative models with convolutional networks provide promising opportunities for the screening of molecules and rational modifications afterward. This study demonstrated how computer-aided drug discovery could benefit from the recent advances in deep learning.
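The adversarial objective described above can be sketched numerically. The snippet below assumes the standard binary cross-entropy GAN losses with the non-saturating generator loss; the abstract does not specify the exact loss used in the study, and the discriminator scores are made-up probabilities rather than outputs of a trained network.

```python
import math

EPS = 1e-12  # guards against log(0)


def discriminator_loss(d_real, d_fake):
    """Cross-entropy loss for D: push D(x) toward 1 and D(G(z)) toward 0."""
    real_term = -sum(math.log(p + EPS) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + EPS) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating loss for G: push D(G(z)) toward 1, i.e. fool D."""
    return -sum(math.log(p + EPS) for p in d_fake) / len(d_fake)


# Hypothetical discriminator scores (probability that an input is authentic).
scores_real = [0.9, 0.8]          # authentic compounds
scores_fake_caught = [0.1, 0.2]   # D recognizes the generated compounds
scores_fake_fooled = [0.6, 0.7]   # G has learned to fool D

print("D loss, fakes caught:", discriminator_loss(scores_real, scores_fake_caught))
print("D loss, fakes fooled:", discriminator_loss(scores_real, scores_fake_fooled))
print("G loss, fakes caught:", generator_loss(scores_fake_caught))
print("G loss, fakes fooled:", generator_loss(scores_fake_fooled))
```

The two losses move in opposite directions: when D catches the fakes its loss is low and G's is high, and the reverse holds once G fools D, which is why the two models are trained alternately.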
|
24
|
Fan Y, Zhang Y, Hua Y, Wang Y, Zhu L, Zhao J, Yang Y, Chen X, Lu S, Lu T, Chen Y, Liu H. Investigation of Machine Intelligence in Compound Cell Activity Classification. Mol Pharm 2019; 16:4472-4484. [PMID: 31580683 DOI: 10.1021/acs.molpharmaceut.9b00558] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/16/2022]
Abstract
Machine intelligence has been greatly developed in the past decades and has been widely used in many fields. In recent years, many reports have shown its satisfactory effect in drug discovery. In this study, machine intelligence methods were explored to assist cell activity prediction. Multiple machine intelligence methods including support vector machine, decision tree, random forest, extra trees, gradient boosting machine, convolutional neural network, long short-term memory network, and gated recurrent unit network were employed to separate compounds based on their cell activity. Different from some reported classification models, compounds were expressed as a string by the simplified molecular input line entry system and directly used as input rather than any chemical descriptors, which mimicked natural language processing. Both the single cell strain and whole data set under the balanced and imbalanced data distributions were discussed, respectively. Different activity cutoffs were set for the single (Z-score = 3) and the whole (Z-score = 5 and 6) data set. Nine metrics were used to evaluate the models including accuracy, precision, recall, f1-score, area under the receiver operating characteristic curve score, Cohen's κ, Brier score, Matthews correlation coefficient, and balanced accuracy. The results show that the gradient boosting machine is competent at balanced data distribution, and convolutional neural network is qualified for the imbalanced one. The results demonstrate that both classic machine learning methods and deep learning methods have potential in classification of compound cell activity.
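Several of the metrics listed above can be computed directly from a binary confusion matrix, and doing so shows why accuracy alone is misleading on imbalanced data. The sketch below uses the textbook definitions; the function name and the confusion-matrix counts are hypothetical and are not taken from the study. The ROC AUC and the Brier score are omitted because they need predicted probabilities rather than counts.

```python
import math


def classification_metrics(tp, fp, fn, tn):
    """Compute count-based metrics from a binary confusion matrix."""

    def ratio(num, den):
        # Return 0.0 instead of raising when a denominator is empty.
        return num / den if den else 0.0

    total = tp + fp + fn + tn
    accuracy = ratio(tp + tn, total)
    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)            # sensitivity
    specificity = ratio(tn, tn + fp)
    f1 = ratio(2 * precision * recall, precision + recall)
    balanced_accuracy = (recall + specificity) / 2
    mcc = ratio(
        tp * tn - fp * fn,
        math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)),
    )
    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_expected = ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), total * total)
    kappa = ratio(accuracy - p_expected, 1 - p_expected)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "balanced_accuracy": balanced_accuracy,
        "mcc": mcc,
        "kappa": kappa,
    }


# Hypothetical imbalanced test set: 10 active and 90 inactive compounds.
metrics = classification_metrics(tp=5, fp=5, fn=5, tn=85)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For these made-up counts the accuracy is 0.90, while balanced accuracy is about 0.72 and both the Matthews correlation coefficient and Cohen's κ are about 0.44, so the chance-corrected metrics give a more sober picture of performance on the minority class.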
Affiliation(s)
- Yuanrong Fan
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Yanmin Zhang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Yi Hua
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Yuchen Wang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Lu Zhu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Junnan Zhao
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Yan Yang
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Xingye Chen
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Shuai Lu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Tao Lu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China; State Key Laboratory of Natural Medicines, China Pharmaceutical University, 24 Tongjiaxiang, Nanjing 210009, China
| | - Yadong Chen
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
| | - Haichun Liu
- Laboratory of Molecular Design and Drug Discovery, School of Science, China Pharmaceutical University, 639 Longmian Avenue, Nanjing 211198, China
|
25
|
Deep learning enables rapid identification of potent DDR1 kinase inhibitors. Nat Biotechnol 2019; 37:1038-1040. [PMID: 31477924 DOI: 10.1038/s41587-019-0224-x] [Citation(s) in RCA: 516] [Impact Index Per Article: 103.2] [Received: 11/01/2018] [Accepted: 07/12/2019] [Indexed: 11/08/2022]
Abstract
We have developed a deep generative model, generative tensorial reinforcement learning (GENTRL), for de novo small-molecule design. GENTRL optimizes synthetic feasibility, novelty, and biological activity. We used GENTRL to discover potent inhibitors of discoidin domain receptor 1 (DDR1), a kinase target implicated in fibrosis and other diseases, in 21 days. Four compounds were active in biochemical assays, and two were validated in cell-based assays. One lead candidate was tested and demonstrated favorable pharmacokinetics in mice.
|
26
|
Zhavoronkov A. Artificial Intelligence for Drug Discovery, Biomarker Development, and Generation of Novel Chemistry. Mol Pharm 2019; 15:4311-4313. [PMID: 30269508 DOI: 10.1021/acs.molpharmaceut.8b00930] [Citation(s) in RCA: 73] [Impact Index Per Article: 14.6] [Indexed: 11/30/2022]
Affiliation(s)
- Alex Zhavoronkov
- JHU, Insilico Medicine, Inc., 9601 Medical Center Dr, Suite 127, Rockville, Maryland 20850, United States
|
27
|
Yang X, Wang Y, Byrne R, Schneider G, Yang S. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem Rev 2019; 119:10520-10594. [PMID: 31294972 DOI: 10.1021/acs.chemrev.8b00728] [Citation(s) in RCA: 351] [Impact Index Per Article: 70.2] [Indexed: 02/05/2023]
Abstract
Artificial intelligence (AI), and, in particular, deep learning as a subcategory of AI, provides opportunities for the discovery and development of innovative drugs. Various machine learning approaches have recently (re)emerged, some of which may be considered instances of domain-specific AI which have been successfully employed for drug discovery and design. This review provides a comprehensive portrayal of these machine learning techniques and of their applications in medicinal chemistry. After introducing the basic principles, alongside some application notes, of the various machine learning algorithms, the current state-of-the art of AI-assisted pharmaceutical discovery is discussed, including applications in structure- and ligand-based virtual screening, de novo drug design, physicochemical and pharmacokinetic property prediction, drug repurposing, and related aspects. Finally, several challenges and limitations of the current methods are summarized, with a view to potential future directions for AI-assisted drug discovery and design.
Affiliation(s)
- Xin Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
| | - Yifei Wang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
| | - Ryan Byrne
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
| | - Gisbert Schneider
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
| | - Shengyong Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
|
28
|
Awale M, Sirockin F, Stiefl N, Reymond JL. Drug Analogs from Fragment-Based Long Short-Term Memory Generative Neural Networks. J Chem Inf Model 2019; 59:1347-1356. [DOI: 10.1021/acs.jcim.8b00902] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Indexed: 01/10/2023]
Affiliation(s)
- Mahendra Awale
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
| | - Finton Sirockin
- Novartis Institutes for Biomedical Research, CH-4002 Basel, Switzerland
| | - Nikolaus Stiefl
- Novartis Institutes for Biomedical Research, CH-4002 Basel, Switzerland
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012 Bern, Switzerland
|
29
|
Brown N, Fiscato M, Segler MHS, Vaucher AC. GuacaMol: Benchmarking Models for de Novo Molecular Design. J Chem Inf Model 2019; 59:1096-1108. [PMID: 30887799 DOI: 10.1021/acs.jcim.8b00839] [Citation(s) in RCA: 309] [Impact Index Per Article: 61.8] [Indexed: 12/20/2022]
Abstract
De novo design seeks to generate molecules with required property profiles by virtual design-make-test cycles. With the emergence of deep learning and neural generative models in many application areas, models for molecular design based on neural networks appeared recently and show promising results. However, the new models have not been profiled on consistent tasks, and comparative studies to well-established algorithms have only seldom been performed. To standardize the assessment of both classical and neural models for de novo molecular design, we propose an evaluation framework, GuacaMol, based on a suite of standardized benchmarks. The benchmark tasks encompass measuring the fidelity of the models to reproduce the property distribution of the training sets, the ability to generate novel molecules, the exploration and exploitation of chemical space, and a variety of single and multiobjective optimization tasks. The benchmarking open-source Python code and a leaderboard can be found on https://benevolent.ai/guacamol.
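Two of the simpler quantities behind "the ability to generate novel molecules" can be illustrated with set operations. The sketch below is not GuacaMol's code or API: the function names and the SMILES strings are hypothetical, and it assumes every string is already a valid canonical SMILES, so that string equality stands in for molecular identity. A real benchmark would first canonicalize and validate the structures with a cheminformatics toolkit.

```python
def uniqueness(generated):
    """Fraction of generated molecules that are distinct."""
    if not generated:
        return 0.0
    return len(set(generated)) / len(generated)


def novelty(generated, training):
    """Fraction of distinct generated molecules that are absent from the training set."""
    distinct = set(generated)
    if not distinct:
        return 0.0
    return len(distinct - set(training)) / len(distinct)


# Hypothetical canonical SMILES strings.
training_set = ["CCO", "CCC"]
generated_set = ["CCO", "CCO", "CCN", "c1ccccc1", "CCC"]

print("uniqueness:", uniqueness(generated_set))
print("novelty:", novelty(generated_set, training_set))
```

In this toy example four of the five generated strings are distinct (uniqueness 0.8), and two of those four do not occur in the training set (novelty 0.5).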
Affiliation(s)
- Nathan Brown
- BenevolentAI, 4-8 Maple Street, W1T 5HD London, U.K.
| | - Marco Fiscato
- BenevolentAI, 4-8 Maple Street, W1T 5HD London, U.K.
|