1
Isigkeit L, Hörmann T, Schallmayer E, Scholz K, Lillich FF, Ehrler JHM, Hufnagel B, Büchner J, Marschner JA, Pabel J, Proschak E, Merk D. Automated design of multi-target ligands by generative deep learning. Nat Commun 2024; 15:7946. [PMID: 39261471; PMCID: PMC11390726; DOI: 10.1038/s41467-024-52060-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Accepted: 08/23/2024] [Indexed: 09/13/2024] Open
Abstract
Generative deep learning models enable data-driven de novo design of molecules with tailored features. Chemical language models (CLMs) trained on string representations of molecules such as SMILES have been successfully employed to design new chemical entities with experimentally confirmed activity on intended targets. Here, we probe the application of CLMs to generate multi-target ligands for designed polypharmacology. We capitalize on the ability of CLMs to learn from small fine-tuning sets of molecules and successfully bias the model towards designing drug-like molecules with similarity to known ligands of target pairs of interest. Designs obtained from CLMs after pooled fine-tuning are predicted to be active on both proteins of interest and comprise pharmacophore elements of ligands for both targets in one molecule. Synthesis and testing of twelve computationally favored CLM designs for six target pairs reveal modulation of at least one intended protein by all selected designs with up to double-digit nanomolar potency and confirm seven compounds as designed dual ligands. These results corroborate CLMs for multi-target de novo design as a source of innovation in drug discovery.
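As a rough illustration of what a string representation means for a chemical language model, the sketch below splits a SMILES string into the tokens such a model would consume. The regular expression and the function name `tokenize_smiles` are assumptions made for this example, not the tokenizer used in the cited study; the pattern only covers common organic-subset atoms, bracket atoms, bonds, branches, and ring closures.

```python
import re
from typing import List

# Hypothetical token pattern (an assumption for illustration):
# bracket atoms first, then two-letter halogens, two-digit ring
# closures, single-letter atoms, bond/branch symbols, and digits.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|%\d{2}|[BCNOPSFIbcnops]|[=#$/\\\-+().:~@?>*]|\d)"
)


def tokenize_smiles(smiles: str) -> List[str]:
    """Split a SMILES string into tokens; reject unrecognized characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    # If the tokens do not rebuild the input, some character was skipped.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


if __name__ == "__main__":
    # Aspirin, written as a SMILES string.
    print(tokenize_smiles("CC(=O)Oc1ccccc1C(=O)O"))
```

Matching bracket atoms and two-letter halogens before single letters keeps `Cl` and `[NH3+]` as single tokens instead of splitting them into unrelated atoms.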
Affiliation(s)
- Laura Isigkeit
  - Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, 60438, Frankfurt, Germany
- Tim Hörmann
  - Ludwig-Maximilians-Universität München, Department of Pharmacy, 81377, Munich, Germany
- Espen Schallmayer
  - Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, 60438, Frankfurt, Germany
- Katharina Scholz
  - Ludwig-Maximilians-Universität München, Department of Pharmacy, 81377, Munich, Germany
- Felix F Lillich
  - Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, 60438, Frankfurt, Germany
  - Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, 60596, Frankfurt, Germany
- Johanna H M Ehrler
  - Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, 60438, Frankfurt, Germany
- Benedikt Hufnagel
  - Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, 60438, Frankfurt, Germany
- Jasmin Büchner
  - Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, 60438, Frankfurt, Germany
- Julian A Marschner
  - Ludwig-Maximilians-Universität München, Department of Pharmacy, 81377, Munich, Germany
- Jörg Pabel
  - Ludwig-Maximilians-Universität München, Department of Pharmacy, 81377, Munich, Germany
- Ewgenij Proschak
  - Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, 60438, Frankfurt, Germany
  - Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, 60596, Frankfurt, Germany
- Daniel Merk
  - Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, 60438, Frankfurt, Germany.
  - Ludwig-Maximilians-Universität München, Department of Pharmacy, 81377, Munich, Germany.
2
Lavecchia A. Navigating the frontier of drug-like chemical space with cutting-edge generative AI models. Drug Discov Today 2024; 29:104133. [PMID: 39103144; DOI: 10.1016/j.drudis.2024.104133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Revised: 07/20/2024] [Accepted: 07/31/2024] [Indexed: 08/07/2024]
Abstract
Deep generative models (GMs) have transformed the exploration of drug-like chemical space (CS) by generating novel molecules through complex, nontransparent processes, bypassing direct structural similarity. This review examines five key architectures for CS exploration: recurrent neural networks (RNNs), variational autoencoders (VAEs), generative adversarial networks (GANs), normalizing flows (NF), and Transformers. It discusses molecular representation choices, training strategies for focused CS exploration, evaluation criteria for CS coverage, and related challenges. Future directions include refining models, exploring new notations, improving benchmarks, and enhancing interpretability to better understand biologically relevant molecular properties.
Affiliation(s)
- Antonio Lavecchia
  - 'Drug Discovery' Laboratory, Department of Pharmacy, University of Naples Federico II, I-80131 Naples, Italy.
3
Mervin L, Voronov A, Kabeshov M, Engkvist O. QSARtuna: An Automated QSAR Modeling Platform for Molecular Property Prediction in Drug Design. J Chem Inf Model 2024; 64:5365-5374. [PMID: 38950185; DOI: 10.1021/acs.jcim.4c00457] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/03/2024]
Abstract
Machine-learning (ML) and deep-learning (DL) approaches are increasingly deployed within the design-make-test-analyze (DMTA) drug design cycle to predict molecular properties of small molecules. Despite this uptake, there are only a few automated packages to aid their development and deployment that also support uncertainty estimation, model explainability, and other key aspects of model usage. This represents a key unmet need within the field, and the large number of molecular representations and algorithms (and associated parameters) means it is nontrivial to robustly optimize, evaluate, reproduce, and deploy models. Here, we present QSARtuna, a molecule property prediction modeling pipeline, written in Python and utilizing the Optuna, Scikit-learn, RDKit, and ChemProp packages, which enables efficient and automated comparison of molecular representations and machine learning models. The platform was designed from the outset around the increasingly important aspects of model uncertainty quantification and explainability. We describe our framework and provide illustrative examples to demonstrate the capability of the software when applied to simple molecular property prediction, reaction/reactivity prediction, and DNA-encoded library enrichment classification. We hope that the release of QSARtuna will further spur innovation in automatic ML modeling and provide a platform for education of best practices in molecular property modeling. The code for the QSARtuna framework is made freely available via GitHub.
Affiliation(s)
- Lewis Mervin
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Cambridge CB2 0AA, United Kingdom
- Alexey Voronov
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 412 96, Sweden
- Mikhail Kabeshov
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 412 96, Sweden
- Ola Engkvist
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg 412 96, Sweden
  - Department of Computer Science and Engineering, University of Gothenburg, Chalmers University of Technology, Gothenburg 412 96, Sweden
4
Özçelik R, de Ruiter S, Criscuolo E, Grisoni F. Chemical language modeling with structured state space sequence models. Nat Commun 2024; 15:6176. [PMID: 39039051; PMCID: PMC11263548; DOI: 10.1038/s41467-024-50469-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Accepted: 07/05/2024] [Indexed: 07/24/2024] Open
Abstract
Generative deep learning is reshaping drug design. Chemical language models (CLMs) - which generate molecules in the form of molecular strings - bear particular promise for this endeavor. Here, we introduce a recent deep learning architecture, termed Structured State Space Sequence (S4) model, into de novo drug design. In addition to its unprecedented performance in various fields, S4 has shown remarkable capabilities to learn the global properties of sequences. This aspect is intriguing in chemical language modeling, where complex molecular properties like bioactivity can 'emerge' from separated portions in the molecular string. This observation gives rise to the following question: Can S4 advance chemical language modeling for de novo design? To provide an answer, we systematically benchmark S4 against state-of-the-art CLMs on an array of drug discovery tasks, such as the identification of bioactive compounds, and the design of drug-like molecules and natural products. S4 shows a superior capacity to learn complex molecular properties, while at the same time exploring diverse scaffolds. Finally, when applied prospectively to kinase inhibition, S4 designs eight out of ten molecules that are predicted as highly active by molecular dynamics simulations. Taken together, these findings advocate for the introduction of S4 into chemical language modeling - uncovering its untapped potential in the molecular sciences.
Affiliation(s)
- Rıza Özçelik
  - Institute for Complex Molecular Systems and Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
  - Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Utrecht, The Netherlands
- Sarah de Ruiter
  - Institute for Complex Molecular Systems and Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Emanuele Criscuolo
  - Institute for Complex Molecular Systems and Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Francesca Grisoni
  - Institute for Complex Molecular Systems and Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands.
  - Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Utrecht, The Netherlands.
5
Catacutan DB, Alexander J, Arnold A, Stokes JM. Machine learning in preclinical drug discovery. Nat Chem Biol 2024:10.1038/s41589-024-01679-1. [PMID: 39030362; DOI: 10.1038/s41589-024-01679-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/27/2023] [Accepted: 06/13/2024] [Indexed: 07/21/2024]
Abstract
Drug-discovery and drug-development endeavors are laborious, costly and time consuming. These programs can take upward of 12 years and cost US $2.5 billion, with a failure rate of more than 90%. Machine learning (ML) presents an opportunity to improve the drug-discovery process. Indeed, with the growing abundance of public and private large-scale biological and chemical datasets, ML techniques are becoming well positioned as useful tools that can augment the traditional drug-development process. In this Perspective, we discuss the integration of algorithmic methods throughout the preclinical phases of drug discovery. Specifically, we highlight an array of ML-based efforts, across diverse disease areas, to accelerate initial hit discovery, mechanism-of-action (MOA) elucidation and chemical property optimization. With advances in the application of ML across diverse therapeutic areas, we posit that fully ML-integrated drug-discovery pipelines will define the future of drug-development programs.
Affiliation(s)
- Denise B Catacutan
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
- Jeremie Alexander
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
- Autumn Arnold
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada
- Jonathan M Stokes
  - Department of Biochemistry and Biomedical Sciences, McMaster University, Hamilton, Ontario, Canada.
  - Michael G. DeGroote Institute for Infectious Disease Research, McMaster University, Hamilton, Ontario, Canada.
  - David Braley Centre for Antibiotic Discovery, McMaster University, Hamilton, Ontario, Canada.
6
Abbas MKG, Rassam A, Karamshahi F, Abunora R, Abouseada M. The Role of AI in Drug Discovery. Chembiochem 2024; 25:e202300816. [PMID: 38735845; DOI: 10.1002/cbic.202300816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2023] [Revised: 05/09/2024] [Accepted: 05/10/2024] [Indexed: 05/14/2024]
Abstract
The emergence of Artificial Intelligence (AI) in drug discovery marks a pivotal shift in pharmaceutical research, blending sophisticated computational techniques with conventional scientific exploration to break through enduring obstacles. This review paper elucidates the multifaceted applications of AI across various stages of drug development, highlighting significant advancements and methodologies. It delves into AI's instrumental role in drug design, polypharmacology, chemical synthesis, drug repurposing, and the prediction of drug properties such as toxicity, bioactivity, and physicochemical characteristics. Despite AI's promising advancements, the paper also addresses the challenges and limitations encountered in the field, including data quality, generalizability, computational demands, and ethical considerations. By offering a comprehensive overview of AI's role in drug discovery, this paper underscores the technology's potential to significantly enhance drug development, while also acknowledging the hurdles that must be overcome to fully realize its benefits.
Affiliation(s)
- M K G Abbas
  - Center for Advanced Materials, Qatar University, P.O. Box, 2713, Doha, Qatar
- Abrar Rassam
  - Secondary Education, Educational Sciences, Qatar University, P.O. Box, 2713, Doha, Qatar
- Fatima Karamshahi
  - Department of Chemistry and Earth Sciences, Qatar University, P.O. Box, 2713, Doha, Qatar
- Rehab Abunora
  - Faculty of Medicine, General Medicine and Surgery, Helwan University, Cairo, Egypt
- Maha Abouseada
  - Department of Chemistry and Earth Sciences, Qatar University, P.O. Box, 2713, Doha, Qatar
7
Huang ETC, Yang JS, Liao KYK, Tseng WCW, Lee CK, Gill M, Compas C, See S, Tsai FJ. Predicting blood-brain barrier permeability of molecules with a large language model and machine learning. Sci Rep 2024; 14:15844. [PMID: 38982309; PMCID: PMC11233737; DOI: 10.1038/s41598-024-66897-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Accepted: 07/05/2024] [Indexed: 07/11/2024] Open
Abstract
Predicting the blood-brain barrier (BBB) permeability of small-molecule compounds using a novel artificial intelligence platform is necessary for drug discovery. Artificial intelligence (AI) tools based on machine learning and large language models improve accuracy and shorten the time needed for new drug development. The primary goal of this research is to develop AI computing models and novel deep learning architectures capable of predicting whether molecules can permeate the human BBB. The in silico (computational) and in vitro (experimental) results were validated by the Natural Products Research Laboratories (NPRL) at China Medical University Hospital (CMUH). The transformer-based MegaMolBART was used as the simplified molecular input line entry system (SMILES) encoder with an XGBoost classifier as an in silico method to check whether a molecule could cross the BBB. As a baseline encoder for comparison, we used Morgan (circular) fingerprints, which apply the Morgan algorithm to a set of atomic invariants, also with an XGBoost classifier. BBB permeability was assessed in vitro using three-dimensional (3D) human BBB spheroids (human brain microvascular endothelial cells, brain vascular pericytes, and astrocytes). Using multiple BBB databases, the final in silico transformer and XGBoost model achieved an area under the receiver operating characteristic curve of 0.88 on the held-out test dataset. Temozolomide (TMZ) and 21 randomly selected BBB-permeable compounds (Pred scores = 1, indicating BBB-permeable) from the NPRL penetrated human BBB spheroid cells. No evidence suggests that ferulic acid or five BBB-impermeable compounds (Pred scores < 1.29423E-05, which designate compounds that do not pass through the human BBB) can pass through the spheroid cells of the BBB. Our in vitro validation experiments indicated that the in silico prediction of small-molecule permeation in the BBB model is accurate.
Transformer-based models like MegaMolBART, leveraging the SMILES representations of molecules, show great promise for applications in new drug discovery. These models have the potential to accelerate the development of novel targeted treatments for disorders of the central nervous system.
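To make the reported metric concrete, the following sketch computes the area under the receiver operating characteristic curve from binary labels and prediction scores using the rank-based (Mann-Whitney) formulation. It is an illustrative assumption, not the evaluation code of the cited study; the function name `roc_auc` and the toy labels and scores are invented for the example.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = BBB-permeable, 0 = BBB-impermeable.
    y_true = [1, 1, 0, 0]
    y_score = [0.9, 0.4, 0.6, 0.1]
    print(roc_auc(y_true, y_score))
```

The pairwise loop is quadratic in the number of compounds, which is acceptable for a small held-out set; a sort-based ranking would be the usual choice for larger data.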
Affiliation(s)
- Eddie T C Huang
  - NVIDIA AI Technology Center, NVIDIA Corporation, Santa Clara, USA
- Jai-Sing Yang
  - Department of Medical Research, China Medical University Hospital, China Medical University, Taichung, Taiwan
- Ken Y K Liao
  - NVIDIA AI Technology Center, NVIDIA Corporation, Santa Clara, USA
- Warren C W Tseng
  - NVIDIA AI Technology Center, NVIDIA Corporation, Santa Clara, USA
- C K Lee
  - NVIDIA AI Technology Center, NVIDIA Corporation, Santa Clara, USA
- Michelle Gill
  - NVIDIA AI Technology Center, NVIDIA Corporation, Santa Clara, USA
- Colin Compas
  - NVIDIA AI Technology Center, NVIDIA Corporation, Santa Clara, USA
- Simon See
  - NVIDIA AI Technology Center, NVIDIA Corporation, Santa Clara, USA
- Fuu-Jen Tsai
  - School of Chinese Medicine, College of Chinese Medicine, China Medical University, China Medical University Children's Hospital, No. 2, Yude Road, Taichung, 404332, Taiwan.
  - China Medical University Children's Hospital, Taichung, Taiwan.
8
Ai C, Yang H, Liu X, Dong R, Ding Y, Guo F. MTMol-GPT: De novo multi-target molecular generation with transformer-based generative adversarial imitation learning. PLoS Comput Biol 2024; 20:e1012229. [PMID: 38924082; PMCID: PMC11233020; DOI: 10.1371/journal.pcbi.1012229] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/04/2023] [Revised: 07/09/2024] [Accepted: 06/03/2024] [Indexed: 06/28/2024] Open
Abstract
De novo drug design, which aims to generate new drugs with specific pharmacological properties, is crucial in advancing drug discovery. Recently, deep generative models have achieved inspiring progress in generating drug-like compounds. However, these models prioritize single-target drug generation for pharmacological intervention, neglecting the complicated inherent mechanisms of diseases, which are influenced by multiple factors. Consequently, developing novel multi-target drugs that simultaneously act on specific targets can enhance anti-tumor efficacy and address issues related to resistance mechanisms. To address this issue and inspired by Generative Pre-trained Transformer (GPT) models, we propose an upgraded GPT model with generative adversarial imitation learning for multi-target molecular generation called MTMol-GPT. The multi-target molecular generator employs a dual discriminator model using the Inverse Reinforcement Learning (IRL) method for concurrent multi-target molecular generation. Extensive results show that MTMol-GPT generates various valid, novel, and effective multi-target molecules for various complex diseases, demonstrating robustness and generalization capability. In addition, molecular docking and pharmacophore mapping experiments demonstrate the drug-likeness and effectiveness of the generated molecules, which could potentially improve neuropsychiatric interventions. Furthermore, our model's generalizability is exemplified by a case study focusing on multi-target drug design for breast cancer. As a broadly applicable solution for multiple targets, MTMol-GPT provides new insight into future directions to enhance potential complex disease therapeutics by generating high-quality multi-target molecules in drug discovery.
Affiliation(s)
- Chengwei Ai
  - School of computer science and engineering, Central South University, Changsha, China
- Hongpeng Yang
  - Department of computer science and engineering, University of South Carolina, Columbia, South Carolina, United States of America
- Xiaoyi Liu
  - School of Chinese Materia Medica, Beijing University of Chinese Medicine, Beijing, China
  - Ministry of Education, Engineering Research Center for Pharmaceutics of Chinese Materia Medica and New Drug Development, Beijing, China
- Ruihan Dong
  - Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, China
- Yijie Ding
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, China
- Fei Guo
  - School of computer science and engineering, Central South University, Changsha, China
9
Gangwal A, Lavecchia A. Unleashing the power of generative AI in drug discovery. Drug Discov Today 2024; 29:103992. [PMID: 38663579; DOI: 10.1016/j.drudis.2024.103992] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Revised: 03/22/2024] [Accepted: 04/18/2024] [Indexed: 05/04/2024]
Abstract
Artificial intelligence (AI) is revolutionizing drug discovery by enhancing precision, reducing timelines and costs, and enabling AI-driven computer-aided drug design. This review focuses on recent advancements in deep generative models (DGMs) for de novo drug design, exploring diverse algorithms and their profound impact. It critically analyses the challenges that are intricately interwoven into these technologies, proposing strategies to unlock their full potential. It features case studies of both successes and failures in advancing drugs to clinical trials with AI assistance. Last, it outlines a forward-looking plan for optimizing DGMs in de novo drug design, thereby fostering faster and more cost-effective drug development.
Affiliation(s)
- Amit Gangwal
  - Department of Natural Product Chemistry, Shri Vile Parle Kelavani Mandal's Institute of Pharmacy, Dhule 424001, Maharashtra, India
- Antonio Lavecchia
  - "Drug Discovery" Laboratory, Department of Pharmacy, University of Naples Federico II, I-80131 Naples, Italy.
10
Oniani D, Hilsman J, Zang C, Wang J, Cai L, Zawala J, Wang Y. Emerging opportunities of using large language models for translation between drug molecules and indications. Sci Rep 2024; 14:10738. [PMID: 38730226; PMCID: PMC11087469; DOI: 10.1038/s41598-024-61124-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Accepted: 05/02/2024] [Indexed: 05/12/2024] Open
Abstract
A drug molecule is a substance that changes an organism's mental or physical state. Every approved drug has an indication, which refers to the therapeutic use of that drug for treating a particular medical condition. While the Large Language Model (LLM), a generative Artificial Intelligence (AI) technique, has recently demonstrated effectiveness in translating between molecules and their textual descriptions, there remains a gap in research regarding their application in facilitating the translation between drug molecules and indications (which describes the disease, condition or symptoms for which the drug is used), or vice versa. Addressing this challenge could greatly benefit the drug discovery process. The capability of generating a drug from a given indication would allow for the discovery of drugs targeting specific diseases or targets and ultimately provide patients with better treatments. In this paper, we first propose a new task, the translation between drug molecules and corresponding indications, and then test existing LLMs on this new task. Specifically, we consider nine variations of the T5 LLM and evaluate them on two public datasets obtained from ChEMBL and DrugBank. Our experiments show the early results of using LLMs for this task and provide a perspective on the state-of-the-art. We also emphasize the current limitations and discuss future work that has the potential to improve the performance on this task. The creation of molecules from indications, or vice versa, will allow for more efficient targeting of diseases and significantly reduce the cost of drug discovery, with the potential to revolutionize the field of drug discovery in the era of generative AI.
Affiliation(s)
- David Oniani
  - Department of Health Information Management, University of Pittsburgh, Pittsburgh, PA, USA
- Jordan Hilsman
  - Department of Health Information Management, University of Pittsburgh, Pittsburgh, PA, USA
- Chengxi Zang
  - Department of Population Health Sciences, Weill Cornell Medicine, New York, NY, USA
  - Institute of Artificial Intelligence for Digital Health, Weill Cornell Medicine, New York, NY, USA
- Junmei Wang
  - Department of Pharmaceutical Sciences, University of Pittsburgh, Pittsburgh, PA, USA
- Lianjin Cai
  - Department of Pharmaceutical Sciences, University of Pittsburgh, Pittsburgh, PA, USA
- Jan Zawala
  - Jerzy Haber Institute of Catalysis and Surface Chemistry, Polish Academy of Sciences, Kraków, Poland
- Yanshan Wang
  - Department of Health Information Management, University of Pittsburgh, Pittsburgh, PA, USA.
  - Department of Biomedical Informatics, University of Pittsburgh, Pittsburgh, PA, USA.
  - Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, USA.
  - Clinical and Translational Science Institute, University of Pittsburgh, Pittsburgh, PA, USA.
11
Shields JD, Howells R, Lamont G, Leilei Y, Madin A, Reimann CE, Rezaei H, Reuillon T, Smith B, Thomson C, Zheng Y, Ziegler RE. AiZynth impact on medicinal chemistry practice at AstraZeneca. RSC Med Chem 2024; 15:1085-1095. [PMID: 38665822; PMCID: PMC11042116; DOI: 10.1039/d3md00651d] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/20/2023] [Accepted: 02/15/2024] [Indexed: 04/28/2024] Open
Abstract
AstraZeneca chemists have been using the AI retrosynthesis tool AiZynth for three years. In this article, we present seven examples of how medicinal chemists using AiZynth positively impacted their drug discovery programmes. These programmes run the gamut from early-stage hit confirmation to late-stage route optimisation efforts. We also discuss the different use cases for which AI retrosynthesis tools are best suited.
Affiliation(s)
- Jason D Shields
  - Early Oncology R&D, AstraZeneca 35 Gatehouse Drive Waltham MA 02451 USA
- Rachel Howells
  - Early Oncology R&D, AstraZeneca 1 Francis Crick Avenue Cambridge CB2 0AA UK
- Gillian Lamont
  - Early Oncology R&D, AstraZeneca 1 Francis Crick Avenue Cambridge CB2 0AA UK
- Yin Leilei
  - Pharmaron Beijing Co., Ltd. 6 Taihe Road BDA Beijing 100176 P.R. China
- Andrew Madin
  - Discovery Sciences, AstraZeneca 1 Francis Crick Avenue Cambridge CB2 0AA UK
- Hadi Rezaei
  - Early Oncology R&D, AstraZeneca 35 Gatehouse Drive Waltham MA 02451 USA
- Tristan Reuillon
  - Respiratory & Immunology, BioPharmaceuticals R&D, AstraZeneca Pepparedsleden 1 43183 Mölndal Sweden
- Bryony Smith
  - Early Oncology R&D, AstraZeneca 1 Francis Crick Avenue Cambridge CB2 0AA UK
- Clare Thomson
  - Early Oncology R&D, AstraZeneca 1 Francis Crick Avenue Cambridge CB2 0AA UK
- Yuting Zheng
  - Pharmaron Beijing Co., Ltd. 6 Taihe Road BDA Beijing 100176 P.R. China
- Robert E Ziegler
  - Early Oncology R&D, AstraZeneca 35 Gatehouse Drive Waltham MA 02451 USA
12
Atz K, Cotos L, Isert C, Håkansson M, Focht D, Hilleke M, Nippa DF, Iff M, Ledergerber J, Schiebroek CCG, Romeo V, Hiss JA, Merk D, Schneider P, Kuhn B, Grether U, Schneider G. Prospective de novo drug design with deep interactome learning. Nat Commun 2024; 15:3408. [PMID: 38649351; PMCID: PMC11035696; DOI: 10.1038/s41467-024-47613-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Accepted: 04/02/2024] [Indexed: 04/25/2024] Open
Abstract
De novo drug design aims to generate molecules from scratch that possess specific chemical and pharmacological properties. We present a computational approach utilizing interactome-based deep learning for ligand- and structure-based generation of drug-like molecules. This method capitalizes on the unique strengths of both graph neural networks and chemical language models, offering an alternative to the need for application-specific reinforcement, transfer, or few-shot learning. It enables the "zero-shot" construction of compound libraries tailored to possess specific bioactivity, synthesizability, and structural novelty. In order to proactively evaluate the deep interactome learning framework for protein structure-based drug design, potential new ligands targeting the binding site of the human peroxisome proliferator-activated receptor (PPAR) subtype gamma are generated. The top-ranking designs are chemically synthesized and computationally, biophysically, and biochemically characterized. Potent PPAR partial agonists are identified, demonstrating favorable activity and the desired selectivity profiles for both nuclear receptors and off-target interactions. Crystal structure determination of the ligand-receptor complex confirms the anticipated binding mode. This successful outcome positively advocates interactome-based de novo design for application in bioorganic and medicinal chemistry, enabling the creation of innovative bioactive molecules.
Affiliation(s)
- Kenneth Atz
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Leandro Cotos
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Clemens Isert
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Maria Håkansson
  - SARomics Biostructures AB, Medicon Village, SE-223 81, Lund, Sweden
- Dorota Focht
  - SARomics Biostructures AB, Medicon Village, SE-223 81, Lund, Sweden
- Mattis Hilleke
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- David F Nippa
  - Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, CH-4070, Basel, Switzerland
  - Department of Pharmacy, Ludwig-Maximilians-Universität München, Butenandtstrasse 5, 81377, Munich, Germany
- Michael Iff
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Jann Ledergerber
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Carl C G Schiebroek
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Valentina Romeo
  - Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, CH-4070, Basel, Switzerland
- Jan A Hiss
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Daniel Merk
  - Department of Pharmacy, Ludwig-Maximilians-Universität München, Butenandtstrasse 5, 81377, Munich, Germany
- Petra Schneider
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
- Bernd Kuhn
  - Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, CH-4070, Basel, Switzerland
- Uwe Grether
  - Roche Pharma Research and Early Development (pRED), Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Grenzacherstrasse 124, CH-4070, Basel, Switzerland
- Gisbert Schneider
  - ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland.
|
13
|
Zhang K, Tang Y, Yu H, Yang J, Tao L, Xiang P. Discovery of lupus nephritis targeted inhibitors based on De novo molecular design: comprehensive application of vinardo scoring, ADMET analysis, and molecular dynamics simulation. J Biomol Struct Dyn 2024:1-14. [PMID: 38501728 DOI: 10.1080/07391102.2024.2329293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2023] [Accepted: 03/06/2024] [Indexed: 03/20/2024]
Abstract
Lupus Nephritis (LN) is an autoimmune disease affecting the kidneys, and conventional drug studies have limitations due to its imprecise and complex pathogenesis. Therefore, the aim of this study was to design a novel Lupus Nephritis-targeted drug with good clinical potential, high potency and selectivity by a computer-assisted approach. NIK belongs to the serine/threonine protein kinase family and is gaining attention as a drug target for Lupus Nephritis. We used bioinformatics, homology modelling and sequence comparison analysis, small molecule ab initio design, ADMET analysis, molecular docking, molecular dynamics simulation, and MM/PBSA analysis to design and explore the selectivity and efficiency of a novel Lupus Nephritis-targeting drug, ClImYnib, and a classical NIK inhibitor, NIK SMI1. We used bioinformatics techniques to determine the correlation between lupus nephritis and the NF-κB signaling pathway. De novo drug design was used to create a NIK-targeted inhibitor, ClImYnib, with lower toxicity, after which we used molecular dynamics to simulate NIK SMI1 against ClImYnib, and the simulation results showed that ClImYnib had better selectivity and efficiency. Our research delves into the molecular mechanism of protein ligands, and we have designed and validated an excellent NIK inhibitor using multiple computational simulation methods. More importantly, it provides an idea for the targeted design of small molecules. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Kaiyuan Zhang
- School of Clinical Medicine, Bengbu Medical College, China
| | - Yingkai Tang
- Department of Anatomy, School of Basic Medicine, Bengbu Medical College, China
| | - Haiyue Yu
- School of Clinical Medicine, Bengbu Medical College, China
| | - Jingtao Yang
- School of Clinical Medicine, Bengbu Medical College, China
| | - Lu Tao
- Central Laboratory, The First Affiliated Hospital of Bengbu Medical College, Bengbu, Anhui, China
| | - Ping Xiang
- Central Laboratory, The First Affiliated Hospital of Bengbu Medical College, Bengbu, Anhui, China
|
14
|
Kong Y, Zhou C, Tan D, Xu X, Li Z, Cheng J. Discovery of Potential Neonicotinoid Insecticides by an Artificial Intelligence Generative Model and Structure-Based Virtual Screening. J Agric Food Chem 2024; 72:5145-5152. [PMID: 38419506 DOI: 10.1021/acs.jafc.3c06895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/02/2024]
Abstract
The identification of neonicotinoid insecticides bearing novel scaffolds is of great importance for pesticide discovery. Here, artificial intelligence-based tools and virtual screening strategy were integrated to discover potential leads of neonicotinoid insecticides. A deep generative model was successfully constructed using a recurrent neural network combined with transfer learning. The model evaluation showed that the pretrained model could accurately grasp the SMILES grammar of drug-like molecules and generate potential neonicotinoid compounds after transfer learning. The generated molecules were evaluated by hierarchical virtual screening, hits were subjected to a similarity search, and the most similar structures were purchased for the bioassay. Compounds A2 and A5 displayed 52.5 and 50.3% mortality rates against Aphis craccivora at 100 mg/L, respectively. The docking study indicated that these two compounds have similar binding modes to neonicotinoids, which were verified by further molecular dynamics simulations.
Affiliation(s)
- Yijin Kong
- Shanghai Key Laboratory of Chemical Biology, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
| | - Cong Zhou
- Shanghai Key Laboratory of Chemical Biology, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
| | - Du Tan
- Shanghai Key Laboratory of Chemical Biology, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
| | - Xiaoyong Xu
- Shanghai Key Laboratory of Chemical Biology, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
| | - Zhong Li
- Shanghai Key Laboratory of Chemical Biology, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
| | - Jiagao Cheng
- Shanghai Key Laboratory of Chemical Biology, School of Pharmacy, East China University of Science and Technology, Shanghai 200237, China
|
15
|
Loeffler HH, He J, Tibo A, Janet JP, Voronov A, Mervin LH, Engkvist O. Reinvent 4: Modern AI-driven generative molecule design. J Cheminform 2024; 16:20. [PMID: 38383444 PMCID: PMC10882833 DOI: 10.1186/s13321-024-00812-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2023] [Accepted: 02/09/2024] [Indexed: 02/23/2024] Open
Abstract
REINVENT 4 is a modern open-source generative AI framework for the design of small molecules. The software utilizes recurrent neural networks and transformer architectures to drive molecule generation. These generators are seamlessly embedded within the general machine learning optimization algorithms, transfer learning, reinforcement learning and curriculum learning. REINVENT 4 enables and facilitates de novo design, R-group replacement, library design, linker design, scaffold hopping and molecule optimization. This contribution gives an overview of the software and describes its design. Algorithms and their applications are discussed in detail. REINVENT 4 is a command line tool which reads a user configuration in either TOML or JSON format. The aim of this release is to provide reference implementations for some of the most common algorithms in AI based molecule generation. An additional goal with the release is to create a framework for education and future innovation in AI based molecular design. The software is available from https://github.com/MolecularAI/REINVENT4 and released under the permissive Apache 2.0 license. Scientific contribution. The software provides an open-source reference implementation for generative molecular design where the software is also being used in production to support in-house drug discovery projects. The publication of the most common machine learning algorithms in one code and full documentation thereof will increase transparency of AI and foster innovation, collaboration and education.
Affiliation(s)
- Hannes H Loeffler
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden.
| | - Jiazhen He
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
| | - Alessandro Tibo
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
| | - Jon Paul Janet
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
| | - Alexey Voronov
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
| | - Lewis H Mervin
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
| | - Ola Engkvist
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
|
16
|
Kyro GW, Morgunov A, Brent RI, Batista VS. ChemSpaceAL: An Efficient Active Learning Methodology Applied to Protein-Specific Molecular Generation. J Chem Inf Model 2024; 64:653-665. [PMID: 38287889 DOI: 10.1021/acs.jcim.3c01456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2024]
Abstract
The incredible capabilities of generative artificial intelligence models have inevitably led to their application in the domain of drug discovery. Within this domain, the vastness of chemical space motivates the development of more efficient methods for identifying regions with molecules that exhibit desired characteristics. In this work, we present a computationally efficient active learning methodology and demonstrate its applicability to targeted molecular generation. When applied to c-Abl kinase, a protein with FDA-approved small-molecule inhibitors, the model learns to generate molecules similar to the inhibitors without prior knowledge of their existence and even reproduces two of them exactly. We also show that the methodology is effective for a protein without any commercially available small-molecule inhibitors, the HNH domain of the CRISPR-associated protein 9 (Cas9) enzyme. To facilitate implementation and reproducibility, we made all of our software available through the open-source ChemSpaceAL Python package.
Affiliation(s)
- Gregory W Kyro
- Department of Chemistry, Yale University, New Haven, Connecticut 06511-8499, United States
| | - Anton Morgunov
- Department of Chemistry, Yale University, New Haven, Connecticut 06511-8499, United States
| | - Rafael I Brent
- Department of Chemistry, Yale University, New Haven, Connecticut 06511-8499, United States
| | - Victor S Batista
- Department of Chemistry, Yale University, New Haven, Connecticut 06511-8499, United States
|
17
|
Gangwal A, Ansari A, Ahmad I, Azad AK, Kumarasamy V, Subramaniyan V, Wong LS. Generative artificial intelligence in drug discovery: basic framework, recent advances, challenges, and opportunities. Front Pharmacol 2024; 15:1331062. [PMID: 38384298 PMCID: PMC10879372 DOI: 10.3389/fphar.2024.1331062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Accepted: 01/17/2024] [Indexed: 02/23/2024] Open
Abstract
There are two main ways to discover or design small drug molecules. The first involves fine-tuning existing molecules or commercially successful drugs through quantitative structure-activity relationships and virtual screening. The second approach involves generating new molecules through de novo drug design or inverse quantitative structure-activity relationship. Both methods aim to get a drug molecule with the best pharmacokinetic and pharmacodynamic profiles. However, bringing a new drug to market is an expensive and time-consuming endeavor, with the average cost being estimated at around $2.5 billion. One of the biggest challenges is screening the vast number of potential drug candidates to find one that is both safe and effective. The development of artificial intelligence in recent years has been phenomenal, ushering in a revolution in many fields. The field of pharmaceutical sciences has also significantly benefited from multiple applications of artificial intelligence, especially drug discovery projects. Artificial intelligence models are finding use in molecular property prediction, molecule generation, virtual screening, synthesis planning, repurposing, among others. Lately, generative artificial intelligence has gained popularity across domains for its ability to generate entirely new data, such as images, sentences, audios, videos, novel chemical molecules, etc. Generative artificial intelligence has also delivered promising results in drug discovery and development. This review article delves into the fundamentals and framework of various generative artificial intelligence models in the context of drug discovery via de novo drug design approach. Various basic and advanced models have been discussed, along with their recent applications. 
The review also explores recent examples and advances in the generative artificial intelligence approach, as well as the challenges and ongoing efforts to fully harness the potential of generative artificial intelligence in generating novel drug molecules in a faster and more affordable manner. Some clinical-level assets generated from generative artificial intelligence have also been discussed in this review to show the ever-increasing application of artificial intelligence in drug discovery through commercial partnerships.
Affiliation(s)
- Amit Gangwal
- Department of Natural Product Chemistry, Shri Vile Parle Kelavani Mandal’s Institute of Pharmacy, Dhule, Maharashtra, India
| | - Azim Ansari
- Computer Aided Drug Design Center, Shri Vile Parle Kelavani Mandal’s Institute of Pharmacy, Dhule, Maharashtra, India
| | - Iqrar Ahmad
- Department of Pharmaceutical Chemistry, Prof. Ravindra Nikam College of Pharmacy, Dhule, India
| | - Abul Kalam Azad
- Faculty of Pharmacy, University College of MAIWP International, Batu Caves, Malaysia
| | - Vinoth Kumarasamy
- Department of Parasitology and Medical Entomology, Faculty of Medicine, Universiti Kebangsaan Malaysia, Cheras, Malaysia
| | - Vetriselvan Subramaniyan
- Pharmacology Unit, Jeffrey Cheah School of Medicine and Health Sciences, Monash University Malaysia, Selangor, Malaysia
- School of Bioengineering and Biosciences, Lovely Professional University, Phagwara, Punjab, India
| | - Ling Shing Wong
- Faculty of Health and Life Sciences, INTI International University, Nilai, Malaysia
|
18
|
Hoff SE, Zinke M, Izadi-Pruneyre N, Bonomi M. Bonds and bytes: The odyssey of structural biology. Curr Opin Struct Biol 2024; 84:102746. [PMID: 38101027 DOI: 10.1016/j.sbi.2023.102746] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/02/2023] [Revised: 11/20/2023] [Accepted: 11/24/2023] [Indexed: 12/17/2023]
Abstract
Characterizing structural and dynamic properties of proteins and large macromolecular assemblies is crucial to understand the molecular mechanisms underlying biological functions. In the field of structural biology, no single method comprehensively reveals the behavior of biological systems across various spatiotemporal scales. Instead, we have a versatile toolkit of techniques, each contributing a piece to the overall puzzle. Integrative structural biology combines different techniques to create accurate and precise multi-scale models that expand our understanding of complex biological systems. This review outlines recent advancements in computational and experimental methods in structural biology, with special focus on recent Artificial Intelligence techniques, emphasizes integrative approaches that combine different types of data for precise spatiotemporal modeling, and provides an outlook into future directions of this field.
Affiliation(s)
- S E Hoff
- Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Structural Bioinformatics Unit, Paris, France
| | - M Zinke
- Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Bacterial Transmembrane Systems Unit, Paris, France. https://twitter.com/ZinkeMaximilian
| | - N Izadi-Pruneyre
- Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Bacterial Transmembrane Systems Unit, Paris, France.
| | - M Bonomi
- Institut Pasteur, Université Paris Cité, CNRS UMR 3528, Structural Bioinformatics Unit, Paris, France.
|
19
|
Zdrazil B, Felix E, Hunter F, Manners EJ, Blackshaw J, Corbett S, de Veij M, Ioannidis H, Lopez DM, Mosquera J, Magarinos M, Bosc N, Arcila R, Kizilören T, Gaulton A, Bento A, Adasme M, Monecke P, Landrum G, Leach A. The ChEMBL Database in 2023: a drug discovery platform spanning multiple bioactivity data types and time periods. Nucleic Acids Res 2024; 52:D1180-D1192. [PMID: 37933841 PMCID: PMC10767899 DOI: 10.1093/nar/gkad1004] [Citation(s) in RCA: 71] [Impact Index Per Article: 71.0] [Received: 09/09/2023] [Revised: 10/09/2023] [Accepted: 10/23/2023] [Indexed: 11/08/2023] Open
Abstract
ChEMBL (https://www.ebi.ac.uk/chembl/) is a manually curated, high-quality, large-scale, open, FAIR and Global Core Biodata Resource of bioactive molecules with drug-like properties, previously described in the 2012, 2014, 2017 and 2019 Nucleic Acids Research Database Issues. Since its introduction in 2009, ChEMBL's content has changed dramatically in size and diversity of data types. Through incorporation of multiple new datasets from depositors since the 2019 update, ChEMBL now contains slightly more bioactivity data from deposited data vs data extracted from literature. In collaboration with the EUbOPEN consortium, chemical probe data is now regularly deposited into ChEMBL. Release 27 made curated data available for compounds screened for potential anti-SARS-CoV-2 activity from several large-scale drug repurposing screens. In addition, new patent bioactivity data have been added to the latest ChEMBL releases, and various new features have been incorporated, including a Natural Product likeness score, updated flags for Natural Products, a new flag for Chemical Probes, and the initial annotation of the action type for ∼270 000 bioactivity measurements.
Affiliation(s)
- Barbara Zdrazil
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Eloy Felix
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Fiona Hunter
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Emma J Manners
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - James Blackshaw
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Sybilla Corbett
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Marleen de Veij
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Harris Ioannidis
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - David Mendez Lopez
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Juan F Mosquera
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Maria Paula Magarinos
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Nicolas Bosc
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Ricardo Arcila
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Tevfik Kizilören
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Anna Gaulton
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - A Patrícia Bento
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Melissa F Adasme
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
| | - Peter Monecke
- Sanofi, R&D, Preclinical Safety, Industriepark Höchst, 65926 Frankfurt am Main, Germany
| | - Gregory A Landrum
- Department of Chemistry and Applied Biosciences, ETH Zürich, 8093 Zürich, Switzerland
| | - Andrew R Leach
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
|
20
|
Olmedo DA, Durant-Archibold AA, López-Pérez JL, Medina-Franco JL. Design and Diversity Analysis of Chemical Libraries in Drug Discovery. Comb Chem High Throughput Screen 2024; 27:502-515. [PMID: 37409545 DOI: 10.2174/1386207326666230705150110] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2023] [Revised: 05/30/2023] [Accepted: 05/30/2023] [Indexed: 07/07/2023]
Abstract
Chemical libraries and compound data sets are among the main inputs to start the drug discovery process at universities, research institutes, and the pharmaceutical industry. The approach used in the design of compound libraries, the chemical information they possess, and the representation of structures, play a fundamental role in the development of studies: chemoinformatics, food informatics, in silico pharmacokinetics, computational toxicology, bioinformatics, and molecular modeling to generate computational hits that will continue the optimization process of drug candidates. The prospects for growth in drug discovery and development processes in chemical, biotechnological, and pharmaceutical companies began a few years ago by integrating computational tools with artificial intelligence methodologies. It is anticipated that it will increase the number of drugs approved by regulatory agencies shortly.
Affiliation(s)
- Dionisio A Olmedo
- Centro de Investigaciones Farmacognósticas de la Flora Panameña (CIFLORPAN), Facultad de Farmacia, Universidad de Panamá, Ciudad de Panamá, Apartado, 0824-00178, Panamá
- Sistema Nacional de Investigación (SNI), Secretaria Nacional de Ciencia, Tecnología e Innovación (SENACYT), Ciudad del Saber, Clayton, Panamá
| | - Armando A Durant-Archibold
- Centro de Biodiversidad y Descubrimiento de Drogas, Instituto de Investigaciones Científicas y Servicios de Alta Tecnología (INDICASAT AIP), Apartado, 0843-01103, Panamá
- Departamento de Bioquímica, Facultad de Ciencias Naturales, Exactas y Tecnología, Universidad de Panamá, Ciudad de Panamá, Panamá
| | - José Luis López-Pérez
- CESIFAR, Departamento de Farmacología, Facultad de Medicina, Universidad de Panamá, Ciudad de Panamá, Panamá
- Departamento de Ciencias Farmacéuticas, Facultad de Farmacia, Universidad de Salamanca, Avda. Campo Charro s/n, 37071 Salamanca, España
| | - José Luis Medina-Franco
- DIFACQUIM Grupo de Investigación, Departamento de Farmacia, Escuela de Química, Universidad Nacional Autónoma de México, Ciudad de México, Apartado, 04510, México
|
21
|
Huang CH, Lin ST. MARS Plus: An Improved Molecular Design Tool for Complex Compounds Involving Ionic, Stereo, and Cis-Trans Isomeric Structures. J Chem Inf Model 2023; 63:7711-7728. [PMID: 38100117 DOI: 10.1021/acs.jcim.3c01745] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/26/2023]
Abstract
MARS (Molecular Assembling and Representation Suite) (Hsu et al. J. Chem. Inf. Model. 2019, 59, 3703-3713) is a toolbox for the molecular design of organic molecules. MARS uses integer arrays to represent the elements and connectivity between elements of a molecule. It provides a collection of operations to manipulate the elemental composition and connectivity of a molecule (or a pair of molecules), enabling the creation of novel chemical compounds. In this work, the original MARS is extended to handle complex molecular structures, including geometric (cis-trans) isomers, stereo isomers, cyclic compounds, and ionic species. The extended version of MARS, referred to as MARS+, has a more comprehensive coverage of the chemical space and therefore can explore molecules with a greater chemical and physical diversity. Compared to other molecular design tools, MARS+ is designed to perform all possible manipulations on a given molecule or a pair of molecules. Molecular structure manipulation can be conducted in either a controlled or a random fashion. Furthermore, every structure manipulation has a counterpart so that the operation can be reversed. Nearly any possible chemical structure can be generated with MARS+ via a combination of molecular operations. The capabilities of MARS+ are examined by the design of new ionic liquids (ILs). The results show that MARS+ is a useful tool for computer-aided molecular design (CAMD) and molecular structure enumeration.
Affiliation(s)
- Chen-Hsuan Huang
- Department of Chemical Engineering, National Taiwan University, Taipei 10617, Taiwan
| | - Shiang-Tai Lin
- Department of Chemical Engineering, National Taiwan University, Taipei 10617, Taiwan
|
22
|
Wang Q, Wei Z, Hu X, Wang Z, Dong Y, Liu H. Molecular generation strategy and optimization based on A2C reinforcement learning in de novo drug design. Bioinformatics 2023; 39:btad693. [PMID: 37971970 PMCID: PMC10689670 DOI: 10.1093/bioinformatics/btad693] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 10/08/2023] [Accepted: 11/15/2023] [Indexed: 11/19/2023] Open
Abstract
MOTIVATION: In pharmacochemistry, new drug development is a time-consuming and expensive process. Existing drug design methods face significant challenges in generation efficiency and quality. RESULTS: In this paper, we propose a novel molecular generation strategy and optimization based on A2C reinforcement learning. In the molecular generation strategy, we adopted a transformer-DNN to retain the scaffolds' advantages, while accounting for the generated molecules' similarity and internal diversity by dynamic parameter adjustment, further improving the overall quality of molecule generation. In molecular optimization, we introduced heterogeneous parallel supercomputing for large-scale molecular docking based on message passing interface communication technology to rapidly obtain bioactivity information, thereby enhancing the efficiency of drug design. Experiments show that our model can generate high-quality molecules with multi-objective properties at a high generation efficiency, with effectiveness and novelty close to 100%. Moreover, we used our method to assist the Shandong University School of Pharmacy in finding several anti-PEDV candidate drug molecules. AVAILABILITY AND IMPLEMENTATION: The datasets involved in this method and the source code are freely available to academic users at https://github.com/wq-sunshine/MomdTDSRL.git.
Affiliation(s)
- Qian Wang
- College of Computer Science and Technology, Ocean University of China, Qingdao, Shandong 266100, China
| | - Zhiqiang Wei
- College of Computer Science and Technology, Ocean University of China, Qingdao, Shandong 266100, China
| | - Xiaotong Hu
- College of Computer Science and Technology, Ocean University of China, Qingdao, Shandong 266100, China
| | - Zhuoya Wang
- Center for High Performance Computing and System Simulation, National Laboratory for Marine Science and Technology, Qingdao, Shandong 266237, China
| | - Yujie Dong
- Marine Big Data Center of Institute for Advanced Ocean Study, Ocean University of China, Qingdao, Shandong 266100, China
| | - Hao Liu
- College of Computer Science and Technology, Ocean University of China, Qingdao, Shandong 266100, China
|
23
|
Stuart DD, Guzman-Perez A, Brooijmans N, Jackson EL, Kryukov GV, Friedman AA, Hoos A. Precision Oncology Comes of Age: Designing Best-in-Class Small Molecules by Integrating Two Decades of Advances in Chemistry, Target Biology, and Data Science. Cancer Discov 2023; 13:2131-2149. [PMID: 37712571 PMCID: PMC10551669 DOI: 10.1158/2159-8290.cd-23-0280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Revised: 04/27/2023] [Accepted: 07/28/2023] [Indexed: 09/16/2023]
Abstract
Small-molecule drugs have enabled the practice of precision oncology for genetically defined patient populations since the first approval of imatinib in 2001. Scientific and technology advances over this 20-year period have driven the evolution of cancer biology, medicinal chemistry, and data science. Collectively, these advances provide tools to more consistently design best-in-class small-molecule drugs against known, previously undruggable, and novel cancer targets. The integration of these tools and their customization in the hands of skilled drug hunters will be necessary to enable the discovery of transformational therapies for patients across a wider spectrum of cancers. SIGNIFICANCE: Target-centric small-molecule drug discovery necessitates the consideration of multiple approaches to identify chemical matter that can be optimized into drug candidates. To do this successfully and consistently, drug hunters require a comprehensive toolbox to avoid following the "law of instrument" or Maslow's hammer concept where only one tool is applied regardless of the requirements of the task. Combining our ever-increasing understanding of cancer and cancer targets with the technological advances in drug discovery described below will accelerate the next generation of small-molecule drugs in oncology.
Affiliation(s)
- Axel Hoos
  - Scorpion Therapeutics, Boston, Massachusetts
24
Stanley M, Segler M. Fake it until you make it? Generative de novo design and virtual screening of synthesizable molecules. Curr Opin Struct Biol 2023; 82:102658. [PMID: 37473637 DOI: 10.1016/j.sbi.2023.102658] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 02/27/2023] [Revised: 06/21/2023] [Accepted: 06/22/2023] [Indexed: 07/22/2023]
Abstract
Computational techniques, including virtual screening, de novo design, and generative models, play an increasing role in expediting DMTA cycles for modern molecular discovery. However, computationally proposed molecules must be synthetically feasible for laboratory testing. In this perspective, we offer a succinct introduction to the subject, and showcase typical workflows to integrate synthesis planning, synthesizability scoring, and molecule generation. Finally, we address limitations and opportunities for future research.
Affiliation(s)
- Megan Stanley
  - Microsoft Research AI4Science, UK. https://twitter.com/@megjanestanley
25
Arras P, Yoo HB, Pekar L, Clarke T, Friedrich L, Schröter C, Schanz J, Tonillo J, Siegmund V, Doerner A, Krah S, Guarnera E, Zielonka S, Evers A. AI/ML combined with next-generation sequencing of VHH immune repertoires enables the rapid identification of de novo humanized and sequence-optimized single domain antibodies: a prospective case study. Front Mol Biosci 2023; 10:1249247. [PMID: 37842638 PMCID: PMC10575757 DOI: 10.3389/fmolb.2023.1249247] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 06/28/2023] [Accepted: 08/31/2023] [Indexed: 10/17/2023] Open
Abstract
Introduction: In this study, we demonstrate the feasibility of yeast surface display (YSD) and next-generation sequencing (NGS) in combination with artificial intelligence and machine learning methods (AI/ML) for the identification of de novo humanized single domain antibodies (sdAbs) with favorable early developability profiles. Methods: The display library was derived from a novel approach, in which VHH-based CDR3 regions obtained from a llama (Lama glama), immunized against NKp46, were grafted onto a humanized VHH backbone library that was diversified in CDR1 and CDR2. Following NGS analysis of sequence pools from two rounds of fluorescence-activated cell sorting, we focused on four sequence clusters based on NGS frequency and enrichment analysis as well as in silico developability assessment. For each cluster, long short-term memory (LSTM) based deep generative models were trained and used for the in silico sampling of new sequences. Sequences were subjected to sequence- and structure-based in silico developability assessment to select a set of fewer than 10 sequences per cluster for production. Results: As demonstrated by binding kinetics and early developability assessment, this procedure represents a general strategy for the rapid and efficient design of potent and automatically humanized sdAb hits from screening selections with favorable early developability profiles.
Affiliation(s)
- Paul Arras
  - Antibody Discovery and Protein Engineering, Merck Healthcare KGaA, Darmstadt, Germany
  - Institute for Organic Chemistry and Biochemistry, Technical University of Darmstadt, Darmstadt, Germany
- Han Byul Yoo
  - Antibody Discovery and Protein Engineering, Merck Healthcare KGaA, Darmstadt, Germany
- Lukas Pekar
  - Antibody Discovery and Protein Engineering, Merck Healthcare KGaA, Darmstadt, Germany
- Thomas Clarke
  - Bioinformatics, EMD Serono, Billerica, MA, United States
- Lukas Friedrich
  - Computational Chemistry and Biologics, Merck Healthcare KGaA, Darmstadt, Germany
- Jennifer Schanz
  - ADCs & Targeted NBE Therapeutics, Merck KGaA, Darmstadt, Germany
- Jason Tonillo
  - ADCs & Targeted NBE Therapeutics, Merck KGaA, Darmstadt, Germany
- Vanessa Siegmund
  - Early Protein Supply and Characterization, Merck Healthcare KGaA, Darmstadt, Germany
- Achim Doerner
  - Antibody Discovery and Protein Engineering, Merck Healthcare KGaA, Darmstadt, Germany
- Simon Krah
  - Antibody Discovery and Protein Engineering, Merck Healthcare KGaA, Darmstadt, Germany
- Enrico Guarnera
  - Antibody Discovery and Protein Engineering, Merck Healthcare KGaA, Darmstadt, Germany
- Stefan Zielonka
  - Antibody Discovery and Protein Engineering, Merck Healthcare KGaA, Darmstadt, Germany
  - Institute for Organic Chemistry and Biochemistry, Technical University of Darmstadt, Darmstadt, Germany
- Andreas Evers
  - Antibody Discovery and Protein Engineering, Merck Healthcare KGaA, Darmstadt, Germany
26
Matos GDR, Pak S, Rizzo RC. Descriptor-Driven de Novo Design Algorithms for DOCK6 Using RDKit. J Chem Inf Model 2023; 63:5803-5822. [PMID: 37698425 PMCID: PMC10694857 DOI: 10.1021/acs.jcim.3c01031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/13/2023]
Abstract
Structure-based methods that employ principles of de novo design can be used to construct small organic molecules from scratch using pre-existing fragment libraries to sample chemical space and are an important class of computational algorithms for drug-lead discovery. Here, we present a powerful new design method for DOCK6 that employs a Descriptor-Driven De Novo strategy (termed D3N) in which user-defined cheminformatics descriptors (and their target ranges) are calculated at each layer of growth using the open-source toolkit RDKit. The objective is to tailor ligand growth toward desirable regions of chemical space. The approach was extensively validated through: (1) comparison of cheminformatics descriptors computed using the new DOCK6/RDKit interface versus the standard Python/RDKit installation, (2) examination of descriptor distributions generated using D3N growth under different conditions (target ranges and environments), and (3) construction of ligands with very tight (pinpoint) descriptor ranges using clinically relevant compounds as a reference. Our testing confirms that the new DOCK6/RDKit integration is robust, showcases how the new D3N routines can be used to direct sampling around user-defined chemical spaces, and highlights the utility of on-the-fly descriptor calculations for ligand design to important drug targets.
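As a rough illustration of the descriptor-range idea summarized in this abstract, the sketch below keeps only those candidate growth steps whose descriptor values fall inside user-defined target ranges. It does not call DOCK6 or RDKit: the descriptor names, the numeric values, and the helpers `within_ranges` and `filter_candidates` are hypothetical and exist only for illustration, with descriptor values assumed to be precomputed elsewhere.

```python
from typing import Dict, List, Tuple

# Hypothetical inclusive target ranges for a few descriptors (illustrative values only).
TARGET_RANGES: Dict[str, Tuple[float, float]] = {
    "mol_weight": (250.0, 500.0),
    "logp": (1.0, 4.0),
    "h_bond_donors": (0, 5),
}


def within_ranges(descriptors: Dict[str, float],
                  ranges: Dict[str, Tuple[float, float]] = TARGET_RANGES) -> bool:
    """Return True only if every descriptor named in `ranges` is present and in range."""
    for name, (low, high) in ranges.items():
        value = descriptors.get(name)
        if value is None or not (low <= value <= high):
            return False
    return True


def filter_candidates(candidates: Dict[str, Dict[str, float]],
                      ranges: Dict[str, Tuple[float, float]] = TARGET_RANGES) -> List[str]:
    """Keep the names of candidates whose precomputed descriptors satisfy all ranges."""
    return [name for name, desc in candidates.items() if within_ranges(desc, ranges)]


# Two made-up candidates with assumed, precomputed descriptor values.
candidates = {
    "cand_a": {"mol_weight": 320.4, "logp": 2.1, "h_bond_donors": 2},
    "cand_b": {"mol_weight": 612.7, "logp": 5.3, "h_bond_donors": 1},
}

print(filter_candidates(candidates))  # → ['cand_a']
```

In a real workflow the descriptor values would come from a cheminformatics toolkit at each growth layer; here they are hard-coded so the filtering logic can be read in isolation.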
Affiliation(s)
- Guilherme Duarte Ramos Matos
  - Department of Applied Mathematics & Statistics, Stony Brook University, Stony Brook, New York 11794, USA
  - Instituto de Química, Universidade de Brasília, Brasília, Distrito Federal, 70910-900, Brazil
- Steven Pak
  - Department of Pharmacological Sciences, Stony Brook University, Stony Brook, New York, 11794, USA
- Robert C. Rizzo
  - Department of Applied Mathematics & Statistics, Stony Brook University, Stony Brook, New York 11794, USA
  - Institute of Chemical Biology & Drug Discovery, Stony Brook University, Stony Brook, New York 11794, USA
  - Laufer Center for Physical & Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, USA
27
Salas-Estrada L, Provasi D, Qiu X, Kaniskan HÜ, Huang XP, DiBerto JF, Lamim Ribeiro JM, Jin J, Roth BL, Filizola M. De Novo Design of κ-Opioid Receptor Antagonists Using a Generative Deep-Learning Framework. J Chem Inf Model 2023; 63:5056-5065. [PMID: 37555591 PMCID: PMC10466374 DOI: 10.1021/acs.jcim.3c00651] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2023] [Indexed: 08/10/2023]
Abstract
Likely effective pharmacological interventions for the treatment of opioid addiction include attempts to attenuate brain reward deficits during periods of abstinence. Pharmacological blockade of the κ-opioid receptor (KOR) has been shown to abolish brain reward deficits in rodents during withdrawal, as well as to reduce the escalation of opioid use in rats with extended access to opioids. Although KOR antagonists represent promising candidates for the treatment of opioid addiction, very few potent selective KOR antagonists are known to date and most of them exhibit significant safety concerns. Here, we used a generative deep-learning framework for the de novo design of chemotypes with putative KOR antagonistic activity. Molecules generated by models trained with this framework were prioritized for chemical synthesis based on their predicted optimal interactions with the receptor. Our models and proposed training protocol were experimentally validated by binding and functional assays.
Affiliation(s)
- Leslie Salas-Estrada
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Davide Provasi
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Xing Qiu
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Husnu Ümit Kaniskan
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Xi-Ping Huang
  - National Institute of Mental Health, Psychoactive Drug Screening Program, Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, North Carolina 27599, United States
- Jeffrey F. DiBerto
  - National Institute of Mental Health, Psychoactive Drug Screening Program, Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, North Carolina 27599, United States
- João Marcelo Lamim Ribeiro
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Jian Jin
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
  - Mount Sinai Center for Therapeutics Discovery, Departments of Oncological Sciences and Neuroscience, Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Bryan L. Roth
  - National Institute of Mental Health, Psychoactive Drug Screening Program, Department of Pharmacology, University of North Carolina School of Medicine, Chapel Hill, North Carolina 27599, United States
  - Division of Chemical Biology and Medicinal Chemistry, University of North Carolina at Chapel Hill Eshelman School of Pharmacy, Chapel Hill, North Carolina 27599, United States
- Marta Filizola
  - Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
28
Zhai S, Tan Y, Zhang C, Hipolito CJ, Song L, Zhu C, Zhang Y, Duan H, Yin Y. PepScaf: Harnessing Machine Learning with In Vitro Selection toward De Novo Macrocyclic Peptides against IL-17C/IL-17RE Interaction. J Med Chem 2023; 66:11187-11200. [PMID: 37480587 DOI: 10.1021/acs.jmedchem.3c00627] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 07/24/2023]
Abstract
The combination of library-based screening and artificial intelligence (AI) has been accelerating the discovery and optimization of hit ligands. However, the potential of AI to assist in de novo macrocyclic peptide ligand discovery has yet to be fully explored. In this study, an integrated AI framework called PepScaf was developed to extract the critical scaffold relative to bioactivity based on a vast dataset from an initial in vitro selection campaign against a model protein target, interleukin-17C (IL-17C). Taking the generated scaffold, a focused macrocyclic peptide library was rationally constructed to target IL-17C, yielding over 20 potent peptides that effectively inhibited IL-17C/IL-17RE interaction. Notably, the top two peptides displayed exceptional potency with IC50 values of 1.4 nM. This approach presents a viable methodology for more efficient macrocyclic peptide discovery, offering potential time and cost savings. Additionally, this is also the first report regarding the discovery of macrocyclic peptides against IL-17C/IL-17RE interaction.
Affiliation(s)
- Silong Zhai
  - School of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, China
- Yahong Tan
  - State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao 266237, China
- Chengyun Zhang
  - School of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, China
- Christopher John Hipolito
  - Screening & Compound Profiling, Quantitative Biosciences, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States
- Lulu Song
  - State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao 266237, China
- Cheng Zhu
  - School of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, China
- Youming Zhang
  - State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao 266237, China
- Hongliang Duan
  - School of Pharmaceutical Sciences, Zhejiang University of Technology, Hangzhou 310014, China
- Yizhen Yin
  - State Key Laboratory of Microbial Technology, Institute of Microbial Technology, Shandong University, Qingdao 266237, China
  - Shandong Research Institute of Industrial Technology, Jinan 250101, China
29
Chenthamarakshan V, Hoffman SC, Owen CD, Lukacik P, Strain-Damerell C, Fearon D, Malla TR, Tumber A, Schofield CJ, Duyvesteyn HM, Dejnirattisai W, Carrique L, Walter TS, Screaton GR, Matviiuk T, Mojsilovic A, Crain J, Walsh MA, Stuart DI, Das P. Accelerating drug target inhibitor discovery with a deep generative foundation model. Sci Adv 2023; 9:eadg7865. [PMID: 37343087 PMCID: PMC10284550 DOI: 10.1126/sciadv.adg7865] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/20/2023] [Accepted: 05/17/2023] [Indexed: 06/23/2023]
Abstract
Inhibitor discovery for emerging drug-target proteins is challenging, especially when target structure or active molecules are unknown. Here, we experimentally validate the broad utility of a deep generative framework trained at scale on protein sequences, small molecules, and their mutual interactions, unbiased toward any specific target. We performed a protein sequence-conditioned sampling on the generative foundation model to design small-molecule inhibitors for two dissimilar targets: the spike protein receptor-binding domain (RBD) and the main protease from SARS-CoV-2. Despite using only the target sequence information during the model inference, micromolar-level inhibition was observed in vitro for two candidates out of four synthesized for each target. The most potent spike RBD inhibitor exhibited activity against several variants in live virus neutralization assays. These results establish that a single, broadly deployable generative foundation model for accelerated inhibitor discovery is effective and efficient, even in the absence of target structure or binder information.
Affiliation(s)
- Samuel C. Hoffman
  - IBM Research, Thomas J. Watson Research Center, Yorktown Heights, New York, NY, USA
- C. David Owen
  - Diamond Light Source Ltd., Harwell Science and Innovation Campus, OX11 0DE Didcot, UK
  - Research Complex at Harwell, Harwell Science and Innovation Campus, OX11 0FA Didcot, UK
- Petra Lukacik
  - Diamond Light Source Ltd., Harwell Science and Innovation Campus, OX11 0DE Didcot, UK
  - Research Complex at Harwell, Harwell Science and Innovation Campus, OX11 0FA Didcot, UK
- Claire Strain-Damerell
  - Diamond Light Source Ltd., Harwell Science and Innovation Campus, OX11 0DE Didcot, UK
  - Research Complex at Harwell, Harwell Science and Innovation Campus, OX11 0FA Didcot, UK
- Daren Fearon
  - Diamond Light Source Ltd., Harwell Science and Innovation Campus, OX11 0DE Didcot, UK
  - Research Complex at Harwell, Harwell Science and Innovation Campus, OX11 0FA Didcot, UK
- Tika R. Malla
  - Chemistry Research Laboratory, Department of Chemistry and the Ineos Oxford Institute for Antimicrobial Research, University of Oxford, 12 Mansfield Road, OX1 3TA Oxford, UK
- Anthony Tumber
  - Chemistry Research Laboratory, Department of Chemistry and the Ineos Oxford Institute for Antimicrobial Research, University of Oxford, 12 Mansfield Road, OX1 3TA Oxford, UK
- Christopher J. Schofield
  - Chemistry Research Laboratory, Department of Chemistry and the Ineos Oxford Institute for Antimicrobial Research, University of Oxford, 12 Mansfield Road, OX1 3TA Oxford, UK
- Helen M.E. Duyvesteyn
  - Division of Structural Biology, University of Oxford, The Wellcome Centre for Human Genetics, Headington, Oxford, UK
- Wanwisa Dejnirattisai
  - Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK
- Loic Carrique
  - Division of Structural Biology, University of Oxford, The Wellcome Centre for Human Genetics, Headington, Oxford, UK
- Thomas S. Walter
  - Division of Structural Biology, University of Oxford, The Wellcome Centre for Human Genetics, Headington, Oxford, UK
- Gavin R. Screaton
  - Wellcome Centre for Human Genetics, Nuffield Department of Medicine, University of Oxford, Oxford OX3 7BN, UK
- Jason Crain
  - IBM Research Europe, Hartree Centre, Daresbury WA4 4AD, UK
  - Department of Biochemistry, University of Oxford, Oxford OX1 3QU, UK
- Martin A. Walsh
  - Diamond Light Source Ltd., Harwell Science and Innovation Campus, OX11 0DE Didcot, UK
  - Research Complex at Harwell, Harwell Science and Innovation Campus, OX11 0FA Didcot, UK
- David I. Stuart
  - Diamond Light Source Ltd., Harwell Science and Innovation Campus, OX11 0DE Didcot, UK
  - Division of Structural Biology, University of Oxford, The Wellcome Centre for Human Genetics, Headington, Oxford, UK
- Payel Das
  - IBM Research, Thomas J. Watson Research Center, Yorktown Heights, New York, NY, USA
30
Srivathsa AV, Sadashivappa NM, Hegde AK, Radha S, Mahesh AR, Ammunje DN, Sen D, Theivendren P, Govindaraj S, Kunjiappan S, Pavadai P. A Review on Artificial Intelligence Approaches and Rational Approaches in Drug Discovery. Curr Pharm Des 2023; 29:1180-1192. [PMID: 37132148 DOI: 10.2174/1381612829666230428110542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Revised: 02/06/2023] [Accepted: 02/27/2023] [Indexed: 05/04/2023]
Abstract
Artificial intelligence (AI) speeds up the drug development process and reduces its time as well as its cost, which is of enormous importance in outbreaks such as COVID-19. It uses a set of machine learning algorithms that collects the available data from resources, categorises, processes and develops novel learning methodologies. Virtual screening is a successful application of AI, which is used in screening huge drug-like databases and filtering to a small number of compounds. The brain's thinking of AI is its neural networking which uses techniques such as Convolutional Neural Network (CNN), Recursive Neural Network (RNN) or Generative Adversarial Neural Network (GANN). The application ranges from small molecule drug discovery to the development of vaccines. In the present review article, we discussed various techniques of drug design, structure and ligand-based, pharmacokinetics and toxicity prediction using AI. The rapid phase of discovery is the need of the hour and AI is a targeted approach to achieve this.
Affiliation(s)
- Anjana Vidya Srivathsa
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
- Nandini Markuli Sadashivappa
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
- Apeksha Krishnamurthy Hegde
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
- Srimathi Radha
  - Department of Pharmaceutical Chemistry, SRM College of Pharmacy, Faculty of Medicine and Health Sciences, SRM Institute of Science and Technology, Chengalpattu District, Kattankulathur, Tamil Nadu, 603203, India
- Agasa Ramu Mahesh
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
- Damodar Nayak Ammunje
  - Department of Pharmacology, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
- Debanjan Sen
  - Department of Pharmaceutical Chemistry, BCDA College of Pharmacy & Technology, Hridaypur, Kolkata, 700127, West Bengal, India
- Panneerselvam Theivendren
  - Department of Pharmaceutical Chemistry, Swamy Vivekanandha College of Pharmacy, Elayampalayam, Tiruchengode, 637205, India
- Saravanan Govindaraj
  - Department of Pharmaceutical Chemistry, MNR College of Pharmacy, Fasalwadi, Sangareddy, 502 001, India
- Selvaraj Kunjiappan
  - Department of Biotechnology, Kalasalingam Academy of Research and Education, Krishnankoil, 626126, India
- Parasuraman Pavadai
  - Department of Pharmaceutical Chemistry, Faculty of Pharmacy, M.S. Ramaiah University of Applied Sciences, M.S.R. Nagar, Bengaluru, 560054, India
31
Janet JP, Mervin L, Engkvist O. Artificial intelligence in molecular de novo design: Integration with experiment. Curr Opin Struct Biol 2023; 80:102575. [PMID: 36966692 DOI: 10.1016/j.sbi.2023.102575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2022] [Revised: 02/09/2023] [Accepted: 02/18/2023] [Indexed: 06/04/2023]
Abstract
In this mini review, we capture the latest progress of applying artificial intelligence (AI) techniques based on deep learning architectures to molecular de novo design with a focus on integration with experimental validation. We will cover the progress and experimental validation of novel generative algorithms, the validation of QSAR models and how AI-based molecular de novo design is starting to become connected with chemistry automation. While progress has been made in the last few years, it is still early days. The experimental validations conducted thus far should be considered proof-of-principle, providing confidence that the field is moving in the right direction.
Affiliation(s)
- Jon Paul Janet
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Lewis Mervin
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
- Ola Engkvist
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
32
Ballarotto M, Willems S, Stiller T, Nawa F, Marschner JA, Grisoni F, Merk D. De Novo Design of Nurr1 Agonists via Fragment-Augmented Generative Deep Learning in Low-Data Regime. J Med Chem 2023. [PMID: 37256819 DOI: 10.1021/acs.jmedchem.3c00485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
Generative neural networks trained on SMILES can design innovative bioactive molecules de novo. These so-called chemical language models (CLMs) have typically been trained on tens of template molecules for fine-tuning. However, it is challenging to apply CLM to orphan targets with few known ligands. We have fine-tuned a CLM with a single potent Nurr1 agonist as template in a fragment-augmented fashion and obtained novel Nurr1 agonists using sampling frequency for design prioritization. Nanomolar potency and binding affinity of the top-ranking design and its structural novelty compared to available Nurr1 ligands highlight its value as an early chemical tool and as a lead for Nurr1 agonist development, as well as the applicability of CLM in very low-data scenarios.
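The idea of using sampling frequency for design prioritization can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the authors' pipeline: it assumes the sampled SMILES strings have already been canonicalized by some toolkit so that identical molecules compare equal as strings, and the function name `rank_by_sampling_frequency` and the toy SMILES are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_by_sampling_frequency(sampled: Iterable[str],
                               known: Iterable[str] = ()) -> List[Tuple[str, int]]:
    """Rank novel designs by how often a model sampled them.

    `sampled` is assumed to hold already-canonicalized SMILES strings;
    anything listed in `known` (e.g. the fine-tuning template) is dropped.
    Ties are broken alphabetically so the ordering is deterministic.
    """
    known_set = set(known)
    counts = Counter(s for s in sampled if s not in known_set)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Toy sample standing in for model output (made-up strings, not real designs).
sampled = ["CCO", "CCN", "CCO", "c1ccccc1", "CCO", "CCN"]
known = ["c1ccccc1"]

print(rank_by_sampling_frequency(sampled, known))  # → [('CCO', 3), ('CCN', 2)]
```

Counting exact string matches is the simplest possible choice; it only works if canonicalization has been applied upstream, which is why that step is called out as an assumption.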
Affiliation(s)
- Marco Ballarotto
  - Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
  - Department of Pharmaceutical Sciences, Università degli Studi di Perugia, 06123 Perugia, Italy
- Sabine Willems
  - Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
- Tanja Stiller
  - Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
- Felix Nawa
  - Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
- Julian A Marschner
  - Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
- Francesca Grisoni
  - Institute for Complex Molecular Systems, Department of Biomedical Engineering, Eindhoven University of Technology, 5612AZ Eindhoven, The Netherlands
  - Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, 3584CB Utrecht, The Netherlands
- Daniel Merk
  - Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
33
Saebi M, Nan B, Herr JE, Wahlers J, Guo Z, Zurański AM, Kogej T, Norrby PO, Doyle AG, Chawla NV, Wiest O. On the use of real-world datasets for reaction yield prediction. Chem Sci 2023; 14:4997-5005. [PMID: 37206399 PMCID: PMC10189898 DOI: 10.1039/d2sc06041h] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 11/01/2022] [Accepted: 03/09/2023] [Indexed: 09/30/2023] Open
Abstract
The lack of publicly available, large, and unbiased datasets is a key bottleneck for the application of machine learning (ML) methods in synthetic chemistry. Data from electronic laboratory notebooks (ELNs) could provide less biased, large datasets, but no such datasets have been made publicly available. The first real-world dataset from the ELNs of a large pharmaceutical company is disclosed and its relationship to high-throughput experimentation (HTE) datasets is described. For chemical yield predictions, a key task in chemical synthesis, an attributed graph neural network (AGNN) performs as well as or better than the best previous models on two HTE datasets for the Suzuki-Miyaura and Buchwald-Hartwig reactions. However, training the AGNN on an ELN dataset does not lead to a predictive model. The implications of using ELN data for training ML-based models are discussed in the context of yield predictions.
Affiliation(s)
- Mandana Saebi
  - Department of Computer Science and Engineering and Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, USA
- Bozhao Nan
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN 46556, USA
- John E Herr
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN 46556, USA
- Jessica Wahlers
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN 46556, USA
- Zhichun Guo
  - Department of Computer Science and Engineering and Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, USA
- Andrzej M Zurański
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, USA
- Thierry Kogej
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Pepparedsleden 1, SE-431 83 Mölndal, Gothenburg, Sweden
- Per-Ola Norrby
  - Data Science and Modelling, Pharmaceutical Sciences, R&D, AstraZeneca, Pepparedsleden 1, SE-431 83 Mölndal, Gothenburg, Sweden
- Abigail G Doyle
  - Department of Chemistry, Princeton University, Princeton, New Jersey 08544, USA
  - Department of Chemistry and Biochemistry, University of California, Los Angeles, California 90095, USA
- Nitesh V Chawla
  - Department of Computer Science and Engineering and Lucy Family Institute for Data and Society, University of Notre Dame, Notre Dame, IN 46556, USA
- Olaf Wiest
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, IN 46556, USA
34
Wang J, Zeng Y, Sun H, Wang J, Wang X, Jin R, Wang M, Zhang X, Cao D, Chen X, Hsieh CY, Hou T. Molecular Generation with Reduced Labeling through Constraint Architecture. J Chem Inf Model 2023. [PMID: 37184885 DOI: 10.1021/acs.jcim.3c00579] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2023]
Abstract
In the past few years, a number of machine learning (ML)-based molecular generative models have been proposed for generating molecules with desirable properties, but they all require a large amount of label data of pharmacological and physicochemical properties. However, experimental determination of these labels, especially bioactivity labels, is very expensive. In this study, we analyze the dependence of various multi-property molecule generation models on biological activity label data and propose Frag-G/M, a fragment-based multi-constraint molecular generation framework based on conditional transformer, recurrent neural networks (RNNs), and reinforcement learning (RL). The experimental results illustrate that, using the same number of labels, Frag-G/M can generate more desired molecules than the baselines (several times more than the baselines). Moreover, compared with the known active compounds, the molecules generated by Frag-G/M exhibit higher scaffold diversity than those generated by the baselines, thus making it more promising to be used in real-world drug discovery scenarios.
Collapse
Affiliation(s)
- Jike Wang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, Zhejiang 310058, P. R. China
- School of Computer Science, Wuhan University, Wuhan, Hubei 430072, P. R. China
| | - Yundian Zeng
- College of Control Science and Engineering, Zhejiang University, Hangzhou, Zhejiang 310027, P. R. China
| | - Huiyong Sun
- Department of Medicinal Chemistry, China Pharmaceutical University, Nanjing, Jiangsu 210009, P. R. China
| | - Junmei Wang
- Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
| | - Xiaorui Wang
- State Key Laboratory of Quality Research in Chinese Medicines, Macau University of Science and Technology, Macau 999078, P. R. China
| | - Ruofan Jin
- College of Life Science, Zhejiang University, Hangzhou, Zhejiang 310027, P. R. China
| | - Mingyang Wang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, Zhejiang 310058, P. R. China
| | - Xujun Zhang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, Zhejiang 310058, P. R. China
| | - Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, Hunan 410004, P. R. China
| | - Xi Chen
- School of Computer Science, Wuhan University, Wuhan, Hubei 430072, P. R. China
| | - Chang-Yu Hsieh
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, Zhejiang 310058, P. R. China
| | - Tingjun Hou
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, Zhejiang 310058, P. R. China
|
35
|
Ji C, Zheng Y, Wang R, Cai Y, Wu H. Graph Polish: A Novel Graph Generation Paradigm for Molecular Optimization. IEEE Trans Neural Netw Learn Syst 2023; 34:2323-2337. [PMID: 34520363 DOI: 10.1109/tnnls.2021.3106392] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 05/04/2023]
Abstract
Molecular optimization, which transforms a given input molecule X into another Y with desired properties, is essential in molecular drug discovery. The traditional approaches either suffer from sample-inefficient learning or ignore information that can be captured with the supervised learning of optimized molecule pairs. In this study, we present a novel molecular optimization paradigm, Graph Polish. In this paradigm, with the guidance of the source and target molecule pairs of the desired properties, a heuristic optimization solution can be derived: given an input molecule, we first predict which atom can be viewed as the optimization center, and then the nearby regions are optimized around this center. We then propose an effective and efficient learning framework, Teacher and Student polish, to capture the dependencies in the optimization steps. A teacher component automatically identifies and annotates the optimization centers and the preservation, removal, and addition of some parts of the molecules; a student component learns these knowledges and applies them to a new molecule. The proposed paradigm can offer an intuitive interpretation for the molecular optimization result. Experiments with multiple optimization tasks are conducted on several benchmark datasets. The proposed approach achieves a significant advantage over the six state-of-the-art baseline methods. Also, extensive studies are conducted to validate the effectiveness, explainability, and time savings of the novel optimization paradigm.
|
36
|
Anstine D, Isayev O. Generative Models as an Emerging Paradigm in the Chemical Sciences. J Am Chem Soc 2023; 145:8736-8750. [PMID: 37052978 PMCID: PMC10141264 DOI: 10.1021/jacs.2c13467] [Citation(s) in RCA: 44] [Impact Index Per Article: 44.0] [Received: 12/17/2022] [Indexed: 04/14/2023]
Abstract
Traditional computational approaches to design chemical species are limited by the need to compute properties for a vast number of candidates, e.g., by discriminative modeling. Therefore, inverse design methods aim to start from the desired property and optimize a corresponding chemical structure. From a machine learning viewpoint, the inverse design problem can be addressed through so-called generative modeling. Mathematically, discriminative models are defined by learning the probability distribution function of properties given the molecular or material structure. In contrast, a generative model seeks to exploit the joint probability of a chemical species with target characteristics. The overarching idea of generative modeling is to implement a system that produces novel compounds that are expected to have a desired set of chemical features, effectively sidestepping issues found in the forward design process. In this contribution, we overview and critically analyze popular generative algorithms like generative adversarial networks, variational autoencoders, flow, and diffusion models. We highlight key differences between each of the models, provide insights into recent success stories, and discuss outstanding challenges for realizing generative modeling discovered solutions in chemical applications.
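As a compact illustration of the distinction this abstract draws, the three quantities can be written side by side. The notation is an assumption made here for illustration and is not taken from the paper: x denotes a chemical structure and y a property of interest.

```latex
\begin{align}
\text{discriminative model:}\quad & p(y \mid x) \\
\text{generative model (joint):}\quad & p(x, y) \\
\text{inverse design:}\quad & p(x \mid y) = \frac{p(x, y)}{p(y)} = \frac{p(x, y)}{\sum_{x'} p(x', y)}
\end{align}
```

Sampling from p(x | y) for a desired y is what would let inverse design start from the property, rather than evaluating p(y | x) for a vast number of candidate structures.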
Affiliation(s)
- Dylan M. Anstine
- Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Olexandr Isayev
- Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
|
37
|
Salas-Estrada L, Provasi D, Qui X, Kaniskan HÜ, Huang XP, DiBerto JF, Ribeiro JML, Jin J, Roth BL, Filizola M. De Novo Design of κ-Opioid Receptor Antagonists Using a Generative Deep Learning Framework. bioRxiv 2023:2023.04.25.537995. [PMID: 37162828 PMCID: PMC10168226 DOI: 10.1101/2023.04.25.537995] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2023]
Abstract
Likely effective pharmacological interventions for the treatment of opioid addiction include attempts to attenuate brain reward deficits during periods of abstinence. Pharmacological blockade of the κ-opioid receptor (KOR) has been shown to abolish brain reward deficits in rodents during withdrawal, as well as to reduce the escalation of opioid use in rats with extended access to opioids. Although KOR antagonists represent promising candidates for the treatment of opioid addiction, very few potent selective KOR antagonists are known to date and most of them exhibit significant safety concerns. Here, we used a generative deep learning framework for the de novo design of chemotypes with putative KOR antagonistic activity. Molecules generated by models trained with this framework were prioritized for chemical synthesis based on their predicted optimal interactions with the receptor. Our models and proposed training protocol were experimentally validated by binding and functional assays.
|
38
|
Liu X, Zhang W, Tong X, Zhong F, Li Z, Xiong Z, Xiong J, Wu X, Fu Z, Tan X, Liu Z, Zhang S, Jiang H, Li X, Zheng M. MolFilterGAN: a progressively augmented generative adversarial network for triaging AI-designed molecules. J Cheminform 2023; 15:42. [PMID: 37031191 PMCID: PMC10082991 DOI: 10.1186/s13321-023-00711-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2022] [Accepted: 03/14/2023] [Indexed: 04/10/2023] Open
Abstract
Artificial intelligence (AI)-based molecular design methods, especially deep generative models for generating novel molecule structures, have gratified our imagination to explore unknown chemical space without relying on brute-force exploration. However, whether designed by AI or human experts, the molecules need to be accessibly synthesized and biologically evaluated, and the trial-and-error process remains a resources-intensive endeavor. Therefore, AI-based drug design methods face a major challenge of how to prioritize the molecular structures with potential for subsequent drug development. This study indicates that common filtering approaches based on traditional screening metrics fail to differentiate AI-designed molecules. To address this issue, we propose a novel molecular filtering method, MolFilterGAN, based on a progressively augmented generative adversarial network. Comparative analysis shows that MolFilterGAN outperforms conventional screening approaches based on drug-likeness or synthetic ability metrics. Retrospective analysis of AI-designed discoidin domain receptor 1 (DDR1) inhibitors shows that MolFilterGAN significantly increases the efficiency of molecular triaging. Further evaluation of MolFilterGAN on eight external ligand sets suggests that MolFilterGAN is useful in triaging or enriching bioactive compounds across a wide range of target types. These results highlighted the importance of MolFilterGAN in evaluating molecules integrally and further accelerating molecular discovery especially combined with advanced AI generative models.
Affiliation(s)
- Xiaohong Liu
- Shanghai Institute for Advanced Immunochemical Studies, and School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- AlphaMa Inc., No. 108, Yuxin Road, Suzhou Industrial Park, Suzhou, 215128, China
| | - Wei Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
| | - Xiaochu Tong
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
| | - Feisheng Zhong
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
| | - Zhaojun Li
- AlphaMa Inc., No. 108, Yuxin Road, Suzhou Industrial Park, Suzhou, 215128, China
| | - Zhaoping Xiong
- Shanghai Institute for Advanced Immunochemical Studies, and School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
| | - Jiacheng Xiong
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
| | - Xiaolong Wu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- School of Pharmacy, East China University of Science and Technology, 130 Meilong Road, Shanghai, 200237, China
| | - Zunyun Fu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
| | - Xiaoqin Tan
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- ByteDance AI Lab, No. 1999 Yishan Road, Shanghai, 201103, China
| | - Zhiguo Liu
- AlphaMa Inc., No. 108, Yuxin Road, Suzhou Industrial Park, Suzhou, 215128, China
| | - Sulin Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
| | - Hualiang Jiang
- Shanghai Institute for Advanced Immunochemical Studies, and School of Life Science and Technology, ShanghaiTech University, 393 Middle Huaxia Road, Shanghai, 201210, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
| | - Xutong Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China.
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China.
| | - Mingyue Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China.
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China.
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, 310024, Hangzhou, China.
|
39
|
Grisoni F. Chemical language models for de novo drug design: Challenges and opportunities. Curr Opin Struct Biol 2023; 79:102527. [PMID: 36738564 DOI: 10.1016/j.sbi.2023.102527] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 10/21/2022] [Revised: 12/07/2022] [Accepted: 12/20/2022] [Indexed: 02/05/2023]
Abstract
Generative deep learning is accelerating de novo drug design, by allowing the generation of molecules with desired properties on demand. Chemical language models - which generate new molecules in the form of strings using deep learning - have been particularly successful in this endeavour. Thanks to advances in natural language processing methods and interdisciplinary collaborations, chemical language models are expected to become increasingly relevant in drug discovery. This minireview provides an overview of the current state-of-the-art of chemical language models for de novo design, and analyses current limitations, challenges, and advantages. Finally, a perspective on future opportunities is provided.
Affiliation(s)
- Francesca Grisoni
- Eindhoven University of Technology, Institute for Complex Molecular Systems and Dept. Biomedical Engineering, Eindhoven, Netherlands; Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Netherlands.
|
40
|
Isert C, Atz K, Schneider G. Structure-based drug design with geometric deep learning. Curr Opin Struct Biol 2023; 79:102548. [PMID: 36842415 DOI: 10.1016/j.sbi.2023.102548] [Citation(s) in RCA: 30] [Impact Index Per Article: 30.0] [Received: 10/24/2022] [Revised: 01/16/2023] [Accepted: 01/24/2023] [Indexed: 02/26/2023]
Abstract
Structure-based drug design uses three-dimensional geometric information of macromolecules, such as proteins or nucleic acids, to identify suitable ligands. Geometric deep learning, an emerging concept of neural-network-based machine learning, has been applied to macromolecular structures. This review provides an overview of the recent applications of geometric deep learning in bioorganic and medicinal chemistry, highlighting its potential for structure-based drug discovery and design. Emphasis is placed on molecular property prediction, ligand binding site and pose prediction, and structure-based de novo molecular design. The current challenges and opportunities are highlighted, and a forecast of the future of geometric deep learning for drug discovery is presented.
Affiliation(s)
- Clemens Isert
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, Zurich, 8093, Switzerland
| | - Kenneth Atz
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, Zurich, 8093, Switzerland
| | - Gisbert Schneider
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, Zurich, 8093, Switzerland; ETH Singapore SEC Ltd, 1 CREATE Way, #06-01 CREATE Tower, Singapore, 8093, Singapore.
|
41
|
Chen Y, Wang Z, Wang L, Wang J, Li P, Cao D, Zeng X, Ye X, Sakurai T. Deep generative model for drug design from protein target sequence. J Cheminform 2023; 15:38. [PMID: 36978179 PMCID: PMC10052801 DOI: 10.1186/s13321-023-00702-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 07/31/2022] [Accepted: 02/18/2023] [Indexed: 03/30/2023] Open
Abstract
Drug discovery for a protein target is a laborious and costly process. Deep learning (DL) methods have been applied to drug discovery and successfully generated novel molecular structures, and they can substantially reduce development time and costs. However, most of them rely on prior knowledge, either by drawing on the structure and properties of known molecules to generate similar candidate molecules or extracting information on the binding sites of protein pockets to obtain molecules that can bind to them. In this paper, DeepTarget, an end-to-end DL model, was proposed to generate novel molecules solely relying on the amino acid sequence of the target protein to reduce the heavy reliance on prior knowledge. DeepTarget includes three modules: Amino Acid Sequence Embedding (AASE), Structural Feature Inference (SFI), and Molecule Generation (MG). AASE generates embeddings from the amino acid sequence of the target protein. SFI inferences the potential structural features of the synthesized molecule, and MG seeks to construct the eventual molecule. The validity of the generated molecules was demonstrated by a benchmark platform of molecular generation models. The interaction between the generated molecules and the target proteins was also verified on the basis of two metrics, drug-target affinity and molecular docking. The results of the experiments indicated the efficacy of the model for direct molecule generation solely conditioned on amino acid sequence.
Affiliation(s)
- Yangyang Chen
- Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan.
| | - Zixu Wang
- Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
| | - Lei Wang
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410013, Hunan, China
| | - Jianmin Wang
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon, 21983, Republic of Korea
- Bioinformatics and Molecular Design Research Center (BMDRC), Incheon, 21983, Republic of Korea
| | - Pengyong Li
- School of Computer Science and Technology, Xidian University, Xian, 710071, China
| | - Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410013, Hunan, China.
| | - Xiangxiang Zeng
- College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, Hunan, People's Republic of China.
| | - Xiucai Ye
- Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan.
| | - Tetsuya Sakurai
- Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
|
42
|
Grasso D, Galderisi S, Santucci A, Bernini A. Pharmacological Chaperones and Protein Conformational Diseases: Approaches of Computational Structural Biology. Int J Mol Sci 2023; 24:5819. [PMID: 36982893 PMCID: PMC10054308 DOI: 10.3390/ijms24065819] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 02/05/2023] [Revised: 03/09/2023] [Accepted: 03/16/2023] [Indexed: 03/30/2023] Open
Abstract
Whenever a protein fails to fold into its native structure, a profound detrimental effect is likely to occur, and a disease is often developed. Protein conformational disorders arise when proteins adopt abnormal conformations due to a pathological gene variant that turns into gain/loss of function or improper localization/degradation. Pharmacological chaperones are small molecules restoring the correct folding of a protein suitable for treating conformational diseases. Small molecules like these bind poorly folded proteins similarly to physiological chaperones, bridging non-covalent interactions (hydrogen bonds, electrostatic interactions, and van der Waals contacts) loosened or lost due to mutations. Pharmacological chaperone development involves, among other things, structural biology investigation of the target protein and its misfolding and refolding. Such research can take advantage of computational methods at many stages. Here, we present an up-to-date review of the computational structural biology tools and approaches regarding protein stability evaluation, binding pocket discovery and druggability, drug repurposing, and virtual ligand screening. The tools are presented as organized in an ideal workflow oriented at pharmacological chaperones' rational design, also with the treatment of rare diseases in mind.
Affiliation(s)
- Daniela Grasso
- Department of Biotechnology, Chemistry, and Pharmacy, University of Siena, 53100 Siena, Italy
| | - Silvia Galderisi
- Department of Biotechnology, Chemistry, and Pharmacy, University of Siena, 53100 Siena, Italy
| | - Annalisa Santucci
- Department of Biotechnology, Chemistry, and Pharmacy, University of Siena, 53100 Siena, Italy
| | - Andrea Bernini
- Department of Biotechnology, Chemistry, and Pharmacy, University of Siena, 53100 Siena, Italy
|
43
|
Ivanenkov YA, Polykovskiy D, Bezrukov D, Zagribelnyy B, Aladinskiy V, Kamya P, Aliper A, Ren F, Zhavoronkov A. Chemistry42: An AI-Driven Platform for Molecular Design and Optimization. J Chem Inf Model 2023; 63:695-701. [PMID: 36728505 PMCID: PMC9930109 DOI: 10.1021/acs.jcim.2c01191] [Citation(s) in RCA: 29] [Impact Index Per Article: 29.0] [Received: 09/22/2022] [Indexed: 02/03/2023]
Abstract
Chemistry42 is a software platform for de novo small molecule design and optimization that integrates Artificial Intelligence (AI) techniques with computational and medicinal chemistry methodologies. Chemistry42 efficiently generates novel molecular structures with optimized properties validated in both in vitro and in vivo studies and is available through licensing or collaboration. Chemistry42 is the core component of Insilico Medicine's Pharma.ai drug discovery suite. Pharma.ai also includes PandaOmics for target discovery and multiomics data analysis, and inClinico─a data-driven multimodal forecast of a clinical trial's probability of success (PoS). In this paper, we demonstrate how the platform can be used to efficiently find novel molecular structures against DDR1 and CDK20.
Affiliation(s)
- Yan A. Ivanenkov
- Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W, Phase 2, Hong Kong Science Park, Pak Shek Kok, Hong Kong
| | - Daniil Polykovskiy
- Insilico Medicine Canada Inc., 3710-1250 René-Lévesque Blvd W, Montreal, Quebec, H3B 4W8 Canada
| | - Dmitry Bezrukov
- Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W, Phase 2, Hong Kong Science Park, Pak Shek Kok, Hong Kong
| | - Bogdan Zagribelnyy
- Insilico Medicine AI Limited, Level 6, Unit 08, Block A, IRENA HQ Building, Masdar City, PO Box 145748, Abu Dhabi, UAE
| | - Vladimir Aladinskiy
- Insilico Medicine AI Limited, Level 6, Unit 08, Block A, IRENA HQ Building, Masdar City, PO Box 145748, Abu Dhabi, UAE
| | - Petrina Kamya
- Insilico Medicine Canada Inc., 3710-1250 René-Lévesque Blvd W, Montreal, Quebec, H3B 4W8 Canada
| | - Alex Aliper
- Insilico Medicine AI Limited, Level 6, Unit 08, Block A, IRENA HQ Building, Masdar City, PO Box 145748, Abu Dhabi, UAE
| | - Feng Ren
- Insilico Medicine Shanghai Ltd., Suite 901, Tower C, Changtai Plaza, 2889 Jinke Road, Pudong New District, Shanghai 201203, China
| | - Alex Zhavoronkov
- Insilico Medicine Hong Kong Ltd., Unit 310, 3/F, Building 8W, Phase 2, Hong Kong Science Park, Pak Shek Kok, Hong Kong
|
44
|
Kumar M, Nguyen TPN, Kaur J, Singh TG, Soni D, Singh R, Kumar P. Opportunities and challenges in application of artificial intelligence in pharmacology. Pharmacol Rep 2023; 75:3-18. [PMID: 36624355 PMCID: PMC9838466 DOI: 10.1007/s43440-022-00445-1] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 10/20/2022] [Revised: 12/23/2022] [Accepted: 12/25/2022] [Indexed: 01/11/2023]
Abstract
Artificial intelligence (AI) is a machine science that can mimic human behaviour like intelligent analysis of data. AI functions with specialized algorithms and integrates with deep and machine learning. Living in the digital world can generate a huge amount of medical data every day. Therefore, we need an automated and reliable evaluation tool that can make decisions more accurately and faster. Machine learning has the potential to learn, understand and analyse the data used in healthcare systems. In the last few years, AI is known to be employed in various fields in pharmaceutical science especially in pharmacological research. It helps in the analysis of preclinical (laboratory animals) and clinical (in human) trial data. AI also plays important role in various processes such as drug discovery/manufacturing, diagnosis of big data for disease identification, personalized treatment, clinical trial research, radiotherapy, surgical robotics, smart electronic health records, and epidemic outbreak prediction. Moreover, AI has been used in the evaluation of biomarkers and diseases. In this review, we explain various models and general processes of machine learning and their role in pharmacological science. Therefore, AI with deep learning and machine learning could be relevant in pharmacological research.
Affiliation(s)
- Mandeep Kumar
- Department of Pharmacy, Unit of Pharmacology and Toxicology, University of Genoa, Genoa, Italy
| | - T P Nhung Nguyen
- Department of Pharmacy, Unit of Pharmacology and Toxicology, University of Genoa, Genoa, Italy
- Department of Pharmacy, Da Nang University of Medical Technology and Pharmacy, Da Nang, Vietnam
| | - Jasleen Kaur
- Department of Pharmacology and Toxicology, National Institute of Pharmaceutical Education and Research (NIPER), Lucknow, Uttar Pradesh, 226002, India
| | | | - Divya Soni
- Department of Pharmacology, Central University of Punjab, Ghudda, Bathinda, Punjab, 151401, India
| | - Randhir Singh
- Department of Pharmacology, Central University of Punjab, Ghudda, Bathinda, Punjab, 151401, India
| | - Puneet Kumar
- Department of Pharmacology, Central University of Punjab, Ghudda, Bathinda, Punjab, 151401, India.
|
45
|
Mondal PP, Galodha A, Verma VK, Singh V, Show PL, Awasthi MK, Lall B, Anees S, Pollmann K, Jain R. Review on machine learning-based bioprocess optimization, monitoring, and control systems. Bioresour Technol 2023; 370:128523. [PMID: 36565820 DOI: 10.1016/j.biortech.2022.128523] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 10/29/2022] [Revised: 12/17/2022] [Accepted: 12/20/2022] [Indexed: 06/17/2023]
Abstract
Machine Learning is quickly becoming an impending game changer for transforming big data thrust from the bioprocessing industry into actionable output. However, the complex data set from bioprocess, lagging cyber-integrated sensor system, and issues with storage scalability limit machine learning real-time application. Hence, it is imperative to know the state of technology to address prevailing issues. This review first gives an insight into the basic understanding of the machine learning domain and discusses its complexities for more comprehensive applications. Followed by an outline of how relevant machine learning models are for statistical and logical analysis of the enormous datasets generated to control bioprocess operations. Then this review critically discusses the current knowledge, its limitations, and future aspects in different subfields of the bioprocessing industry. Further, this review discusses the prospects of adopting a hybrid method to dovetail different modeling strategies, cyber-networking, and integrated sensors to develop new digital biotechnologies.
Affiliation(s)
- Partha Pratim Mondal
- Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz-Khas, New Delhi 110016, India
| | - Abhinav Galodha
- School of Interdisciplinary Research, Indian Institute of Technology Delhi, Hauz-Khas, New Delhi 110016, India
| | - Vishal Kumar Verma
- Department of Biochemical Engineering and Biotechnology, Indian Institute of Technology Delhi, Hauz-Khas, New Delhi 110016, India
| | - Vijai Singh
- Department of Biosciences, School of Science, Indrashil University, Rajpur, Mehsana, 382715, Gujarat, India
| | - Pau Loke Show
- Zhejiang Provincial Key Laboratory for Subtropical Water Environment and Marine Biological Resources Protection, Wenzhou University, Wenzhou 325035, China; Department of Sustainable Engineering, Saveetha School of Engineering, SIMATS, Chennai 602105, India; Department of Chemical and Environmental Engineering, University of Nottingham, Malaysia, 43500 Semenyih, Selangor Darul Ehsan, Malaysia
| | - Mukesh Kumar Awasthi
- College of Natural Resources and Environment, Northwest A&F University, Yangling, Shaanxi Province 712100, China
| | - Brejesh Lall
- Electrical Engineering Department, Indian Institute of Technology Delhi, Hauz-Khas, New Delhi 110016, India
| | - Sanya Anees
- Department of Electronics and Communication Engineering, Indian Institute of Information Technology Guwahati, Bongora, Guwahati 781015, India
| | - Katrin Pollmann
- Helmholtz-Zentrum Dresden-Rossendorf, Helmholtz Institute Freiberg for Resource Technology, Bautzner Landstrasse 400, 01328 Dresden, Germany
| | - Rohan Jain
- Helmholtz-Zentrum Dresden-Rossendorf, Helmholtz Institute Freiberg for Resource Technology, Bautzner Landstrasse 400, 01328 Dresden, Germany.
|
46
|
Sarkar C, Das B, Rawat VS, Wahlang JB, Nongpiur A, Tiewsoh I, Lyngdoh NM, Das D, Bidarolli M, Sony HT. Artificial Intelligence and Machine Learning Technology Driven Modern Drug Discovery and Development. Int J Mol Sci 2023; 24:2026. [PMID: 36768346 PMCID: PMC9916967 DOI: 10.3390/ijms24032026] [Citation(s) in RCA: 27] [Impact Index Per Article: 27.0] [Received: 11/29/2022] [Revised: 12/27/2022] [Accepted: 12/28/2022] [Indexed: 01/22/2023] Open
Abstract
The discovery and advances of medicines may be considered as the ultimate relevant translational science effort that adds to human invulnerability and happiness. But advancing a fresh medication is a quite convoluted, costly, and protracted operation, normally costing USD ~2.6 billion and consuming a mean time span of 12 years. Methods to cut back expenditure and hasten new drug discovery have prompted an arduous and compelling brainstorming exercise in the pharmaceutical industry. The engagement of Artificial Intelligence (AI), including the deep-learning (DL) component in particular, has been facilitated by the employment of classified big data, in concert with strikingly reinforced computing prowess and cloud storage, across all fields. AI has energized computer-facilitated drug discovery. An unrestricted espousing of machine learning (ML), especially DL, in many scientific specialties, and the technological refinements in computing hardware and software, in concert with various aspects of the problem, sustain this progress. ML algorithms have been extensively engaged for computer-facilitated drug discovery. DL methods, such as artificial neural networks (ANNs) comprising multiple buried processing layers, have of late seen a resurgence due to their capability to power automatic attribute elicitations from the input data, coupled with their ability to obtain nonlinear input-output pertinencies. Such features of DL methods augment classical ML techniques which bank on human-contrived molecular descriptors. A major part of the early reluctance concerning utility of AI in pharmaceutical discovery has begun to melt, thereby advancing medicinal chemistry. AI, along with modern experimental technical knowledge, is anticipated to invigorate the quest for new and improved pharmaceuticals in an expeditious, economical, and increasingly compelling manner. DL-facilitated methods have just initiated kickstarting for some integral issues in drug discovery. 
Many technological advances, such as "message-passing paradigms", "spatial-symmetry-preserving networks", "hybrid de novo design", and other ingenious ML exemplars, will definitely come to be pervasively widespread and help dissect many of the biggest, and most intriguing inquiries. Open data allocation and model augmentation will exert a decisive hold during the progress of drug discovery employing AI. This review will address the impending utilizations of AI to refine and bolster the drug discovery operation.
Affiliation(s)
- Chayna Sarkar
- Department of Pharmacology, North Eastern Indira Gandhi Regional Institute of Health and Medical Sciences (NEIGRIHMS), Mawdiangdiang, Shillong 793018, Meghalaya, India
| | - Biswadeep Das
- Department of Pharmacology, All India Institute of Medical Sciences (AIIMS), Virbhadra Road, Rishikesh 249203, Uttarakhand, India
- Correspondence: ; Tel./Fax: +91-135-708-856-0009
| | - Vikram Singh Rawat
- Department of Psychiatry, All India Institute of Medical Sciences (AIIMS), Virbhadra Road, Rishikesh 249203, Uttarakhand, India
| | - Julie Birdie Wahlang
- Department of Pharmacology, North Eastern Indira Gandhi Regional Institute of Health and Medical Sciences (NEIGRIHMS), Mawdiangdiang, Shillong 793018, Meghalaya, India
| | - Arvind Nongpiur
- Department of Psychiatry, North Eastern Indira Gandhi Regional Institute of Health and Medical Sciences (NEIGRIHMS), Mawdiangdiang, Shillong 793018, Meghalaya, India
| | - Iadarilang Tiewsoh
- Department of Medicine, North Eastern Indira Gandhi Regional Institute of Health and Medical Sciences (NEIGRIHMS), Mawdiangdiang, Shillong 793018, Meghalaya, India
| | - Nari M. Lyngdoh
- Department of Anesthesiology, North Eastern Indira Gandhi Regional Institute of Health and Medical Sciences (NEIGRIHMS), Mawdiangdiang, Shillong 793018, Meghalaya, India
| | - Debasmita Das
- Department of Computer Science and Engineering, Vellore Institute of Technology, Vellore Campus, Tiruvalam Road, Katpadi, Vellore 632014, Tamil Nadu, India
| | - Manjunath Bidarolli
- Department of Pharmacology, All India Institute of Medical Sciences (AIIMS), Virbhadra Road, Rishikesh 249203, Uttarakhand, India
| | - Hannah Theresa Sony
- Department of Pharmacology, All India Institute of Medical Sciences (AIIMS), Virbhadra Road, Rishikesh 249203, Uttarakhand, India
| |
|
47
|
Moret M, Pachon Angona I, Cotos L, Yan S, Atz K, Brunner C, Baumgartner M, Grisoni F, Schneider G. Leveraging molecular structure and bioactivity with chemical language models for de novo drug design. Nat Commun 2023; 14:114. [PMID: 36611029 PMCID: PMC9825622 DOI: 10.1038/s41467-022-35692-6] [Citation(s) in RCA: 35] [Impact Index Per Article: 35.0] [Received: 10/26/2021] [Accepted: 12/19/2022] [Indexed: 01/09/2023] Open
Abstract
Generative chemical language models (CLMs) can be used for de novo molecular structure generation by learning from a textual representation of molecules. Here, we show that hybrid CLMs can additionally leverage the bioactivity information available for the training compounds. To computationally design ligands of phosphoinositide 3-kinase gamma (PI3Kγ), a collection of virtual molecules was created with a generative CLM. This virtual compound library was refined using a CLM-based classifier for bioactivity prediction. This second hybrid CLM was pretrained with patented molecular structures and fine-tuned with known PI3Kγ ligands. Several of the computer-generated molecular designs were commercially available, enabling fast prescreening and preliminary experimental validation. A new PI3Kγ ligand with sub-micromolar activity was identified, highlighting the method's scaffold-hopping potential. Chemical synthesis and biochemical testing of two of the top-ranked de novo designed molecules and their derivatives corroborated the model's ability to generate PI3Kγ ligands with medium to low nanomolar activity for hit-to-lead expansion. The most potent compounds led to pronounced inhibition of PI3K-dependent Akt phosphorylation in a medulloblastoma cell model, demonstrating efficacy of PI3Kγ ligands in PI3K/Akt pathway repression in human tumor cells. The results positively advocate hybrid CLMs for virtual compound screening and activity-focused molecular design.
Affiliation(s)
- Michael Moret
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Irene Pachon Angona
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Leandro Cotos
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Shen Yan
- University of Zurich, University Children's Hospital, Children's Research Center, Pediatric Molecular Neuro-Oncology Research, Lengghalde 5, 8008, Zurich, Switzerland
| | - Kenneth Atz
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Cyrill Brunner
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Martin Baumgartner
- University of Zurich, University Children's Hospital, Children's Research Center, Pediatric Molecular Neuro-Oncology Research, Lengghalde 5, 8008, Zurich, Switzerland
| | - Francesca Grisoni
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland.
- Eindhoven University of Technology, Institute for Complex Molecular Systems and Eindhoven Artificial Intelligence Systems Institute, Department of Biomedical Engineering, Groene Loper 7, 5612AZ, Eindhoven, The Netherlands.
- Center for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Utrecht, 3584 CB, The Netherlands.
| | - Gisbert Schneider
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland.
- ETH Singapore SEC Ltd, 1 CREATE Way, #06-01 CREATE Tower, Singapore, 138602, Singapore.
| |
|
48
|
Atz K, Guba W, Grether U, Schneider G. Machine Learning and Computational Chemistry for the Endocannabinoid System. Methods Mol Biol 2023; 2576:477-493. [PMID: 36152211 DOI: 10.1007/978-1-0716-2728-0_39] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Computational methods in medicinal chemistry facilitate drug discovery and design. In particular, machine learning methodologies have recently gained increasing attention. This chapter provides a structured overview of the current state of computational chemistry and its applications for the interrogation of the endocannabinoid system (ECS), highlighting methods in structure-based drug design, virtual screening, ligand-based quantitative structure-activity relationship (QSAR) modeling, and de novo molecular design. We emphasize emerging methods in machine learning and anticipate a forecast of future opportunities of computational medicinal chemistry for the ECS.
Affiliation(s)
- Kenneth Atz
- ETH Zurich, Department of Chemistry and Applied Biosciences, Zurich, Switzerland
| | - Wolfgang Guba
- Roche Pharma Research & Early Development, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland
| | - Uwe Grether
- Roche Pharma Research & Early Development, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland.
| | - Gisbert Schneider
- ETH Zurich, Department of Chemistry and Applied Biosciences, Zurich, Switzerland
- ETH Singapore SEC Ltd, Singapore, Singapore
| |
|
49
|
Umedera K, Yoshimori A, Chen H, Kouji H, Nakamura H, Bajorath J. DeepCubist: Molecular Generator for Designing Peptidomimetics based on Complex three-dimensional scaffolds. J Comput Aided Mol Des 2023; 37:107-115. [PMID: 36462089 PMCID: PMC9876871 DOI: 10.1007/s10822-022-00493-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/07/2022] [Accepted: 11/23/2022] [Indexed: 12/04/2022]
Abstract
Mimicking bioactive conformations of peptide segments involved in the formation of protein-protein interfaces with small molecules is thought to represent a promising strategy for the design of protein-protein interaction (PPI) inhibitors. For compound design, the use of three-dimensional (3D) scaffolds rich in sp3-centers makes it possible to precisely mimic bioactive peptide conformations. Herein, we introduce DeepCubist, a molecular generator for designing peptidomimetics based on 3D scaffolds. Firstly, enumerated 3D scaffolds are superposed on a target peptide conformation to identify a preferred template structure for designing peptidomimetics. Secondly, heteroatoms and unsaturated bonds are introduced into the template via a deep generative model to produce candidate compounds. DeepCubist was applied to design peptidomimetics of exemplary peptide turn, helix, and loop structures in pharmaceutical targets engaging in PPIs.
Affiliation(s)
- Kohei Umedera
- School of Life Science and Technology, Tokyo Institute of Technology, 4259, Nagatsuta-cho, Midori-ku, 226-8503 Yokohama, Japan; Department of Life Science Informatics, LIMES Program Unit Chemical Biology and Medicinal Chemistry, B-IT, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, D-53115 Bonn, Germany
| | - Atsushi Yoshimori
- Institute for Theoretical Medicine, Inc, 26-1, Muraoka-Higashi 2-chome, 251-8555 Fujisawa, Kanagawa, Japan
| | - Hengwei Chen
- Department of Life Science Informatics, LIMES Program Unit Chemical Biology and Medicinal Chemistry, B-IT, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, D-53115 Bonn, Germany
| | - Hiroyuki Kouji
- Oita University Institute of Advanced Medicine, Inc, 17-20, Higashi Kasuga-machi, 870-0037 Oita City, Oita, Japan
| | - Hiroyuki Nakamura
- School of Life Science and Technology, Tokyo Institute of Technology, 4259, Nagatsuta-cho, Midori-ku, 226-8503 Yokohama, Japan; Laboratory for Chemistry and Life Science, Institute of Innovative Research, Tokyo Institute of Technology, 4259, Nagatsuta-cho, Midori-ku, 226-8503 Yokohama, Japan
| | - Jürgen Bajorath
- Department of Life Science Informatics, LIMES Program Unit Chemical Biology and Medicinal Chemistry, B-IT, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, D-53115 Bonn, Germany
| |
|
50
|
van Tilborg D, Alenicheva A, Grisoni F. Exposing the Limitations of Molecular Machine Learning with Activity Cliffs. J Chem Inf Model 2022; 62:5938-5951. [PMID: 36456532 PMCID: PMC9749029 DOI: 10.1021/acs.jcim.2c01073] [Citation(s) in RCA: 46] [Impact Index Per Article: 23.0] [Received: 08/23/2022] [Indexed: 12/03/2022]
Abstract
Machine learning has become a crucial tool in drug discovery and chemistry at large, e.g., to predict molecular properties, such as bioactivity, with high accuracy. However, activity cliffs (pairs of molecules that are highly similar in their structure but exhibit large differences in potency) have received limited attention for their effect on model performance. Not only are these edge cases informative for molecule discovery and optimization but also models that are well equipped to accurately predict the potency of activity cliffs have increased potential for prospective applications. Our work aims to fill the current knowledge gap on best-practice machine learning methods in the presence of activity cliffs. We benchmarked a total of 24 machine and deep learning approaches on curated bioactivity data from 30 macromolecular targets for their performance on activity cliff compounds. While all methods struggled in the presence of activity cliffs, machine learning approaches based on molecular descriptors outperformed more complex deep learning methods. Our findings highlight large case-by-case differences in performance, advocating for (a) the inclusion of dedicated "activity-cliff-centered" metrics during model development and evaluation and (b) the development of novel algorithms to better predict the properties of activity cliffs. To this end, the methods, metrics, and results of this study have been encapsulated into an open-access benchmarking platform named MoleculeACE (Activity Cliff Estimation, available on GitHub at: https://github.com/molML/MoleculeACE). MoleculeACE is designed to steer the community toward addressing the pressing but overlooked limitation of molecular machine learning models posed by activity cliffs.
Affiliation(s)
- Derek van Tilborg
- Institute for Complex Molecular Systems and Dept. Biomedical Engineering, Eindhoven University of Technology, 5612AZ Eindhoven, The Netherlands
- Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, 3584CB Utrecht, The Netherlands
| | | | - Francesca Grisoni
- Institute for Complex Molecular Systems and Dept. Biomedical Engineering, Eindhoven University of Technology, 5612AZ Eindhoven, The Netherlands
- Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, 3584CB Utrecht, The Netherlands
| |
|