1
Nakata S, Mori Y, Tanaka S. Navigating Ultralarge Virtual Chemical Spaces with Product-of-Experts Chemical Language Models. J Chem Inf Model 2024; 64:7873-7884. [PMID: 39413401 DOI: 10.1021/acs.jcim.4c01214] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2024]
Abstract
Ultralarge virtual chemical spaces have emerged as a valuable resource for drug discovery, providing access to billions of make-on-demand compounds with high synthetic success rates. Chemical language models can potentially accelerate the exploration of these vast spaces through direct compound generation. However, existing models are not designed to navigate specific virtual chemical spaces and often overlook synthetic accessibility. To address this gap, we introduce product-of-experts (PoE) chemical language models, a modular and scalable approach to navigating ultralarge virtual chemical spaces. This method allows for controlled compound generation within a desired chemical space by combining a prior model pretrained on the target space with expert and anti-expert models fine-tuned using external property-specific data sets. We demonstrate that the PoE chemical language model can generate compounds with desirable properties, such as those that favorably dock to dopamine receptor D2 (DRD2) and are predicted to cross the blood-brain barrier (BBB), while ensuring that the majority of generated compounds are present within the target chemical space. Our results highlight the potential of chemical language models for navigating ultralarge virtual chemical spaces, and we anticipate that this study will motivate further research in this direction. The source code and data are freely available at https://github.com/shuyana/poeclm.
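The combination of a prior with expert and anti-expert models described in this abstract can be illustrated with a minimal sketch. The abstract does not state the exact formula, so the sketch assumes one common product-of-experts form for next-token decoding: the prior log-probabilities are shifted by a weighted difference between expert and anti-expert log-probabilities and then renormalized. The function names and the weight `alpha` are hypothetical and are not taken from the authors' code.

```python
import math


def log_softmax(logits):
    """Convert raw scores to log-probabilities in a numerically stable way."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def poe_next_token_logprobs(prior_logits, expert_logits, anti_expert_logits, alpha=1.0):
    """Assumed product-of-experts rule: p ∝ p_prior * (p_expert / p_anti) ** alpha.

    All three inputs are per-token scores over the same vocabulary.
    alpha is a hypothetical weight; alpha = 0 recovers the prior.
    """
    prior = log_softmax(prior_logits)
    expert = log_softmax(expert_logits)
    anti = log_softmax(anti_expert_logits)
    # Multiply probabilities by adding log-probabilities, then renormalize.
    combined = [p + alpha * (e - a) for p, e, a in zip(prior, expert, anti)]
    return log_softmax(combined)


# Toy two-token vocabulary: the expert prefers token 0, the anti-expert token 1.
toy = poe_next_token_logprobs([0.0, 0.0], [1.0, 0.0], [0.0, 1.0], alpha=1.0)
```

In this toy case the prior is uniform, so the combined distribution favors the token that the expert prefers and the anti-expert disfavors.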
Affiliation(s)
- Shuya Nakata
- Graduate School of System Informatics, Kobe University, Kobe 657-8501, Japan
- Yoshiharu Mori
- Graduate School of System Informatics, Kobe University, Kobe 657-8501, Japan
- Shigenori Tanaka
- Graduate School of System Informatics, Kobe University, Kobe 657-8501, Japan
2
Hank EC, Sai M, Kasch T, Meijer I, Marschner JA, Merk D. Development of Tailless Homologue Receptor (TLX) Agonist Chemical Tools. J Med Chem 2024; 67:16598-16611. [PMID: 39236094 DOI: 10.1021/acs.jmedchem.4c01443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/07/2024]
Abstract
The human tailless homologue receptor (TLX) is a ligand-activated transcription factor acting as a master regulator of neural stem cell homeostasis. Despite its promising potential in neurodegenerative disease treatment, TLX ligands are rare but required to explore phenotypic effects of TLX modulation and for target validation. We have systematically studied and optimized a TLX agonist scaffold obtained by fragment fusion. Structural modification enabled the development of two TLX agonists endowed with nanomolar potency and binding affinity. Both exhibited favorable chemical tool characteristics including high selectivity and low toxicity. Most notably, the TLX agonists comprise different scaffolds and display high chemical diversity, enabling their use as a set for target identification and validation studies.
Affiliation(s)
- Emily C Hank
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
- Minh Sai
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
- Till Kasch
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
- Isabelle Meijer
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
- Julian A Marschner
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
- Daniel Merk
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
3
Isigkeit L, Hörmann T, Schallmayer E, Scholz K, Lillich FF, Ehrler JHM, Hufnagel B, Büchner J, Marschner JA, Pabel J, Proschak E, Merk D. Automated design of multi-target ligands by generative deep learning. Nat Commun 2024; 15:7946. [PMID: 39261471 PMCID: PMC11390726 DOI: 10.1038/s41467-024-52060-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Accepted: 08/23/2024] [Indexed: 09/13/2024] Open
Abstract
Generative deep learning models enable data-driven de novo design of molecules with tailored features. Chemical language models (CLM) trained on string representations of molecules such as SMILES have been successfully employed to design new chemical entities with experimentally confirmed activity on intended targets. Here, we probe the application of CLM to generate multi-target ligands for designed polypharmacology. We capitalize on the ability of CLM to learn from small fine-tuning sets of molecules and successfully bias the model towards designing drug-like molecules with similarity to known ligands of target pairs of interest. Designs obtained from CLM after pooled fine-tuning are predicted active on both proteins of interest and comprise pharmacophore elements of ligands for both targets in one molecule. Synthesis and testing of twelve computationally favored CLM designs for six target pairs reveals modulation of at least one intended protein by all selected designs with up to double-digit nanomolar potency and confirms seven compounds as designed dual ligands. These results corroborate CLM for multi-target de novo design as source of innovation in drug discovery.
Affiliation(s)
- Laura Isigkeit
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, 60438, Frankfurt, Germany
- Tim Hörmann
- Ludwig-Maximilians-Universität München, Department of Pharmacy, 81377, Munich, Germany
- Espen Schallmayer
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, 60438, Frankfurt, Germany
- Katharina Scholz
- Ludwig-Maximilians-Universität München, Department of Pharmacy, 81377, Munich, Germany
- Felix F Lillich
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, 60438, Frankfurt, Germany
- Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, 60596, Frankfurt, Germany
- Johanna H M Ehrler
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, 60438, Frankfurt, Germany
- Benedikt Hufnagel
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, 60438, Frankfurt, Germany
- Jasmin Büchner
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, 60438, Frankfurt, Germany
- Julian A Marschner
- Ludwig-Maximilians-Universität München, Department of Pharmacy, 81377, Munich, Germany
- Jörg Pabel
- Ludwig-Maximilians-Universität München, Department of Pharmacy, 81377, Munich, Germany
- Ewgenij Proschak
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, 60438, Frankfurt, Germany
- Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, 60596, Frankfurt, Germany
- Daniel Merk
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, 60438, Frankfurt, Germany.
- Ludwig-Maximilians-Universität München, Department of Pharmacy, 81377, Munich, Germany.
4
Reynders M, Willems S, Marschner JA, Wein T, Merk D, Thorn-Seshold O. A High-Quality Photoswitchable Probe that Selectively and Potently Regulates the Transcription Factor RORγ. Angew Chem Int Ed Engl 2024:e202410139. [PMID: 39248642 DOI: 10.1002/anie.202410139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2024] [Revised: 09/05/2024] [Accepted: 09/06/2024] [Indexed: 09/10/2024]
Abstract
Retinoic acid receptor-related orphan receptor γ (RORγ) is a nuclear hormone receptor with multiple biological functions in circadian clock regulation, inflammation, and immunity. Its cyclic temporal role in circadian rhythms, and cell-specific activity in the immune system, make it an intriguing target for spatially and temporally localised pharmacology. To create tools that can study RORγ biology with appropriate spatiotemporal resolution, we designed light-dependent inverse RORγ agonists by building azobenzene photoswitches into ligand consensus structures. Optimizations gave photoswitchable RORγ inhibitors combining a large degree of potency photocontrol, with remarkable on-target potency, and excellent selectivity over related off-target receptors. This still rare combination of performance features distinguishes them as high quality photopharmaceutical probes, which can now serve as high precision tools to study the spatial and dynamic intricacies of RORγ action in signaling and in inflammatory disorders.
Affiliation(s)
- Martin Reynders
- Department of Pharmacy, Ludwig Maximilian University of Munich, Butenandtstr. 7, 81377, Munich, Germany
- Sabine Willems
- Department of Pharmacy, Ludwig Maximilian University of Munich, Butenandtstr. 7, 81377, Munich, Germany
- Julian A Marschner
- Department of Pharmacy, Ludwig Maximilian University of Munich, Butenandtstr. 7, 81377, Munich, Germany
- Thomas Wein
- Department of Pharmacy, Ludwig Maximilian University of Munich, Butenandtstr. 7, 81377, Munich, Germany
- Daniel Merk
- Department of Pharmacy, Ludwig Maximilian University of Munich, Butenandtstr. 7, 81377, Munich, Germany
- Oliver Thorn-Seshold
- Faculty of Chemistry and Food Chemistry, Technical University of Dresden, Bergstr. 66, 01069, Dresden, Germany
5
Özçelik R, de Ruiter S, Criscuolo E, Grisoni F. Chemical language modeling with structured state space sequence models. Nat Commun 2024; 15:6176. [PMID: 39039051 PMCID: PMC11263548 DOI: 10.1038/s41467-024-50469-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2023] [Accepted: 07/05/2024] [Indexed: 07/24/2024] Open
Abstract
Generative deep learning is reshaping drug design. Chemical language models (CLMs) - which generate molecules in the form of molecular strings - bear particular promise for this endeavor. Here, we introduce a recent deep learning architecture, termed Structured State Space Sequence (S4) model, into de novo drug design. In addition to its unprecedented performance in various fields, S4 has shown remarkable capabilities to learn the global properties of sequences. This aspect is intriguing in chemical language modeling, where complex molecular properties like bioactivity can 'emerge' from separated portions in the molecular string. This observation gives rise to the following question: Can S4 advance chemical language modeling for de novo design? To provide an answer, we systematically benchmark S4 with state-of-the-art CLMs on an array of drug discovery tasks, such as the identification of bioactive compounds, and the design of drug-like molecules and natural products. S4 shows a superior capacity to learn complex molecular properties, while at the same time exploring diverse scaffolds. Finally, when applied prospectively to kinase inhibition, S4 designs eight out of ten molecules that are predicted as highly active by molecular dynamics simulations. Taken together, these findings advocate for the introduction of S4 into chemical language modeling - uncovering its untapped potential in the molecular sciences.
Affiliation(s)
- Rıza Özçelik
- Institute for Complex Molecular Systems and Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Utrecht, The Netherlands
- Sarah de Ruiter
- Institute for Complex Molecular Systems and Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Emanuele Criscuolo
- Institute for Complex Molecular Systems and Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Francesca Grisoni
- Institute for Complex Molecular Systems and Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands.
- Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Utrecht, The Netherlands.
6
Salehi S, Schallmayer E, Bandomir N, Kärcher A, Güth JF, Heitel P. Screening of Chelidonium majus isoquinoline alkaloids reveals berberine and chelidonine as selective ligands for the nuclear receptors RORβ and HNF4α, respectively. Arch Pharm (Weinheim) 2024; 357:e2300756. [PMID: 38501877 DOI: 10.1002/ardp.202300756] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2023] [Revised: 02/27/2024] [Accepted: 02/28/2024] [Indexed: 03/20/2024]
Abstract
The nuclear receptors hepatocyte nuclear factor 4α (HNF4α) and retinoic acid receptor-related orphan receptor-β (RORβ) are ligand-regulated transcription factors and potential drug targets for metabolic disorders. However, there is a lack of small molecular, selective ligands to explore the therapeutic potential in further detail. Here, we report the discovery of greater celandine (Chelidonium majus) isoquinoline alkaloids as nuclear receptor modulators: Berberine is a selective RORβ inverse agonist and modulated target genes involved in the circadian clock, photoreceptor cell development, and neuronal function. The structurally related chelidonine was identified as a ligand for the constitutively active HNF4α receptor, with nanomolar potency in a cellular reporter gene assay. In human liver cancer cells naturally expressing high levels of HNF4α, chelidonine acted as an inverse agonist and downregulated genes associated with gluconeogenesis and drug metabolism. Both berberine and chelidonine are promising tool compounds to further investigate their target nuclear receptors and for drug discovery.
Affiliation(s)
- Sohrab Salehi
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, Frankfurt am Main, Germany
- Department of Prosthodontics, Center for Dentistry and Oral Medicine (Carolinum), Goethe University Frankfurt, Frankfurt am Main, Germany
- Espen Schallmayer
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, Frankfurt am Main, Germany
- Nils Bandomir
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, Frankfurt am Main, Germany
- Annette Kärcher
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, Frankfurt am Main, Germany
- Jan-Frederik Güth
- Department of Prosthodontics, Center for Dentistry and Oral Medicine (Carolinum), Goethe University Frankfurt, Frankfurt am Main, Germany
- Pascal Heitel
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, Frankfurt am Main, Germany
7
Guo J, Schwaller P. Augmented Memory: Sample-Efficient Generative Molecular Design with Reinforcement Learning. JACS AU 2024; 4:2160-2172. [PMID: 38938817 PMCID: PMC11200228 DOI: 10.1021/jacsau.4c00066] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 03/29/2024] [Accepted: 04/01/2024] [Indexed: 06/29/2024]
Abstract
Sample efficiency is a fundamental challenge in de novo molecular design. Ideally, molecular generative models should learn to satisfy a desired objective under minimal calls to oracles (computational property predictors). This problem becomes more apparent when using oracles that can provide increased predictive accuracy but impose significant computational cost. Consequently, designing molecules that are optimized for such oracles cannot be achieved under a practical computational budget. Molecular generative models based on simplified molecular-input line-entry system (SMILES) have shown remarkable sample efficiency when coupled with reinforcement learning, as demonstrated in the practical molecular optimization (PMO) benchmark. Here, we first show that experience replay drastically improves the performance of multiple previously proposed algorithms. Next, we propose a novel algorithm called Augmented Memory that combines data augmentation with experience replay. We show that scores obtained from oracle calls can be reused to update the model multiple times. We compare Augmented Memory to previously proposed algorithms and show significantly enhanced sample efficiency in an exploitation task, a drug discovery case study requiring both exploration and exploitation, and a materials design case study optimizing explicitly for quantum-mechanical properties. Our method achieves a new state-of-the-art in sample-efficient de novo molecular design, outperforming all of the previously reported methods. The code is available at https://github.com/schwallergroup/augmented_memory.
Affiliation(s)
- Jeff Guo
- Laboratory of Artificial Chemical Intelligence (LIAC), Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- National Centre of Competence in Research (NCCR) Catalysis, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- Philippe Schwaller
- Laboratory of Artificial Chemical Intelligence (LIAC), Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
- National Centre of Competence in Research (NCCR) Catalysis, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne 1015, Switzerland
8
Isigkeit L, Schallmayer E, Busch R, Brunello L, Menge A, Elson L, Müller S, Knapp S, Stolz A, Marschner JA, Merk D. Chemogenomics for NR1 nuclear hormone receptors. Nat Commun 2024; 15:5201. [PMID: 38890295 PMCID: PMC11189487 DOI: 10.1038/s41467-024-49493-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Accepted: 06/07/2024] [Indexed: 06/20/2024] Open
Abstract
Nuclear receptors (NRs) regulate transcription in response to ligand binding and NR modulation allows pharmacological control of gene expression. Although some NRs are relevant as drug targets, the NR1 family, which comprises 19 NRs binding to hormones, vitamins, and lipid metabolites, has only been partially explored from a translational perspective. To enable systematic target identification and validation for this protein family in phenotypic settings, we present an NR1 chemogenomic (CG) compound set optimized for complementary activity/selectivity profiles and chemical diversity. Based on broad profiling of candidates for specificity, toxicity, and off-target liabilities, sixty-nine comprehensively annotated NR1 agonists, antagonists and inverse agonists covering all members of the NR1 family and meeting potency and selectivity standards are included in the final NR1 CG set. Proof-of-concept application of this set reveals effects of NR1 members in autophagy, neuroinflammation and cancer cell death, and confirms the suitability of the set for target identification and validation.
Affiliation(s)
- Laura Isigkeit
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, Frankfurt, Germany
- Espen Schallmayer
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, Frankfurt, Germany
- Romy Busch
- Ludwig-Maximilians-Universität (LMU) München, Department of Pharmacy, Munich, Germany
- Lorene Brunello
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, Frankfurt, Germany
- Buchmann Institute for Molecular Life Sciences and Institute of Biochemistry 2, Goethe University Frankfurt, Frankfurt, Germany
- Amelie Menge
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, Frankfurt, Germany
- Buchmann Institute for Molecular Life Sciences and Institute of Biochemistry 2, Goethe University Frankfurt, Frankfurt, Germany
- Lewis Elson
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, Frankfurt, Germany
- Buchmann Institute for Molecular Life Sciences and Institute of Biochemistry 2, Goethe University Frankfurt, Frankfurt, Germany
- Susanne Müller
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, Frankfurt, Germany
- Buchmann Institute for Molecular Life Sciences and Institute of Biochemistry 2, Goethe University Frankfurt, Frankfurt, Germany
- Stefan Knapp
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, Frankfurt, Germany
- Buchmann Institute for Molecular Life Sciences and Institute of Biochemistry 2, Goethe University Frankfurt, Frankfurt, Germany
- Alexandra Stolz
- Buchmann Institute for Molecular Life Sciences and Institute of Biochemistry 2, Goethe University Frankfurt, Frankfurt, Germany
- Julian A Marschner
- Ludwig-Maximilians-Universität (LMU) München, Department of Pharmacy, Munich, Germany
- Daniel Merk
- Goethe University Frankfurt, Institute of Pharmaceutical Chemistry, Frankfurt, Germany.
- Ludwig-Maximilians-Universität (LMU) München, Department of Pharmacy, Munich, Germany.
9
Gangwal A, Ansari A, Ahmad I, Azad AK, Kumarasamy V, Subramaniyan V, Wong LS. Generative artificial intelligence in drug discovery: basic framework, recent advances, challenges, and opportunities. Front Pharmacol 2024; 15:1331062. [PMID: 38384298 PMCID: PMC10879372 DOI: 10.3389/fphar.2024.1331062] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Accepted: 01/17/2024] [Indexed: 02/23/2024] Open
Abstract
There are two main ways to discover or design small drug molecules. The first involves fine-tuning existing molecules or commercially successful drugs through quantitative structure-activity relationships and virtual screening. The second approach involves generating new molecules through de novo drug design or inverse quantitative structure-activity relationship. Both methods aim to get a drug molecule with the best pharmacokinetic and pharmacodynamic profiles. However, bringing a new drug to market is an expensive and time-consuming endeavor, with the average cost being estimated at around $2.5 billion. One of the biggest challenges is screening the vast number of potential drug candidates to find one that is both safe and effective. The development of artificial intelligence in recent years has been phenomenal, ushering in a revolution in many fields. The field of pharmaceutical sciences has also significantly benefited from multiple applications of artificial intelligence, especially drug discovery projects. Artificial intelligence models are finding use in molecular property prediction, molecule generation, virtual screening, synthesis planning, repurposing, among others. Lately, generative artificial intelligence has gained popularity across domains for its ability to generate entirely new data, such as images, sentences, audios, videos, novel chemical molecules, etc. Generative artificial intelligence has also delivered promising results in drug discovery and development. This review article delves into the fundamentals and framework of various generative artificial intelligence models in the context of drug discovery via de novo drug design approach. Various basic and advanced models have been discussed, along with their recent applications. 
The review also explores recent examples and advances in the generative artificial intelligence approach, as well as the challenges and ongoing efforts to fully harness the potential of generative artificial intelligence in generating novel drug molecules in a faster and more affordable manner. Some clinical-level assets generated from generative artificial intelligence have also been discussed in this review to show the ever-increasing application of artificial intelligence in drug discovery through commercial partnerships.
Affiliation(s)
- Amit Gangwal
- Department of Natural Product Chemistry, Shri Vile Parle Kelavani Mandal’s Institute of Pharmacy, Dhule, Maharashtra, India
- Azim Ansari
- Computer Aided Drug Design Center Shri Vile Parle Kelavani Mandal’s Institute of Pharmacy, Dhule, Maharashtra, India
- Iqrar Ahmad
- Department of Pharmaceutical Chemistry, Prof. Ravindra Nikam College of Pharmacy, Dhule, India
- Abul Kalam Azad
- Faculty of Pharmacy, University College of MAIWP International, Batu Caves, Malaysia
- Vinoth Kumarasamy
- Department of Parasitology and Medical Entomology, Faculty of Medicine, Universiti Kebangsaan Malaysia, Cheras, Malaysia
- Vetriselvan Subramaniyan
- Pharmacology Unit, Jeffrey Cheah School of Medicine and Health Sciences, Monash University Malaysia, Selangor, Malaysia
- School of Bioengineering and Biosciences, Lovely Professional University, Phagwara, Punjab, India
- Ling Shing Wong
- Faculty of Health and Life Sciences, INTI International University, Nilai, Malaysia
10
Tropsha A, Isayev O, Varnek A, Schneider G, Cherkasov A. Integrating QSAR modelling and deep learning in drug discovery: the emergence of deep QSAR. Nat Rev Drug Discov 2024; 23:141-155. [PMID: 38066301 DOI: 10.1038/s41573-023-00832-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 10/21/2023] [Indexed: 02/08/2024]
Abstract
Quantitative structure-activity relationship (QSAR) modelling, an approach that was introduced 60 years ago, is widely used in computer-aided drug design. In recent years, progress in artificial intelligence techniques, such as deep learning, the rapid growth of databases of molecules for virtual screening and dramatic improvements in computational power have supported the emergence of a new field of QSAR applications that we term 'deep QSAR'. Marking a decade from the pioneering applications of deep QSAR to tasks involved in small-molecule drug discovery, we herein describe key advances in the field, including deep generative and reinforcement learning approaches in molecular design, deep learning models for synthetic planning and the application of deep QSAR models in structure-based virtual screening. We also reflect on the emergence of quantum computing, which promises to further accelerate deep QSAR applications and the need for open-source and democratized resources to support computer-aided drug design.
Affiliation(s)
- Artem Cherkasov
- University of British Columbia, Vancouver, BC, Canada.
- Photonic Inc., Coquitlam, BC, Canada.
11
Adouvi G, Nawa F, Ballarotto M, Rüger LA, Knümann L, Kasch T, Arifi S, Schubert-Zsilavecz M, Willems S, Marschner JA, Pabel J, Merk D. Structural Fusion of Natural and Synthetic Ligand Features Boosts RXR Agonist Potency. J Med Chem 2023; 66:16762-16771. [PMID: 38064686 DOI: 10.1021/acs.jmedchem.3c01435] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 12/29/2023]
Abstract
The retinoid X receptors (RXRs) are ligand-activated transcription factors involved in, for example, differentiation and apoptosis regulation. Currently used reference RXR agonists suffer from insufficient specificity and poor physicochemical properties, and improved tools are needed to capture the unexplored therapeutic potential of RXR. Endogenous vitamin A-derived RXR ligands and the natural product RXR agonist valerenic acid comprise acrylic acid residues with varying substitution patterns to engage the critical ionic contact with the binding site arginine. To mimic and exploit this natural ligand motif, we probed its structural fusion with synthetic RXR modulator scaffolds, which had profound effects on agonist activity and remarkably boosted potency of an oxaprozin-derived RXR agonist chemotype. Bioisosteric replacement of the acrylic acid to overcome its pan-assay interference compounds (PAINS) character enabled the development of a highly optimized RXR agonist chemical probe.
Affiliation(s)
- Gustave Adouvi
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, 60438 Frankfurt, Germany
- Felix Nawa
- Department of Pharmacy, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
- Marco Ballarotto
- Department of Pharmacy, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
- Lorena Andrea Rüger
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, 60438 Frankfurt, Germany
- Loris Knümann
- Department of Pharmacy, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
- Till Kasch
- Department of Pharmacy, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
- Silvia Arifi
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, 60438 Frankfurt, Germany
- Sabine Willems
- Department of Pharmacy, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
- Julian A Marschner
- Department of Pharmacy, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
- Jörg Pabel
- Department of Pharmacy, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
- Daniel Merk
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, 60438 Frankfurt, Germany
- Department of Pharmacy, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
12
Ochiai T, Inukai T, Akiyama M, Furui K, Ohue M, Matsumori N, Inuki S, Uesugi M, Sunazuka T, Kikuchi K, Kakeya H, Sakakibara Y. Variational autoencoder-based chemical latent space for large molecular structures with 3D complexity. Commun Chem 2023; 6:249. [PMID: 37973971 PMCID: PMC10654724 DOI: 10.1038/s42004-023-01054-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Accepted: 11/06/2023] [Indexed: 11/19/2023] Open
Abstract
The structural diversity of chemical libraries, which are systematic collections of compounds that have potential to bind to biomolecules, can be represented by chemical latent space. A chemical latent space is a projection of a compound structure into a mathematical space based on several molecular features, and it can express structural diversity within a compound library in order to explore a broader chemical space and generate novel compound structures for drug candidates. In this study, we developed a deep-learning method, called NP-VAE (Natural Product-oriented Variational Autoencoder), based on variational autoencoder for managing hard-to-analyze datasets from DrugBank and large molecular structures such as natural compounds with chirality, an essential factor in the 3D complexity of compounds. NP-VAE was successful in constructing the chemical latent space from large-sized compounds that were unable to be handled in existing methods, achieving higher reconstruction accuracy, and demonstrating stable performance as a generative model across various indices. Furthermore, by exploring the acquired latent space, we succeeded in comprehensively analyzing a compound library containing natural compounds and generating novel compound structures with optimized functions.
Grants
- 22H04901 Ministry of Education, Culture, Sports, Science and Technology (MEXT)
- 17H06410 Ministry of Education, Culture, Sports, Science and Technology (MEXT)
- 23H04885 Ministry of Education, Culture, Sports, Science and Technology (MEXT)
- 23H04880 Ministry of Education, Culture, Sports, Science and Technology (MEXT)
- 23H04881 Ministry of Education, Culture, Sports, Science and Technology (MEXT)
- 23H04887 Ministry of Education, Culture, Sports, Science and Technology (MEXT)
Affiliation(s)
- Toshiki Ochiai
- Department of Biosciences and Informatics, Keio University, Yokohama, Kanagawa, 223-8522, Japan
| | - Tensei Inukai
- Department of Biosciences and Informatics, Keio University, Yokohama, Kanagawa, 223-8522, Japan
| | - Manato Akiyama
- Department of Biosciences and Informatics, Keio University, Yokohama, Kanagawa, 223-8522, Japan
| | - Kairi Furui
- Department of Computer Science, School of Computing, Tokyo Institute of Technology, Yokohama, Kanagawa, 226-8501, Japan
| | - Masahito Ohue
- Department of Computer Science, School of Computing, Tokyo Institute of Technology, Yokohama, Kanagawa, 226-8501, Japan
| | - Nobuaki Matsumori
- Department of Chemistry, Graduate School of Science, Kyushu University, Fukuoka, Fukuoka, 819-0395, Japan
| | - Shinsuke Inuki
- Division of Medicinal Frontier Sciences, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Kyoto, 606-8501, Japan
| | - Motonari Uesugi
- Institute for Chemical Research and WPI-iCeMS, Kyoto University, Uji, Kyoto, 611-0011, Japan
| | - Toshiaki Sunazuka
- Omura Satoshi Memorial Institute and Graduate School of Infection Control Sciences, Kitasato University, Minato-ku, Tokyo, 108-8641, Japan
| | - Kazuya Kikuchi
- Department of Applied Chemistry, Graduate School of Engineering, Osaka University, Suita, Osaka, 565-0871, Japan
- Immunology Frontier Research Centre, Osaka University, Suita, Osaka, 565-0871, Japan
| | - Hideaki Kakeya
- Division of Medicinal Frontier Sciences, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Kyoto, 606-8501, Japan
| | - Yasubumi Sakakibara
- Department of Biosciences and Informatics, Keio University, Yokohama, Kanagawa, 223-8522, Japan.
- Department of Data Science, Kitasato University School of Frontier Engineering, Sagamihara, Kanagawa, 252-0373, Japan.
| |
|
13
|
Mullowney MW, Duncan KR, Elsayed SS, Garg N, van der Hooft JJJ, Martin NI, Meijer D, Terlouw BR, Biermann F, Blin K, Durairaj J, Gorostiola González M, Helfrich EJN, Huber F, Leopold-Messer S, Rajan K, de Rond T, van Santen JA, Sorokina M, Balunas MJ, Beniddir MA, van Bergeijk DA, Carroll LM, Clark CM, Clevert DA, Dejong CA, Du C, Ferrinho S, Grisoni F, Hofstetter A, Jespers W, Kalinina OV, Kautsar SA, Kim H, Leao TF, Masschelein J, Rees ER, Reher R, Reker D, Schwaller P, Segler M, Skinnider MA, Walker AS, Willighagen EL, Zdrazil B, Ziemert N, Goss RJM, Guyomard P, Volkamer A, Gerwick WH, Kim HU, Müller R, van Wezel GP, van Westen GJP, Hirsch AKH, Linington RG, Robinson SL, Medema MH. Artificial intelligence for natural product drug discovery. Nat Rev Drug Discov 2023; 22:895-916. [PMID: 37697042 DOI: 10.1038/s41573-023-00774-7] [Citation(s) in RCA: 33] [Impact Index Per Article: 33.0] [Accepted: 07/20/2023] [Indexed: 09/13/2023]
Abstract
Developments in computational omics technologies have provided new means to access the hidden diversity of natural products, unearthing new potential for drug discovery. In parallel, artificial intelligence approaches such as machine learning have led to exciting developments in the computational drug design field, facilitating biological activity prediction and de novo drug design for molecular targets of interest. Here, we describe current and future synergies between these developments to effectively identify drug candidates from the plethora of molecules produced by nature. We also discuss how to address key challenges in realizing the potential of these synergies, such as the need for high-quality datasets to train deep learning algorithms and appropriate strategies for algorithm validation.
Affiliation(s)
| | - Katherine R Duncan
- Strathclyde Institute of Pharmacy and Biomedical Sciences, University of Strathclyde, Glasgow, UK
| | - Somayah S Elsayed
- Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
| | - Neha Garg
- School of Chemistry and Biochemistry, Center for Microbial Dynamics and Infection, Georgia Institute of Technology, Atlanta, GA, USA
| | - Justin J J van der Hooft
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Department of Biochemistry, University of Johannesburg, Johannesburg, South Africa
| | - Nathaniel I Martin
- Biological Chemistry Group, Institute of Biology, Leiden University, Leiden, The Netherlands
| | - David Meijer
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
| | - Barbara R Terlouw
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
| | - Friederike Biermann
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Institute of Molecular Bio Science, Goethe-University Frankfurt, Frankfurt am Main, Germany
- LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt am Main, Germany
| | - Kai Blin
- The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark
| | | | - Marina Gorostiola González
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
- ONCODE institute, Leiden, The Netherlands
| | - Eric J N Helfrich
- Institute of Molecular Bio Science, Goethe-University Frankfurt, Frankfurt am Main, Germany
- LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt am Main, Germany
| | - Florian Huber
- Center for Digitalization and Digitality, Hochschule Düsseldorf, Düsseldorf, Germany
| | - Stefan Leopold-Messer
- Institut für Mikrobiologie, Eidgenössische Technische Hochschule (ETH) Zürich, Zürich, Switzerland
| | - Kohulan Rajan
- Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Jena, Germany
| | - Tristan de Rond
- School of Chemical Sciences, University of Auckland, Auckland, New Zealand
| | - Jeffrey A van Santen
- Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada
| | - Maria Sorokina
- Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller University, Jena, Germany
- Pharmaceuticals R&D, Bayer AG, Berlin, Germany
| | - Marcy J Balunas
- Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI, USA
- Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
| | - Mehdi A Beniddir
- Équipe "Chimie des Substances Naturelles", Université Paris-Saclay, CNRS, BioCIS, Orsay, France
| | - Doris A van Bergeijk
- Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
| | - Laura M Carroll
- Structural and Computational Biology Unit, EMBL, Heidelberg, Germany
| | - Chase M Clark
- Division of Pharmaceutical Sciences, School of Pharmacy, University of Wisconsin-Madison, Madison, WI, USA
| | | | | | - Chao Du
- Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
| | | | - Francesca Grisoni
- Institute for Complex Molecular Systems, Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
- Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Utrecht, The Netherlands
| | | | - Willem Jespers
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
| | - Olga V Kalinina
- Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany
- Drug Bioinformatics, Medical Faculty, Saarland University, Homburg, Germany
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
| | | | - Hyunwoo Kim
- College of Pharmacy and Integrated Research Institute for Drug Development, Dongguk University Seoul, Goyang-si, Republic of Korea
| | - Tiago F Leao
- Center for Nuclear Energy in Agriculture, University of São Paulo, Piracicaba, Brazil
| | - Joleen Masschelein
- Center for Microbiology, VIB-KU Leuven, Heverlee, Belgium
- Department of Biology, KU Leuven, Heverlee, Belgium
| | - Evan R Rees
- Division of Pharmaceutical Sciences, School of Pharmacy, University of Wisconsin-Madison, Madison, WI, USA
| | - Raphael Reher
- Institute of Pharmaceutical Biology and Biotechnology, University of Marburg, Marburg, Germany
- Institute of Pharmacy, Martin-Luther-University Halle-Wittenberg, Halle (Saale), Germany
| | - Daniel Reker
- Department of Biomedical Engineering, Duke University, Durham, NC, USA
- Duke Microbiome Center, Duke University, Durham, NC, USA
| | - Philippe Schwaller
- Laboratory of Artificial Chemical Intelligence, Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
| | | | - Michael A Skinnider
- Adapsyn Bioscience, Hamilton, Ontario, Canada
- Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
| | - Allison S Walker
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
| | - Egon L Willighagen
- Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
| | - Barbara Zdrazil
- European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridgeshire, UK
| | - Nadine Ziemert
- Interfaculty Institute for Microbiology and Infection Medicine Tuebingen (IMIT), Institute for Bioinformatics and Medical Informatics (IBMI), University of Tuebingen, Tuebingen, Germany
| | | | - Pierre Guyomard
- Bonsai team, CRIStAL - Centre de Recherche en Informatique Signal et Automatique de Lille, Université de Lille, Villeneuve d'Ascq Cedex, France
| | - Andrea Volkamer
- Center for Bioinformatics, Saarland University, Saarbrücken, Germany
- In silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
| | - William H Gerwick
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA
| | - Hyun Uk Kim
- Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea
| | - Rolf Müller
- Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany
- Department of Pharmacy, Saarland University, Saarbrücken, Germany
- German Center for infection research (DZIF), Braunschweig, Germany
- Helmholtz International Lab for Anti-Infectives, Saarbrücken, Germany
| | - Gilles P van Wezel
- Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
- Netherlands Institute of Ecology, NIOO-KNAW, Wageningen, The Netherlands
| | - Gerard J P van Westen
- Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands.
| | - Anna K H Hirsch
- Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany.
- Department of Pharmacy, Saarland University, Saarbrücken, Germany.
- German Center for infection research (DZIF), Braunschweig, Germany.
- Helmholtz International Lab for Anti-Infectives, Saarbrücken, Germany.
| | - Roger G Linington
- Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada.
| | - Serina L Robinson
- Department of Environmental Microbiology, Eawag: Swiss Federal Institute for Aquatic Science and Technology, Dübendorf, Switzerland.
| | - Marnix H Medema
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands.
- Institute of Biology, Leiden University, Leiden, The Netherlands.
| |
|
14
|
Sai M, Vietor J, Kornmayer M, Egner M, López-García Ú, Höfner G, Pabel J, Marschner JA, Wein T, Merk D. Structure-Guided Design of Nurr1 Agonists Derived from the Natural Ligand Dihydroxyindole. J Med Chem 2023; 66:13556-13567. [PMID: 37751901 PMCID: PMC10578347 DOI: 10.1021/acs.jmedchem.3c00852] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/12/2023] [Indexed: 09/28/2023]
Abstract
The neuroprotective transcription factor Nurr1 was recently found to bind the dopamine metabolite 5,6-dihydroxyindole (DHI) providing access to Nurr1 ligand design from a natural template. We screened a custom set of 14 k extended DHI analogues in silico for optimized descendants to select 24 candidates for microscale synthesis and in vitro testing. Three out of six primary hits were validated as novel Nurr1 agonists with up to sub-micromolar binding affinity, highlighting the druggability of the Nurr1 surface region lining helix 12. In vitro profiling confirmed cellular target engagement of DHI descendants and demonstrated remarkable additive effects of combined Nurr1 agonist treatment, indicating diverse binding sites mediating Nurr1 activation, which may open new avenues in Nurr1 modulation.
Affiliation(s)
| | | | - Moritz Kornmayer
- Department of Pharmacy, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
| | - Markus Egner
- Department of Pharmacy, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
| | - Úrsula López-García
- Department of Pharmacy, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
| | - Georg Höfner
- Department of Pharmacy, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
| | - Jörg Pabel
- Department of Pharmacy, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
| | - Julian A. Marschner
- Department of Pharmacy, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
| | - Thomas Wein
- Department of Pharmacy, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
| | - Daniel Merk
- Department of Pharmacy, Ludwig-Maximilians-Universität München, 81377 Munich, Germany
| |
|
15
|
Lamanna G, Delre P, Marcou G, Saviano M, Varnek A, Horvath D, Mangiatordi GF. GENERA: A Combined Genetic/Deep-Learning Algorithm for Multiobjective Target-Oriented De Novo Design. J Chem Inf Model 2023; 63:5107-5119. [PMID: 37556857 PMCID: PMC10466378 DOI: 10.1021/acs.jcim.3c00963] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/26/2023] [Indexed: 08/11/2023]
Abstract
This study introduces a new de novo design algorithm called GENERA that combines the capabilities of a deep-learning algorithm for automated drug-like analogue design, called DeLA-Drug, with a genetic algorithm for generating molecules with desired target-oriented properties. Specifically, GENERA was applied to the angiotensin-converting enzyme 2 (ACE2) target, which is implicated in many pathological conditions, including COVID-19. The ability of GENERA to design promising candidates de novo for a specific target was assessed using two docking programs, PLANTS and GLIDE. A fitness function based on the Pareto dominance resulting from computed PLANTS and GLIDE scores was applied to demonstrate the algorithm's ability to perform multiobjective optimizations effectively. GENERA can quickly generate focused libraries that produce better scores compared to a starting set of known ACE2 binders. This study is the first to utilize a DL-based algorithm designed for analogue generation as a mutational operator within a GA framework, representing an innovative approach to target-oriented de novo design.
Affiliation(s)
- Giuseppe Lamanna
- Chemistry Department, University of Bari “Aldo Moro”, Via E. Orabona, 4, I-70125 Bari, Italy
- CNR − Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
| | - Pietro Delre
- CNR − Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
| | - Gilles Marcou
- Laboratoire de Chémoinformatique UMR7140, 4 rue Blaise Pascal, 67000 Strasbourg, France
| | - Michele Saviano
- CNR − Institute of Crystallography, Via Vivaldi 43, 81100 Caserta, Italy
| | - Alexandre Varnek
- Laboratoire de Chémoinformatique UMR7140, 4 rue Blaise Pascal, 67000 Strasbourg, France
| | - Dragos Horvath
- Laboratoire de Chémoinformatique UMR7140, 4 rue Blaise Pascal, 67000 Strasbourg, France
| | | |
|
16
|
Zhang W, Zhang K, Huang J. A Simple Way to Incorporate Target Structural Information in Molecular Generative Models. J Chem Inf Model 2023. [PMID: 37318828 DOI: 10.1021/acs.jcim.3c00293] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 06/17/2023]
Abstract
Deep learning generative models are now being applied in various fields including drug discovery. In this work, we propose a novel approach to include target 3D structural information in molecular generative models for structure-based drug design. The method combines a message-passing neural network model that predicts docking scores with a generative neural network model as its reward function to navigate the chemical space searching for molecules that bind favorably with a specific target. A key feature of the method is the construction of target-specific molecular sets for training, designed to overcome potential transferability issues of surrogate docking models through a two-round training process. Consequently, this enables accurate guided exploration of the chemical space without reliance on the collection of prior knowledge about active and inactive compounds for the specific target. Tests on eight target proteins showed a 100-fold increase in hit generation compared to conventional docking calculations and the ability to generate molecules similar to approved drugs or known active ligands for specific targets without prior knowledge. This method provides a general and highly efficient solution for structure-based molecular generation.
Affiliation(s)
- Wenyi Zhang
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China
- Institute of Biology, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China
| | - Kaiyue Zhang
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China
| | - Jing Huang
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China
- Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China
- Institute of Biology, Westlake Institute for Advanced Study, 18 Shilongshan Road, Hangzhou, Zhejiang 310024, China
| |
|
17
|
Ballarotto M, Willems S, Stiller T, Nawa F, Marschner JA, Grisoni F, Merk D. De Novo Design of Nurr1 Agonists via Fragment-Augmented Generative Deep Learning in Low-Data Regime. J Med Chem 2023. [PMID: 37256819 DOI: 10.1021/acs.jmedchem.3c00485] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Indexed: 06/02/2023]
Abstract
Generative neural networks trained on SMILES can design innovative bioactive molecules de novo. These so-called chemical language models (CLMs) have typically been trained on tens of template molecules for fine-tuning. However, it is challenging to apply CLMs to orphan targets with few known ligands. We have fine-tuned a CLM with a single potent Nurr1 agonist as template in a fragment-augmented fashion and obtained novel Nurr1 agonists using sampling frequency for design prioritization. Nanomolar potency and binding affinity of the top-ranking design and its structural novelty compared to available Nurr1 ligands highlight its value as an early chemical tool and as a lead for Nurr1 agonist development, as well as the applicability of CLMs in very low-data scenarios.
Affiliation(s)
- Marco Ballarotto
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
- Department of Pharmaceutical Sciences, Università degli Studi di Perugia, 06123 Perugia, Italy
| | - Sabine Willems
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
| | - Tanja Stiller
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
| | - Felix Nawa
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
| | - Julian A Marschner
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
| | - Francesca Grisoni
- Institute for Complex Molecular Systems, Department of Biomedical Engineering, Eindhoven University of Technology, 5612AZ Eindhoven, The Netherlands
- Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, 3584CB Utrecht, The Netherlands
| | - Daniel Merk
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
| |
|
18
|
Vietor J, Gege C, Stiller T, Busch R, Schallmayer E, Kohlhof H, Höfner G, Pabel J, Marschner JA, Merk D. Development of a Potent Nurr1 Agonist Tool for In Vivo Applications. J Med Chem 2023; 66:6391-6402. [PMID: 37127285 PMCID: PMC10184128 DOI: 10.1021/acs.jmedchem.3c00415] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 05/03/2023]
Abstract
Nuclear receptor related 1 (Nurr1) is a neuroprotective transcription factor and an emerging target in neurodegenerative diseases. Despite strong evidence for a role in Parkinson's and Alzheimer's disease, pharmacological control and validation of Nurr1 are hindered by a lack of suitable ligands. We have discovered considerable Nurr1 activation by the clinically studied dihydroorotate dehydrogenase (DHODH) inhibitor vidofludimus calcium and systematically optimized this scaffold to a Nurr1 agonist with nanomolar potency, strong activation efficacy, and pronounced preference over the highly related receptors Nur77 and NOR1. The optimized compound induced Nurr1-regulated gene expression in astrocytes and exhibited favorable pharmacokinetics in rats, thus emerging as a superior chemical tool to study Nurr1 activation in vitro and in vivo.
Affiliation(s)
- Jan Vietor
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
| | | | - Tanja Stiller
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
| | - Romy Busch
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
| | - Espen Schallmayer
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, 60438 Frankfurt, Germany
| | | | - Georg Höfner
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
| | - Jörg Pabel
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
| | - Julian A Marschner
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
| | - Daniel Merk
- Department of Pharmacy, Ludwig-Maximilians-Universität (LMU) München, 81377 Munich, Germany
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, 60438 Frankfurt, Germany
| |
|
19
|
Grisoni F. Chemical language models for de novo drug design: Challenges and opportunities. Curr Opin Struct Biol 2023; 79:102527. [PMID: 36738564 DOI: 10.1016/j.sbi.2023.102527] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 10/21/2022] [Revised: 12/07/2022] [Accepted: 12/20/2022] [Indexed: 02/05/2023]
Abstract
Generative deep learning is accelerating de novo drug design, by allowing the generation of molecules with desired properties on demand. Chemical language models - which generate new molecules in the form of strings using deep learning - have been particularly successful in this endeavour. Thanks to advances in natural language processing methods and interdisciplinary collaborations, chemical language models are expected to become increasingly relevant in drug discovery. This minireview provides an overview of the current state-of-the-art of chemical language models for de novo design, and analyses current limitations, challenges, and advantages. Finally, a perspective on future opportunities is provided.
Affiliation(s)
- Francesca Grisoni
- Eindhoven University of Technology, Institute for Complex Molecular Systems and Dept. Biomedical Engineering, Eindhoven, Netherlands; Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Netherlands.
| |
|
20
|
Zhang H, Saravanan KM, Wei Y, Jiao Y, Yang Y, Pan Y, Wu X, Zhang JZH. Deep Learning-Based Bioactive Therapeutic Peptide Generation and Screening. J Chem Inf Model 2023; 63:835-845. [PMID: 36724090 DOI: 10.1021/acs.jcim.2c01485] [Citation(s) in RCA: 11] [Impact Index Per Article: 11.0] [Indexed: 02/02/2023]
Abstract
Many bioactive peptides have demonstrated therapeutic effects against complex diseases, including antiviral, antibacterial, and anticancer activity. It is possible to generate a large number of potentially bioactive peptides using deep learning, in a manner analogous to the de novo generation of chemical compounds, by using known bioactive peptides as a training set. Such generative techniques would be significant for drug development since peptides are much easier and cheaper to synthesize than compounds. Despite the limited availability of deep learning-based peptide-generating models, we have built an LSTM model (called LSTM_Pep) to generate de novo peptides and fine-tuned the model to generate de novo peptides with specific prospective therapeutic benefits. Remarkably, the Antimicrobial Peptide Database has been effectively utilized to generate various kinds of potentially active de novo peptides. We proposed a pipeline for screening the generated peptides against a given target and used the main protease of SARS-CoV-2 as a proof of concept. Moreover, we have developed a deep learning-based protein-peptide prediction model (DeepPep) for rapid screening of the generated peptides against the given targets. Together with the generative model, we demonstrate that an iterative cycle of fine-tuning, generating, and screening can yield peptides with higher predicted binding affinity. Our work sheds light on developing deep learning-based methods and pipelines to effectively generate and obtain bioactive peptides with a specific therapeutic effect and showcases how artificial intelligence can help discover de novo bioactive peptides that can bind to a particular target.
Affiliation(s)
- Haiping Zhang
- Shenzhen Institute of Synthetic Biology, Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
| | - Konda Mani Saravanan
- Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai 600073, Tamil Nadu, India
| | - Yanjie Wei
- Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
| | - Yang Jiao
- Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Yang Yang
- Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for infectious disease, State Key Discipline of Infectious Disease, Shenzhen Third People's Hospital, Second Hospital Affiliated to Southern University of Science and Technology, Shenzhen 518112, China
| | - Yi Pan
- Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
- Faculty of Computer Science and Control Engineering, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Xuli Wu
- School of Medicine, Shenzhen University, Shenzhen 518060, Guangdong, China
| | - John Z H Zhang
- Shenzhen Institute of Synthetic Biology, Faculty of Synthetic Biology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, Guangdong, China
- East China Normal University, Shanghai 200062, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
| |
|
21
|
Zhang J, Chen B, Zhang C, Sun N, Huang X, Wang W, Fu W. Modes of action insights from the crystallographic structures of retinoic acid receptor-related orphan receptor-γt (RORγt). Eur J Med Chem 2023; 247:115039. [PMID: 36566711 DOI: 10.1016/j.ejmech.2022.115039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2022] [Revised: 11/29/2022] [Accepted: 12/19/2022] [Indexed: 12/24/2022]
Abstract
RORγt plays an important role in mediating IL-17 production and in some tumor cells. It has four functional domains, of which the ligand-binding domain (LBD) is responsible for binding agonists to recruit co-activators or inverse agonists to prevent co-activator recruitment. Thus, potent ligands targeting the LBD of this protein could provide novel treatments for cancer and autoimmune diseases. In this perspective, we summarize and discuss various modes of action (MOA) revealed by RORγt-ligand binding structures. Ligands can bind RORγt at either the orthosteric site or the allosteric site, and the binding modes at these two sites differ for agonists and inverse agonists. At the orthosteric site, agonist binding stabilizes the H479-Y502-F506 triplet interaction network of RORγt. Inverse agonist binding acts in four apparent ways: (1) blocking the entrance of the agonist pocket in RORγt; (2) directly breaking the H479-Y502 pair interaction; (3) destabilizing the H479-Y502-F506 triplet interaction network by perturbing the side-chain conformation of M358 at the bottom of the binding pocket; and (4) destabilizing the H479-Y502-F506 triplet by changing the side-chain conformation of residue W317. At the allosteric site of RORγt, inverse agonist binding was recently found to inhibit activation of the protein by interacting directly with H12, which results in unfolding of helix 11' and orientation of H12 to directly block cofactor peptide binding. This overview of recent advances in RORγt structures is expected to provide guidance for designing more potent drugs to treat RORγt-related diseases.
Affiliation(s)
- Junjie Zhang
- School of Pharmacy & Minhang Hospital, Fudan University, Shanghai, 201301, PR China
| | - Baiyu Chen
- School of Pharmacy & Minhang Hospital, Fudan University, Shanghai, 201301, PR China
| | - Chao Zhang
- School of Pharmacy & Minhang Hospital, Fudan University, Shanghai, 201301, PR China
| | - Nannan Sun
- School of Pharmacy & Minhang Hospital, Fudan University, Shanghai, 201301, PR China
| | - Xiaoqin Huang
- Center for Research Computing, Office of Information Technology, Center for Theoretical Biological Physics, Rice University, Houston, TX, 77030, USA
| | - Wuqing Wang
- School of Pharmacy & Minhang Hospital, Fudan University, Shanghai, 201301, PR China
| | - Wei Fu
- School of Pharmacy & Minhang Hospital, Fudan University, Shanghai, 201301, PR China.
| |
|
22
|
Moret M, Pachon Angona I, Cotos L, Yan S, Atz K, Brunner C, Baumgartner M, Grisoni F, Schneider G. Leveraging molecular structure and bioactivity with chemical language models for de novo drug design. Nat Commun 2023; 14:114. [PMID: 36611029 PMCID: PMC9825622 DOI: 10.1038/s41467-022-35692-6] [Citation(s) in RCA: 35] [Impact Index Per Article: 35.0] [Received: 10/26/2021] [Accepted: 12/19/2022] [Indexed: 01/09/2023] Open
Abstract
Generative chemical language models (CLMs) can be used for de novo molecular structure generation by learning from a textual representation of molecules. Here, we show that hybrid CLMs can additionally leverage the bioactivity information available for the training compounds. To computationally design ligands of phosphoinositide 3-kinase gamma (PI3Kγ), a collection of virtual molecules was created with a generative CLM. This virtual compound library was refined using a CLM-based classifier for bioactivity prediction. This second hybrid CLM was pretrained with patented molecular structures and fine-tuned with known PI3Kγ ligands. Several of the computer-generated molecular designs were commercially available, enabling fast prescreening and preliminary experimental validation. A new PI3Kγ ligand with sub-micromolar activity was identified, highlighting the method's scaffold-hopping potential. Chemical synthesis and biochemical testing of two of the top-ranked de novo designed molecules and their derivatives corroborated the model's ability to generate PI3Kγ ligands with medium to low nanomolar activity for hit-to-lead expansion. The most potent compounds led to pronounced inhibition of PI3K-dependent Akt phosphorylation in a medulloblastoma cell model, demonstrating efficacy of PI3Kγ ligands in PI3K/Akt pathway repression in human tumor cells. The results positively advocate hybrid CLMs for virtual compound screening and activity-focused molecular design.
Affiliation(s)
- Michael Moret
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Irene Pachon Angona
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Leandro Cotos
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Shen Yan
- University of Zurich, University Children's Hospital, Children's Research Center, Pediatric Molecular Neuro-Oncology Research, Lengghalde 5, 8008, Zurich, Switzerland
| | - Kenneth Atz
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Cyrill Brunner
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland
| | - Martin Baumgartner
- University of Zurich, University Children's Hospital, Children's Research Center, Pediatric Molecular Neuro-Oncology Research, Lengghalde 5, 8008, Zurich, Switzerland
| | - Francesca Grisoni
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland.
- Eindhoven University of Technology, Institute for Complex Molecular Systems and Eindhoven Artificial Intelligence Systems Institute, Department of Biomedical Engineering, Groene Loper 7, 5612AZ, Eindhoven, The Netherlands.
- Center for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Utrecht, 3584 CB, The Netherlands.
| | - Gisbert Schneider
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, 8093, Zurich, Switzerland.
- ETH Singapore SEC Ltd, 1 CREATE Way, #06-01 CREATE Tower, Singapore, 138602, Singapore.
| |
|
23
|
Noguchi S, Inoue J. Exploration of Chemical Space Guided by PixelCNN for Fragment-Based De Novo Drug Discovery. J Chem Inf Model 2022; 62:5988-6001. [PMID: 36454646 DOI: 10.1021/acs.jcim.2c01345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2022]
Abstract
We report a novel framework for achieving fragment-based molecular design using pixel convolutional neural network (PixelCNN) combined with the simplified molecular input line entry system (SMILES) as molecular representation. While a widely used recurrent neural network (RNN) assumes monotonically decaying correlations in strings, PixelCNN captures a periodicity among characters of SMILES. Thus, PixelCNN provides us with a novel solution for the analysis of chemical space by extracting the periodicity of molecular structures that will be buried in SMILES. Moreover, this characteristic enables us to generate molecules by combining several simple building blocks, such as a benzene ring and side-chain structures, which contributes to the effective exploration of chemical space by step-by-step searching for molecules from a target fragment. In conclusion, PixelCNN could be a powerful approach focusing on the periodicity of molecules to explore chemical space for the fragment-based molecular design.
Affiliation(s)
- Satoshi Noguchi
- Department of Advanced Interdisciplinary Studies, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904, Japan
| | - Junya Inoue
- Institute for Industrial Science, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-0082, Japan; Department of Materials Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo, Tokyo 113-8656, Japan; Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904, Japan
| |
|
24
|
Thomas M, O’Boyle NM, Bender A, de Graaf C. Augmented Hill-Climb increases reinforcement learning efficiency for language-based de novo molecule generation. J Cheminform 2022; 14:68. [PMID: 36192789 PMCID: PMC9531503 DOI: 10.1186/s13321-022-00646-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/14/2022] [Accepted: 09/23/2022] [Indexed: 11/10/2022] Open
Abstract
A plethora of AI-based techniques now exists to conduct de novo molecule generation that can devise molecules conditioned towards a particular endpoint in the context of drug design. One popular approach is using reinforcement learning to update a recurrent neural network or language-based de novo molecule generator. However, reinforcement learning can be inefficient, sometimes requiring up to 10^5 molecules to be sampled to optimize more complex objectives, which poses a limitation when using computationally expensive scoring functions like docking or computer-aided synthesis planning models. In this work, we propose a reinforcement learning strategy called Augmented Hill-Climb based on a simple, hypothesis-driven hybrid between REINVENT and Hill-Climb that improves sample-efficiency by addressing the limitations of both currently used strategies. We compare its ability to optimize several docking tasks with REINVENT and benchmark this strategy against other commonly used reinforcement learning strategies including REINFORCE, REINVENT (version 1 and 2), Hill-Climb and best agent reminder. We find that optimization ability is improved ~1.5-fold and sample-efficiency is improved ~45-fold compared to REINVENT while still delivering appealing chemistry as output. Diversity filters were used, and their parameters were tuned to overcome observed failure modes that take advantage of certain diversity filter configurations. We find that Augmented Hill-Climb outperforms the other reinforcement learning strategies used on six tasks, especially in the early stages of training or for more difficult objectives. Lastly, we show improved performance not only on recurrent neural networks but also on a reinforcement learning stabilized transformer architecture. Overall, we show that Augmented Hill-Climb improves sample-efficiency for language-based de novo molecule generation conditioning via reinforcement learning, compared to the current state-of-the-art. This makes more computationally expensive scoring functions, such as docking, more accessible on a relevant timescale.
Affiliation(s)
- Morgan Thomas
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW UK
| | - Noel M. O’Boyle
- Computational Chemistry, Sosei Heptares, Steinmetz Building, Granta Park, Great Abington, Cambridge, CB21 6DG UK
| | - Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW UK
| | - Chris de Graaf
- Computational Chemistry, Sosei Heptares, Steinmetz Building, Granta Park, Great Abington, Cambridge, CB21 6DG UK
| |
|
25
|
Veríssimo GC, Serafim MSM, Kronenberger T, Ferreira RS, Honorio KM, Maltarollo VG. Designing drugs when there is low data availability: one-shot learning and other approaches to face the issues of a long-term concern. Expert Opin Drug Discov 2022; 17:929-947. [PMID: 35983695 DOI: 10.1080/17460441.2022.2114451] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/04/2022]
Abstract
INTRODUCTION: Modern drug discovery generally is accessed by useful information from previous large databases or uncovering novel data. The lack of biological and/or chemical data tends to slow the development of scientific research and innovation. Here, approaches that may help provide solutions to generate or obtain enough relevant data or improve/accelerate existing methods within the last five years were reviewed. AREAS COVERED: One-shot learning (OSL) approaches, structural modeling, molecular docking, scoring function space (SFS), molecular dynamics (MD), and quantum mechanics (QM) may be used to amplify the amount of available data to drug design and discovery campaigns, presenting methods, their perspectives, and discussions to be employed in the near future. EXPERT OPINION: Recent works have successfully used these techniques to solve a range of issues in the face of data scarcity, including complex problems such as the challenging scenario of drug design aimed at intrinsically disordered proteins and the evaluation of potential adverse effects in a clinical scenario. These examples show that it is possible to improve and kickstart research from scarce available data to design and discover new potential drugs.
Affiliation(s)
- Gabriel C Veríssimo
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
| | - Mateus Sá M Serafim
- Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
| | - Thales Kronenberger
- Department of Medical Oncology and Pneumology, Internal Medicine VIII, University Hospital of Tübingen, Tübingen, Germany; School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland, Kuopio, Finland
| | - Rafaela S Ferreira
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
| | - Kathia M Honorio
- Escola de Artes, Ciências e Humanidades, Universidade de São Paulo (USP), São Paulo, Brazil; Centro de Ciências Naturais e Humanas, Universidade Federal do ABC (UFABC), Santo André, Brazil
| | - Vinícius G Maltarollo
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
| |
|
26
|
S. V. SS, Law JN, Tripp CE, Duplyakin D, Skordilis E, Biagioni D, Paton RS, St. John PC. Multi-objective goal-directed optimization of de novo stable organic radicals for aqueous redox flow batteries. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00506-3] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/10/2023]
Abstract
Advances in the field of goal-directed molecular optimization offer the promise of finding feasible candidates for even the most challenging molecular design applications. One example of a fundamental design challenge is the search for novel stable radical scaffolds for an aqueous redox flow battery that simultaneously satisfy redox requirements at the anode and cathode, as relatively few stable organic radicals are known to exist. To meet this challenge, we develop a new open-source molecular optimization framework based on AlphaZero coupled with a fast, machine-learning-derived surrogate objective trained with nearly 100,000 quantum chemistry simulations. The objective function comprises two graph neural networks: one that predicts adiabatic oxidation and reduction potentials and a second that predicts electron density and local three-dimensional environment, previously shown to be correlated with radical persistence and stability. With no hard-coded knowledge of organic chemistry, the reinforcement learning agent finds molecule candidates that satisfy a precise combination of redox, stability and synthesizability requirements defined at the quantum chemistry level, many of which have reasonable predicted retrosynthetic pathways. The optimized molecules show that alternative stable radical scaffolds may offer a unique profile of stability and redox potentials to enable low-cost symmetric aqueous redox flow batteries.
|
27
|
Zhang H, Gong X, Peng Y, Saravanan KM, Bian H, Zhang JZH, Wei Y, Pan Y, Yang Y. An Efficient Modern Strategy to Screen Drug Candidates Targeting RdRp of SARS-CoV-2 With Potentially High Selectivity and Specificity. Front Chem 2022; 10:933102. [PMID: 35903186 PMCID: PMC9315156 DOI: 10.3389/fchem.2022.933102] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2022] [Accepted: 06/06/2022] [Indexed: 01/18/2023] Open
Abstract
Desired drug candidates should have both a high potential binding chance and high specificity. Recently, many drug screening strategies have been developed to screen compounds with high possible binding chances or high binding affinity. However, there is still no good solution to detect whether those selected compounds possess high specificity. Here, we developed a reverse DFCNN (Dense Fully Connected Neural Network) and a reverse docking protocol to check a given compound’s ability to bind diversified targets and estimate its specificity with homemade formulas. We used the RNA-dependent RNA polymerase (RdRp) target as a proof-of-concept example to identify drug candidates with high selectivity and high specificity. We first used a previously developed hybrid screening method to find drug candidates from an 8888-size compound database. The hybrid screening method takes advantage of the deep learning-based method, traditional molecular docking, molecular dynamics simulation, and binding free energy calculated by metadynamics, which should be powerful in selecting high binding affinity candidates. Also, we integrated the reverse DFCNN and reversed docking against a diversified 102 proteins to the pipeline for assessing the specificity of those selected candidates, and finally got compounds that have both predicted selectivity and specificity. Among the eight selected candidates, Platycodin D and Tubeimoside III were confirmed to effectively inhibit SARS-CoV-2 replication in vitro with EC50 values of 619.5 and 265.5 nM, respectively. Our study discovered that Tubeimoside III could inhibit SARS-CoV-2 replication potently for the first time. Furthermore, the underlying mechanisms of Platycodin D and Tubeimoside III inhibiting SARS-CoV-2 are highly possible by blocking the RdRp cavity according to our screening procedure. 
In addition, the careful analysis predicted common critical residues involved in the binding with active inhibitors Platycodin D and Tubeimoside III, Azithromycin, and Pralatrexate, which hopefully promote the development of non-covalent binding inhibitors against RdRp.
Affiliation(s)
- Haiping Zhang
- Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
- *Correspondence: Yang Yang; Haiping Zhang
| | - Xiaohua Gong
- Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for Infectious Disease, State Key Discipline of Infectious Disease, Shenzhen Third People’s Hospital, Second Hospital Affiliated to Southern University of Science and Technology, Shenzhen, China
| | - Yun Peng
- Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for Infectious Disease, State Key Discipline of Infectious Disease, Shenzhen Third People’s Hospital, Second Hospital Affiliated to Southern University of Science and Technology, Shenzhen, China
| | - Konda Mani Saravanan
- Department of Biotechnology, Bharath Institute of Higher Education and Research, Chennai, India
| | - Hengwei Bian
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Key Laboratory of Green Chemistry and Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, China
| | - John Z. H. Zhang
- Shenzhen Institute of Synthetic Biology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| | - Yanjie Wei
- Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| | - Yi Pan
- Center for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China
| | - Yang Yang
- Shenzhen Key Laboratory of Pathogen and Immunity, National Clinical Research Center for Infectious Disease, State Key Discipline of Infectious Disease, Shenzhen Third People’s Hospital, Second Hospital Affiliated to Southern University of Science and Technology, Shenzhen, China
- *Correspondence: Yang Yang; Haiping Zhang
| |
|
28
|
Bender A, Schneider N, Segler M, Patrick Walters W, Engkvist O, Rodrigues T. Evaluation guidelines for machine learning tools in the chemical sciences. Nat Rev Chem 2022; 6:428-442. [PMID: 37117429 DOI: 10.1038/s41570-022-00391-9] [Citation(s) in RCA: 29] [Impact Index Per Article: 14.5] [Accepted: 04/13/2022] [Indexed: 02/07/2023]
Abstract
Machine learning (ML) promises to tackle the grand challenges in chemistry and speed up the generation, improvement and/or ordering of research hypotheses. Despite the overarching applicability of ML workflows, one usually finds diverse evaluation study designs. The current heterogeneity in evaluation techniques and metrics leads to difficulty in (or the impossibility of) comparing and assessing the relevance of new algorithms. Ultimately, this may delay the digitalization of chemistry at scale and confuse method developers, experimentalists, reviewers and journal editors. In this Perspective, we critically discuss a set of method development and evaluation guidelines for different types of ML-based publications, emphasizing supervised learning. We provide a diverse collection of examples from various authors and disciplines in chemistry. While taking into account varying accessibility across research groups, our recommendations focus on reporting completeness and standardizing comparisons between tools. We aim to further contribute to improved ML transparency and credibility by suggesting a checklist of retro-/prospective tests and dissecting their importance. We envisage that the wide adoption and continuous update of best practices will encourage an informed use of ML on real-world problems related to the chemical sciences.
|
29
|
Bung N, Krishnan SR, Roy A. An In Silico Explainable Multiparameter Optimization Approach for De Novo Drug Design against Proteins from the Central Nervous System. J Chem Inf Model 2022; 62:2685-2695. [PMID: 35581002 DOI: 10.1021/acs.jcim.2c00462] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 02/07/2023]
Abstract
The aim of drug design and development is to produce a drug that can inhibit the target protein and possess a balanced physicochemical and toxicity profile. Traditionally, this is a multistep process where different parameters such as activity and physicochemical and pharmacokinetic properties are optimized sequentially, which often leads to high attrition rate during later stages of drug design and development. We have developed a deep learning-based de novo drug design method that can design novel small molecules by optimizing target specificity as well as multiple parameters (including late-stage parameters) in a single step. All possible combinations of parameters were optimized to understand the effect of each parameter over the other parameters. An explainable predictive model was used to identify the molecular fragments responsible for the property being optimized. The proposed method was applied against the human 5-hydroxy tryptamine receptor 1B (5-HT1B), a protein from the central nervous system (CNS). Various physicochemical properties specific to CNS drugs were considered along with the target specificity and blood-brain barrier permeability (BBBP), which act as an additional challenge for CNS drug delivery. The contribution of each parameter toward molecule design was identified by analyzing the properties of generated small molecules from optimization of all possible parameter combinations. The final optimized generative model was able to design similar inhibitors compared to known inhibitors of 5-HT1B. In addition, the functional groups of the generated small molecules that guide the BBBP predictive model were identified through feature attribution techniques.
Affiliation(s)
- Navneet Bung
- TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad 500081, India
| | | | - Arijit Roy
- TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad 500081, India
| |
|
30
|
Xie W, Wang F, Li Y, Lai L, Pei J. Advances and Challenges in De Novo Drug Design Using Three-Dimensional Deep Generative Models. J Chem Inf Model 2022; 62:2269-2279. [PMID: 35544331 DOI: 10.1021/acs.jcim.2c00042] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 11/30/2022]
Abstract
A persistent goal for de novo drug design is to generate novel chemical compounds with desirable properties in a labor-, time-, and cost-efficient manner. Deep generative models provide alternative routes to this goal. Numerous model architectures and optimization strategies have been explored in recent years, most of which have been developed to generate two-dimensional molecular structures. Some generative models aiming at three-dimensional (3D) molecule generation have also been proposed, gaining attention for their unique advantages and potential to directly design drug-like molecules in a target-conditioning manner. This review highlights current developments in 3D molecular generative models combined with deep learning and discusses future directions for de novo drug design.
Affiliation(s)
- Weixin Xie
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
| | - Fanhao Wang
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
| | - Yibo Li
- Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
| | - Luhua Lai
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China; Peking-Tsinghua Center for Life Science at BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
| | - Jianfeng Pei
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
| |
|
31
|
Cyclobutane-containing scaffolds in bioactive small molecules. Trends Chem 2022. [DOI: 10.1016/j.trechm.2022.04.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
|
32
|
A Consensus Compound/Bioactivity Dataset for Data-Driven Drug Design and Chemogenomics. Molecules 2022; 27:2513. [PMID: 35458710 PMCID: PMC9028877 DOI: 10.3390/molecules27082513] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/02/2022] [Revised: 03/31/2022] [Accepted: 04/10/2022] [Indexed: 02/01/2023] Open
Abstract
Publicly available compound and bioactivity databases provide an essential basis for data-driven applications in life-science research and drug design. By analyzing several bioactivity repositories, we discovered differences in compound and target coverage advocating the combined use of data from multiple sources. Using data from ChEMBL, PubChem, IUPHAR/BPS, BindingDB, and Probes & Drugs, we assembled a consensus dataset focusing on small molecules with bioactivity on human macromolecular targets. This allowed an improved coverage of compound space and targets, and an automated comparison and curation of structural and bioactivity data to reveal potentially erroneous entries and increase confidence. The consensus dataset comprised more than 1.1 million compounds with over 10.9 million bioactivity data points with annotations on assay type and bioactivity confidence, providing a useful ensemble for computational applications in drug design and chemogenomics.
|
33
|
Bos PH, Houang EM, Ranalli F, Leffler AE, Boyles NA, Eyrich VA, Luria Y, Katz D, Tang H, Abel R, Bhat S. AutoDesigner, a De Novo Design Algorithm for Rapidly Exploring Large Chemical Space for Lead Optimization: Application to the Design and Synthesis of d-Amino Acid Oxidase Inhibitors. J Chem Inf Model 2022; 62:1905-1915. [PMID: 35417149 DOI: 10.1021/acs.jcim.2c00072] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 12/13/2022]
Abstract
The lead optimization stage of a drug discovery program generally involves the design, synthesis, and assaying of hundreds to thousands of compounds. The design phase is usually carried out via traditional medicinal chemistry approaches and/or structure-based drug design (SBDD) when suitable structural information is available. Two of the major limitations of this approach are (1) difficulty in rapidly designing potent molecules that adhere to myriad project criteria, or the multiparameter optimization (MPO) problem, and (2) the relatively small number of molecules explored compared to the vast size of chemical space. To address these limitations, we have developed AutoDesigner, a de novo design algorithm. AutoDesigner employs a cloud-native, multistage search algorithm to carry out successive rounds of chemical space exploration and filtering. Millions to billions of virtual molecules are explored and optimized while adhering to a customizable set of project criteria such as physicochemical properties and potency. Additionally, the algorithm only requires a single ligand with measurable affinity and a putative binding model as a starting point, making it amenable to the early stages of an SBDD project where limited data are available. To assess the effectiveness of AutoDesigner, we applied it to the design of novel inhibitors of d-amino acid oxidase (DAO), a target for the treatment of schizophrenia. AutoDesigner was able to generate and efficiently explore over 1 billion molecules to successfully address a variety of project goals. The compounds generated by AutoDesigner that were synthesized and assayed (1) simultaneously met not only physicochemical criteria, clearance, and central nervous system (CNS) penetration (Kp,uu) cutoffs but also potency thresholds and (2) fully utilize structural data to discover and explore novel interactions and a previously unexplored subpocket in the DAO active site. 
The reported data demonstrate that AutoDesigner can play a key role in accelerating the discovery of novel, potent chemical matter within the constraints of a given drug discovery lead optimization campaign.
Affiliation(s)
- Pieter H Bos
- Schrödinger, Inc., 1540 Broadway, 24th Floor, New York, New York 10036, United States
| | - Evelyne M Houang
- Schrödinger, Inc., 1540 Broadway, 24th Floor, New York, New York 10036, United States
| | - Fabio Ranalli
- Schrödinger, Inc., 1540 Broadway, 24th Floor, New York, New York 10036, United States
| | - Abba E Leffler
- Schrödinger, Inc., 1540 Broadway, 24th Floor, New York, New York 10036, United States
| | - Nicholas A Boyles
- Schrödinger, Inc., 1540 Broadway, 24th Floor, New York, New York 10036, United States
| | - Volker A Eyrich
- Schrödinger, Inc., 1540 Broadway, 24th Floor, New York, New York 10036, United States
| | - Yuval Luria
- Schrödinger, Inc., 1540 Broadway, 24th Floor, New York, New York 10036, United States
| | - Dana Katz
- Schrödinger, Inc., 1540 Broadway, 24th Floor, New York, New York 10036, United States
| | - Haifeng Tang
- Schrödinger, Inc., 1540 Broadway, 24th Floor, New York, New York 10036, United States
| | - Robert Abel
- Schrödinger, Inc., 1540 Broadway, 24th Floor, New York, New York 10036, United States
| | - Sathesh Bhat
- Schrödinger, Inc., 1540 Broadway, 24th Floor, New York, New York 10036, United States
| |
|
34
|
Creanza TM, Lamanna G, Delre P, Contino M, Corriero N, Saviano M, Mangiatordi GF, Ancona N. DeLA-Drug: A Deep Learning Algorithm for Automated Design of Druglike Analogues. J Chem Inf Model 2022; 62:1411-1424. [PMID: 35294184 DOI: 10.1021/acs.jcim.2c00205] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 12/12/2022]
Abstract
In this paper, we present a deep learning algorithm for automated design of druglike analogues (DeLA-Drug), a recurrent neural network (RNN) model composed of two long short-term memory (LSTM) layers and conceived for data-driven generation of similar-to-bioactive compounds. DeLA-Drug captures the syntax of SMILES strings of more than 1 million compounds belonging to the ChEMBL28 database and, by employing a new strategy called sampling with substitutions (SWS), generates molecules starting from a single user-defined query compound. Remarkably, the algorithm preserves druglikeness and synthetic accessibility of the known bioactive compounds present in the ChEMBL28 repository. The absence of any time-demanding fine-tuning procedure enables DeLA-Drug to perform a fast generation of focused libraries for further high-throughput screening and makes it a suitable tool for performing de novo design even in low-data regimes. To provide a concrete idea of its applicability, DeLA-Drug was applied to the cannabinoid receptor subtype 2 (CB2R), a known target involved in different pathological conditions such as cancer and neurodegeneration. DeLA-Drug, available as a free web platform (http://www.ba.ic.cnr.it/softwareic/deladrugportal/), can help medicinal chemists interested in generating analogues of compounds already available in their laboratories and, for this reason, good candidates for an easy and low-cost synthesis.
Affiliation(s)
- Teresa Maria Creanza
- CNR-Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing, Via Amendola 122/o, 70126 Bari, Italy
- Giuseppe Lamanna
- Chemistry Department, University of Bari "Aldo Moro", via E. Orabona, 4, I-70125 Bari, Italy; CNR-Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
- Pietro Delre
- Chemistry Department, University of Bari "Aldo Moro", via E. Orabona, 4, I-70125 Bari, Italy; CNR-Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
- Marialessandra Contino
- Department of Pharmacy-Pharmaceutical Sciences, University of Bari "Aldo Moro", via E. Orabona, 4, I-70125 Bari, Italy
- Nicola Corriero
- CNR-Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
- Michele Saviano
- CNR-Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
- Nicola Ancona
- CNR-Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing, Via Amendola 122/o, 70126 Bari, Italy
35
Moret M, Grisoni F, Katzberger P, Schneider G. Perplexity-Based Molecule Ranking and Bias Estimation of Chemical Language Models. J Chem Inf Model 2022; 62:1199-1206. [PMID: 35191696 PMCID: PMC8924923 DOI: 10.1021/acs.jcim.2c00079] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 01/20/2022] [Indexed: 02/07/2023]
Abstract
Chemical language models (CLMs) can be employed to design molecules with desired properties. CLMs generate new chemical structures in the form of textual representations, such as the simplified molecular input line entry system (SMILES) strings. However, the quality of these de novo generated molecules is difficult to assess a priori. In this study, we apply the perplexity metric to determine the degree to which the molecules generated by a CLM match the desired design objectives. This model-intrinsic score allows identifying and ranking the most promising molecular designs based on the probabilities learned by the CLM. Using perplexity to compare "greedy" (beam search) with "explorative" (multinomial sampling) methods for SMILES generation, certain advantages of multinomial sampling become apparent. Additionally, perplexity scoring is performed to identify undesired model biases introduced during model training and allows the development of a new ranking system to remove those undesired biases.
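The perplexity score described in this abstract can be illustrated with a minimal sketch. It assumes the standard definition of perplexity as the exponential of the mean negative log-probability of the tokens in a sequence; the function names `perplexity` and `rank_by_perplexity` are hypothetical, and the per-token probabilities are made-up placeholders rather than output of the authors' model.

```python
import math


def perplexity(token_probs):
    """Perplexity of one sequence, given the probability the model
    assigned to each of its tokens (standard definition, assumed here)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    if any(p <= 0.0 or p > 1.0 for p in token_probs):
        raise ValueError("probabilities must lie in (0, 1]")
    # Mean negative log-likelihood per token, then exponentiate.
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)


def rank_by_perplexity(scored):
    """Sort (smiles, token_probs) pairs from lowest to highest perplexity."""
    return sorted(scored, key=lambda item: perplexity(item[1]))


# Hypothetical SMILES strings with invented per-token probabilities.
candidates = [
    ("CCO", [0.9, 0.8, 0.95]),
    ("CCN", [0.5, 0.4, 0.6]),
]
for smiles, probs in rank_by_perplexity(candidates):
    print(smiles, round(perplexity(probs), 3))
```

A lower perplexity means the model assigned higher probability to the tokens of a string, so under this sketch the first entry in the ranked list is the one the model considers most likely.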
Affiliation(s)
- Michael Moret
- Department of Chemistry and Applied Biosciences, ETH Zurich, RETHINK, Vladimir-Prelog-Weg 4, Zurich 8093, Switzerland
- Francesca Grisoni
- Institute for Complex Molecular Systems, Department of Biomedical Engineering, Eindhoven University of Technology, Groene Loper 7, Eindhoven 5612AZ, Netherlands
- Center for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Princetonlaan 6, Utrecht 3584 CB, The Netherlands
- Paul Katzberger
- Department of Chemistry and Applied Biosciences, ETH Zurich, RETHINK, Vladimir-Prelog-Weg 4, Zurich 8093, Switzerland
- Gisbert Schneider
- Department of Chemistry and Applied Biosciences, ETH Zurich, RETHINK, Vladimir-Prelog-Weg 4, Zurich 8093, Switzerland
- ETH Singapore SEC Ltd., 1 CREATE Way, #06-01 CREATE Tower, Singapore 138602, Singapore
36
Martinelli DD. Generative machine learning for de novo drug discovery: A systematic review. Comput Biol Med 2022; 145:105403. [PMID: 35339849 DOI: 10.1016/j.compbiomed.2022.105403] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 01/14/2022] [Revised: 03/10/2022] [Accepted: 03/11/2022] [Indexed: 02/08/2023]
Abstract
Recent research on artificial intelligence indicates that machine learning algorithms can auto-generate novel drug-like molecules. Generative models have revolutionized de novo drug discovery, rendering the explorative process more efficient. Several model frameworks and input formats have been proposed to enhance the performance of intelligent algorithms in generative molecular design. In this systematic literature review of experimental articles and reviews over the last five years, machine learning models, challenges associated with computational molecule design along with proposed solutions, and molecular encoding methods are discussed. A query-based search of the PubMed, ScienceDirect, Springer, Wiley Online Library, arXiv, MDPI, bioRxiv, and IEEE Xplore databases yielded 87 studies. Twelve additional studies were identified via citation searching. Of the articles in which machine learning was implemented, six prominent algorithms were identified: long short-term memory recurrent neural networks (LSTM-RNNs), variational autoencoders (VAEs), generative adversarial networks (GANs), adversarial autoencoders (AAEs), evolutionary algorithms, and gated recurrent unit recurrent neural networks (GRU-RNNs). Furthermore, eight central challenges were designated: homogeneity of generated molecular libraries, deficient synthesizability, limited assay data, model interpretability, incapacity for multi-property optimization, incomparability, restricted molecule size, and uncertainty in model evaluation. Molecules were encoded either as strings, which were occasionally augmented using randomization, as 2D graphs, or as 3D graphs. Statistical analysis and visualization are performed to illustrate how approaches to machine learning in de novo drug design have evolved over the past five years. Finally, future opportunities and reservations are discussed.
37
Faudone G, Zhubi R, Celik F, Knapp S, Chaikuad A, Heering J, Merk D. Design of a Potent TLX Agonist by Rational Fragment Fusion. J Med Chem 2022; 65:2288-2296. [PMID: 34989568 DOI: 10.1021/acs.jmedchem.1c01757] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/15/2022]
Abstract
As a master regulator of neurogenesis, the orphan nuclear receptor tailless homologue (TLX, NR2E1) maintains neuronal stem cell homeostasis by acting as a transcriptional repressor of tumor suppressor genes. It is hence considered an appealing target for the treatment of neurodegenerative diseases, but a lack of potent TLX modulators as tools to probe pharmacological TLX control hinders further validation of its promising potential. Here, we report the development of a potent TLX agonist based on fragment screening, pharmacophore modeling, and fragment fusion. Pharmacophore similarity of a fragment screening hit and the TLX ligand ccrp2 provided a rational basis for fragment linkage, which resulted in several TLX activator scaffolds. Among them, the fused compound 10 evolved as a valuable TLX agonist tool with submicromolar potency and high selectivity over related nuclear receptors, rendering it suitable for functional studies on TLX.
Affiliation(s)
- Giuseppe Faudone
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, D-60438 Frankfurt, Germany
- Rezart Zhubi
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, D-60438 Frankfurt, Germany; Structural Genomics Consortium, BMLS, Goethe University Frankfurt, D-60438 Frankfurt, Germany
- Fatih Celik
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, D-60438 Frankfurt, Germany
- Stefan Knapp
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, D-60438 Frankfurt, Germany; Structural Genomics Consortium, BMLS, Goethe University Frankfurt, D-60438 Frankfurt, Germany
- Apirat Chaikuad
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, D-60438 Frankfurt, Germany; Structural Genomics Consortium, BMLS, Goethe University Frankfurt, D-60438 Frankfurt, Germany
- Jan Heering
- Fraunhofer Institute for Translational Medicine and Pharmacology ITMP, D-60596 Frankfurt, Germany
- Daniel Merk
- Institute of Pharmaceutical Chemistry, Goethe University Frankfurt, D-60438 Frankfurt, Germany; Department of Pharmacy, Ludwig-Maximilians-Universität München, D-81377 Munich, Germany