1
Peteani G, Huynh MTD, Gerebtzoff G, Rodríguez-Pérez R. Application of machine learning models for property prediction to targeted protein degraders. Nat Commun 2024; 15:5764. [PMID: 38982061; PMCID: PMC11233499; DOI: 10.1038/s41467-024-49979-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/20/2023] [Accepted: 06/21/2024] [Indexed: 07/11/2024] [Open Access]
Abstract
Machine learning (ML) systems can model quantitative structure-property relationships (QSPR) using existing experimental data and make property predictions for new molecules. With the advent of modalities such as targeted protein degraders (TPD), the applicability of QSPR models is questioned and ML usage in TPD-centric projects remains limited. Herein, ML models are developed and evaluated for TPDs' property predictions, including passive permeability, metabolic clearance, cytochrome P450 inhibition, plasma protein binding, and lipophilicity. Interestingly, performance on TPDs is comparable to that of other modalities. Predictions for glues and heterobifunctionals often yield lower and higher errors, respectively. For permeability, CYP3A4 inhibition, and human and rat microsomal clearance, misclassification errors into high and low risk categories are lower than 4% for glues and 15% for heterobifunctionals. For all modalities, misclassification errors range from 0.8% to 8.1%. Investigated transfer learning strategies improve predictions for heterobifunctionals. This is the first comprehensive evaluation of ML for the prediction of absorption, distribution, metabolism, and excretion (ADME) and physicochemical properties of TPD molecules, including heterobifunctional and molecular glue sub-modalities. Taken together, our investigations show that ML-based QSPR models are applicable to TPDs and support ML usage for TPDs' design, to potentially accelerate drug discovery.
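The "misclassification errors into high and low risk categories" reported in the abstract can be illustrated with a minimal sketch. It assumes that measured and predicted values are each binned with two cutoffs and that an error is counted only when the two bins are opposite extremes. The cutoffs, the data, and the function names are hypothetical and are not taken from the paper; which end of the scale counts as high risk depends on the property.

```python
def risk_category(value, low_cutoff, high_cutoff):
    """Bin a property value into 'low', 'medium', or 'high' (by value)."""
    if value < low_cutoff:
        return "low"
    if value > high_cutoff:
        return "high"
    return "medium"


def high_low_misclassification_rate(measured, predicted, low_cutoff, high_cutoff):
    """Fraction of compounds whose measured and predicted bins are opposite extremes."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    if not measured:
        raise ValueError("need at least one compound")
    opposite = {("low", "high"), ("high", "low")}
    errors = sum(
        (risk_category(m, low_cutoff, high_cutoff),
         risk_category(p, low_cutoff, high_cutoff)) in opposite
        for m, p in zip(measured, predicted)
    )
    return errors / len(measured)


# Hypothetical measured and predicted values for five compounds.
measured = [5.0, 12.0, 40.0, 80.0, 95.0]
predicted = [8.0, 15.0, 35.0, 20.0, 90.0]
rate = high_low_misclassification_rate(
    measured, predicted, low_cutoff=25.0, high_cutoff=75.0
)
print(f"high/low misclassification rate: {rate:.1%}")
```

Only the fourth compound (measured in the high bin, predicted in the low bin) counts as an error here; predictions that land in the medium bin are not counted under this assumed definition.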
Affiliation(s)
- Giulia Peteani
  - Novartis Biomedical Research, Novartis Campus, 4002, Basel, Switzerland
2
Agea MI, Čmelo I, Dehaen W, Chen Y, Kirchmair J, Sedlák D, Bartůněk P, Šícho M, Svozil D. Chemical space exploration with Molpher: Generating and assessing a glucocorticoid receptor ligand library. Mol Inform 2024:e202300316. [PMID: 38979783; DOI: 10.1002/minf.202300316] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2023] [Revised: 04/23/2024] [Accepted: 04/24/2024] [Indexed: 07/10/2024]
Abstract
Computational exploration of chemical space is crucial in modern cheminformatics research for accelerating the discovery of new biologically active compounds. In this study, we present a detailed analysis of the chemical library of potential glucocorticoid receptor (GR) ligands generated by the molecular generator, Molpher. To generate the targeted GR library and construct the classification models, structures from the ChEMBL database as well as from the internal IMG library, which was experimentally screened for biological activity in the primary luciferase reporter cell assay, were utilized. The composition of the targeted GR ligand library was compared with a reference library that randomly samples chemical space. A random forest model was used to determine the biological activity of ligands, incorporating its applicability domain using conformal prediction. It was demonstrated that the GR library is significantly enriched with GR ligands compared to the random library. Furthermore, a prospective analysis demonstrated that Molpher successfully designed compounds, which were subsequently experimentally confirmed to be active on the GR. A collection of 34 potential new GR ligands was also identified. Moreover, an important contribution of this study is the establishment of a comprehensive workflow for evaluating computationally generated ligands, particularly those with potential activity against targets that are challenging to dock.
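How conformal prediction can act as an applicability-domain check for a classifier is sketched below. This is a generic class-conditional inductive conformal predictor, not the authors' implementation: the class probabilities stand in for the output of a random forest, the nonconformity score (one minus the predicted probability) is one common choice, and all numbers and names are hypothetical.

```python
def conformal_p_value(calibration_scores, test_score):
    """Share of calibration scores at least as nonconforming as the test score."""
    n_at_least = sum(1 for s in calibration_scores if s >= test_score)
    return (n_at_least + 1) / (len(calibration_scores) + 1)


def prediction_set(class_probs, calibration_by_class, significance):
    """Return the labels whose conformal p-value exceeds the significance level."""
    labels = []
    for label, prob in class_probs.items():
        score = 1.0 - prob  # nonconformity: a low probability means a strange example
        p_value = conformal_p_value(calibration_by_class[label], score)
        if p_value > significance:
            labels.append(label)
    return sorted(labels)


# Hypothetical nonconformity scores from a held-out calibration set, per true class.
calibration_by_class = {
    "active": [0.1, 0.2, 0.3, 0.4],
    "inactive": [0.05, 0.1, 0.15, 0.6],
}
# Hypothetical class probabilities for one new compound.
class_probs = {"active": 0.9, "inactive": 0.1}
labels = prediction_set(class_probs, calibration_by_class, significance=0.25)
print(labels)
```

A single-label set is a confident prediction at the chosen significance level. An empty set or a set containing both labels flags a compound for which the model is unreliable, which is how the prediction set serves as an applicability-domain signal in this sketch.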
Affiliation(s)
- M Isabel Agea
  - Department of Informatics and Chemistry & CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
- Ivan Čmelo
  - Department of Informatics and Chemistry & CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
- Wim Dehaen
  - Department of Informatics and Chemistry & CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
  - Department of Organic Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
- Ya Chen
  - Center for Bioinformatics (ZBH), Department of Informatics, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, 20146, Hamburg, Germany
  - Division of Pharmaceutical Chemistry, Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, 1090, Vienna, Austria
- Johannes Kirchmair
  - Center for Bioinformatics (ZBH), Department of Informatics, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, 20146, Hamburg, Germany
  - Division of Pharmaceutical Chemistry, Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, 1090, Vienna, Austria
- David Sedlák
  - CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, 14220, Czech Republic
- Petr Bartůněk
  - CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, 14220, Czech Republic
- Martin Šícho
  - Department of Informatics and Chemistry & CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
- Daniel Svozil
  - Department of Informatics and Chemistry & CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Faculty of Chemical Technology, University of Chemistry and Technology, Prague, 16628, Czech Republic
  - CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Institute of Molecular Genetics of the Czech Academy of Sciences, Prague, 14220, Czech Republic
3
Fotsch C, Basu D, Case R, Chen Q, Koneru PC, Lo MC, Ngo R, Sharma P, Vaish A, Yi X, Zech SG, Hodder P. Creating a more strategic small molecule biophysical hit characterization workflow. SLAS Discovery: Advancing Life Sciences R&D 2024; 29:100159. [PMID: 38723666; DOI: 10.1016/j.slasd.2024.100159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2024] [Revised: 03/16/2024] [Accepted: 05/06/2024] [Indexed: 05/21/2024]
Abstract
To confirm target engagement of hits from our high-throughput screening efforts, we ran biophysical assays on several hundred hits from 15 different high-throughput screening campaigns. Analyzing the biophysical assay results from these screening campaigns led us to conclude that we could be more strategic in our biophysical analysis of hits by first confirming activity in a thermal shift assay (TSA) and then confirming activity in either a surface plasmon resonance (SPR) assay or a temperature-related intensity change (TRIC) assay. To understand how this new workflow shapes the quality of the final hits, we compared TSA/SPR or TSA/TRIC confirmed and unconfirmed hits to one another using four measures of compound quality: quantitative estimate of drug-likeness (QED), Pan-Assay Interference Compounds (PAINS), promiscuity, and aqueous solubility. In general, we found that the biophysically confirmed hits performed better in the compound quality metrics than the unconfirmed hits, demonstrating that our workflow not only confirmed target engagement of the hits but also enriched for higher quality hits.
Affiliation(s)
- Christopher Fotsch
  - Lead Discovery and Characterization Group, Amgen Research, Thousand Oaks CA, USA.
- Debaleena Basu
  - Lead Discovery and Characterization Group, Amgen Research, South San Francisco CA, USA
- Ryan Case
  - Lead Discovery and Characterization Group, Amgen Research, South San Francisco CA, USA
- Qing Chen
  - Lead Discovery and Characterization Group, Amgen Research, Thousand Oaks CA, USA
- Pratibha C Koneru
  - Lead Discovery and Characterization Group, Amgen Research, South San Francisco CA, USA
- Mei-Chu Lo
  - Lead Discovery and Characterization Group, Amgen Research, South San Francisco CA, USA
- Rachel Ngo
  - Lead Discovery and Characterization Group, Amgen Research, South San Francisco CA, USA
- Pooja Sharma
  - Lead Discovery and Characterization Group, Amgen Research, Thousand Oaks CA, USA
- Amit Vaish
  - Lead Discovery and Characterization Group, Amgen Research, Thousand Oaks CA, USA
- Xiang Yi
  - Lead Discovery and Characterization Group, Amgen Research, South San Francisco CA, USA
- Stephan G Zech
  - Lead Discovery and Characterization Group, Amgen Research, Thousand Oaks CA, USA
- Peter Hodder
  - Lead Discovery and Characterization Group, Amgen Research, Thousand Oaks CA, USA
4
Tan L, Hirte S, Palmacci V, Stork C, Kirchmair J. Tackling assay interference associated with small molecules. Nat Rev Chem 2024; 8:319-339. [PMID: 38622244; DOI: 10.1038/s41570-024-00593-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 02/29/2024] [Indexed: 04/17/2024]
Abstract
Biochemical and cell-based assays are essential to discovering and optimizing efficacious and safe drugs, agrochemicals and cosmetics. However, false assay readouts stemming from colloidal aggregation, chemical reactivity, chelation, light signal attenuation and emission, membrane disruption, and other interference mechanisms remain a considerable challenge in screening synthetic compounds and natural products. To address assay interference, a range of powerful experimental approaches are available and in silico methods are now gaining traction. This Review begins with an overview of the scope and limitations of experimental approaches for tackling assay interference. It then focuses on theoretical methods, discusses strategies for their integration with experimental approaches, and provides recommendations for best practices. The Review closes with a summary of the critical facts and an outlook on potential future developments.
Affiliation(s)
- Lu Tan
  - Drug Discovery Sciences, Boehringer Ingelheim RCV GmbH & Co KG, Vienna, Austria
- Steffen Hirte
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Vienna, Austria
  - Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, Vienna, Austria
- Vincenzo Palmacci
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Vienna, Austria
  - Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, Vienna, Austria
- Conrad Stork
  - Department of Informatics, Center for Bioinformatics, Faculty of Mathematics, Informatics and Natural Sciences, Universität Hamburg, Hamburg, Germany
  - BASF SE, Ludwigshafen am Rhein, Germany
- Johannes Kirchmair
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, Faculty of Life Sciences, University of Vienna, Vienna, Austria.
  - Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department for Pharmaceutical Sciences, University of Vienna, Vienna, Austria.
5
Thomas JR, Shelton C, Murphy J, Brittain S, Bray MA, Aspesi P, Concannon J, King FJ, Ihry RJ, Ho DJ, Henault M, Hadjikyriacou A, Neri M, Sigoillot FD, Pham HT, Shum M, Barys L, Jones MD, Martin EJ, Blechschmidt A, Rieffel S, Troxler TJ, Mapa FA, Jenkins JL, Jain RK, Kutchukian PS, Schirle M, Renner S. Enhancing the Small-Scale Screenable Biological Space beyond Known Chemogenomics Libraries with Gray Chemical Matter: Compounds with Novel Mechanisms from High-Throughput Screening Profiles. ACS Chem Biol 2024; 19:938-952. [PMID: 38565185; PMCID: PMC11040606; DOI: 10.1021/acschembio.3c00737] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2023] [Revised: 02/28/2024] [Accepted: 03/01/2024] [Indexed: 04/04/2024]
Abstract
Phenotypic assays have become an established approach to drug discovery. Greater disease relevance is often achieved through cellular models with increased complexity and more detailed readouts, such as gene expression or advanced imaging. However, the intricate nature and cost of these assays impose limitations on their screening capacity, often restricting screens to well-characterized small compound sets such as chemogenomics libraries. Here, we outline a cheminformatics approach to identify a small set of compounds with likely novel mechanisms of action (MoAs), expanding the MoA search space for throughput-limited phenotypic assays. Our approach is based on mining existing large-scale, phenotypic high-throughput screening (HTS) data. It enables the identification of chemotypes that exhibit selectivity across multiple cell-based assays, which are characterized by persistent and broad structure-activity relationships (SAR). We validate the effectiveness of our approach in broad cellular profiling assays (Cell Painting, DRUG-seq, and Promotor Signature Profiling) and chemical proteomics experiments. These experiments revealed that the compounds behave similarly to known chemogenetic libraries, but with a notable bias toward novel protein targets. To foster collaboration and advance research in this area, we have curated a public set of such compounds based on the PubChem BioAssay dataset and made it available for use by the scientific community.
Affiliation(s)
- Jason R. Thomas
  - Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
- Claude Shelton
  - Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
- Jason Murphy
  - Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
- Scott Brittain
  - Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
- Mark-Anthony Bray
  - Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
- Peter Aspesi
  - Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
- John Concannon
  - Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
- Frederick J. King
  - Novartis Biomedical Research, San Diego, California 92121, United States
- Robert J. Ihry
  - Novartis Biomedical Research, San Diego, California 92121, United States
- Daniel J. Ho
  - Novartis Biomedical Research, San Diego, California 92121, United States
- Martin Henault
  - Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
- Marilisa Neri
  - Novartis Biomedical Research, Basel 4056, Switzerland
- Helen T. Pham
  - Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
- Matthew Shum
  - Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
- Louise Barys
  - Novartis Biomedical Research, Basel 4056, Switzerland
- Michael D. Jones
  - Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
- Eric J. Martin
  - Novartis Biomedical Research, Emeryville, California 94608, United States
- Felipa A. Mapa
  - Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
- Jeremy L. Jenkins
  - Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
- Rishi K. Jain
  - Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
- Markus Schirle
  - Novartis Biomedical Research, Cambridge, Massachusetts 02139, United States
6
Shen L, Fang J, Liu L, Yang F, Jenkins JL, Kutchukian PS, Wang H. Pocket Crafter: a 3D generative modeling based workflow for the rapid generation of hit molecules in drug discovery. J Cheminform 2024; 16:33. [PMID: 38515171; PMCID: PMC10958880; DOI: 10.1186/s13321-024-00829-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Accepted: 03/16/2024] [Indexed: 03/23/2024] [Open Access]
Abstract
We present a user-friendly molecular generative pipeline called Pocket Crafter, specifically designed to facilitate hit finding in the drug discovery process. This workflow uses a three-dimensional (3D) generative modeling method, Pocket2Mol, for the de novo design of molecules that fit the 3D structure of the targeted protein, followed by filters for chemical-physical properties and drug-likeness, structure-activity relationship analysis, and clustering to generate top virtual hit scaffolds. In our WDR5 case study, we acquired a focused set of 2029 compounds through a targeted search of the Novartis archived library based on the virtual scaffolds. Subsequently, we experimentally profiled these compounds, resulting in a novel chemical scaffold series that demonstrated activity in biochemical and biophysical assays. Pocket Crafter successfully prototyped an effective end-to-end 3D generative chemistry-based workflow for the exploration of new chemical scaffolds, which represents a promising approach in early drug discovery for hit identification.
Affiliation(s)
- Lingling Shen
  - Novartis Biomedical Research, Cambridge, MA, 02139, USA.
- Jian Fang
  - Novartis Biomedical Research, Cambridge, MA, 02139, USA
- Lulu Liu
  - Novartis Biomedical Research, Cambridge, MA, 02139, USA
- Fei Yang
  - Novartis Biomedical Research, Cambridge, MA, 02139, USA
- He Wang
  - Novartis Biomedical Research, Cambridge, MA, 02139, USA.
7
Kailass K, Casalena D, Jenane L, McEdwards G, Auld DS, Sadovski O, Kaye EG, Hudson E, Nettleton D, Currie MA, Beharry AA. Tight-Binding Small-Molecule Carboxylesterase 2 Inhibitors Reduce Intracellular Irinotecan Activation. J Med Chem 2024; 67:2019-2030. [PMID: 38265364; DOI: 10.1021/acs.jmedchem.3c01850] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2024]
Abstract
As the primary enzyme responsible for the activatable conversion of Irinotecan (CPT-11) to SN-38, carboxylesterase 2 (CES2) is a significant predictive biomarker toward CPT-11-based treatments for pancreatic ductal adenocarcinoma (PDAC). High SN-38 levels from high CES2 activity lead to harmful effects, including life-threatening diarrhea. While alternate strategies have been explored, CES2 inhibition presents an effective strategy to directly alter the pharmacokinetics of CPT-11 conversion, ultimately controlling the amount of SN-38 produced. To address this, we conducted a high-throughput screening to discover 18 small-molecule CES2 inhibitors. The inhibitors are validated by dose-response and counter-screening and 16 of these inhibitors demonstrate selectivity for CES2. These 16 inhibitors inhibit CES2 in cells, indicating cell permeability, and they show inhibition of CPT-11 conversion with the purified enzyme. The top five inhibitors prohibited cell death mediated by CPT-11 when preincubated in PDAC cells. Three of these inhibitors displayed a tight-binding mechanism of action with a strong binding affinity.
Affiliation(s)
- Karishma Kailass
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario, Canada L5L 1C6
- Dominick Casalena
  - Novartis Institutes for Biomedical Research, Cambridge, Massachusetts 02139, United States
- Lina Jenane
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario, Canada L5L 1C6
- Gregor McEdwards
  - Department of Biology, University of Toronto Mississauga, Mississauga, Ontario, Canada, L5L 1C6
- Douglas S Auld
  - Novartis Institutes for Biomedical Research, Cambridge, Massachusetts 02139, United States
- Oleg Sadovski
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario, Canada L5L 1C6
- Esther G Kaye
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario, Canada L5L 1C6
- Elyse Hudson
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario, Canada L5L 1C6
- David Nettleton
  - Novartis Institutes for Biomedical Research, Cambridge, Massachusetts 02139, United States
- Mark A Currie
  - Department of Biology, University of Toronto Mississauga, Mississauga, Ontario, Canada, L5L 1C6
- Andrew A Beharry
  - Department of Chemical and Physical Sciences, University of Toronto Mississauga, Mississauga, Ontario, Canada L5L 1C6
8
Collie GW, Clark MA, Keefe AD, Madin A, Read JA, Rivers EL, Zhang Y. Screening Ultra-Large Encoded Compound Libraries Leads to Novel Protein-Ligand Interactions and High Selectivity. J Med Chem 2024; 67:864-884. [PMID: 38197367; PMCID: PMC10823476; DOI: 10.1021/acs.jmedchem.3c01861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 11/17/2023] [Accepted: 12/04/2023] [Indexed: 01/11/2024]
Abstract
The DNA-encoded library (DEL) discovery platform has emerged as a powerful technology for hit identification in recent years. It has become one of the major parallel workstreams for small molecule drug discovery along with other strategies such as HTS and data mining. For many researchers working in the DEL field, it has become increasingly evident that many hits and leads discovered via DEL screening bind to target proteins with unique and unprecedented binding modes. This Perspective is our attempt to analyze reports of DEL screening with the purpose of providing a rigorous and useful account of the binding modes observed for DEL-derived ligands with a focus on binding mode novelty.
Affiliation(s)
- Ying Zhang
  - X-Chem, Inc., Waltham, Massachusetts 02453, United States
9
Choung OH, Vianello R, Segler M, Stiefl N, Jiménez-Luna J. Extracting medicinal chemistry intuition via preference machine learning. Nat Commun 2023; 14:6651. [PMID: 37907461; PMCID: PMC10618272; DOI: 10.1038/s41467-023-42242-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2023] [Accepted: 09/21/2023] [Indexed: 11/02/2023] [Open Access]
Abstract
The lead optimization process in drug discovery campaigns is an arduous endeavour where the input of many medicinal chemists is weighed in order to reach a desired molecular property profile. Building the expertise to successfully drive such projects collaboratively is a very time-consuming process that typically spans many years within a chemist's career. In this work we aim to replicate this process by applying artificial intelligence learning-to-rank techniques on feedback that was obtained from 35 chemists at Novartis over the course of several months. We exemplify the usefulness of the learned proxies in routine tasks such as compound prioritization, motif rationalization, and biased de novo drug design. Annotated response data is provided, and developed models and code made available through a permissive open-source license.
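The learning-to-rank idea in the abstract, turning pairwise "I prefer this molecule" feedback into a scoring function, can be sketched with a pairwise logistic (Bradley-Terry or RankNet style) objective. The linear scorer, the two-dimensional feature vectors, and the compound names below are hypothetical simplifications chosen for illustration; they are not the models or data released with the paper.

```python
import math


def train_pairwise_ranker(features, preferences, epochs=200, lr=0.1):
    """Fit linear weights so that preferred items receive higher scores.

    features: dict mapping item name -> list of floats
    preferences: list of (preferred, other) name pairs
    """
    dim = len(next(iter(features.values())))
    weights = [0.0] * dim
    for _ in range(epochs):
        for winner, loser in preferences:
            diff = [a - b for a, b in zip(features[winner], features[loser])]
            margin = sum(w * d for w, d in zip(weights, diff))
            # Gradient of log(sigmoid(margin)) w.r.t. weights is (1 - sigmoid(margin)) * diff.
            grad_scale = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            weights = [w + lr * grad_scale * d for w, d in zip(weights, diff)]
    return weights


def score(weights, x):
    """Linear preference score for one feature vector."""
    return sum(w * xi for w, xi in zip(weights, x))


# Hypothetical descriptors for three compounds and hypothetical chemist choices.
features = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [0.5, 0.5]}
preferences = [("A", "C"), ("C", "B"), ("A", "B")]

weights = train_pairwise_ranker(features, preferences)
ranking = sorted(features, key=lambda name: score(weights, features[name]), reverse=True)
print(ranking)
```

Once fitted, the score can be applied to compounds that were never shown to a chemist, which is what makes such a learned proxy usable for tasks like compound prioritization.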
Affiliation(s)
- Oh-Hyeon Choung
  - Novartis Institutes for Biomedical Research, 4002, Basel, Switzerland
- Riccardo Vianello
  - Novartis Institutes for Biomedical Research, 4002, Basel, Switzerland
- Marwin Segler
  - Microsoft Research AI4Science, CB1 2FB, Cambridge, UK
- Nikolaus Stiefl
  - Novartis Institutes for Biomedical Research, 4002, Basel, Switzerland.
10
Götz J, Jackl MK, Jindakun C, Marziale AN, André J, Gosling DJ, Springer C, Palmieri M, Reck M, Luneau A, Brocklehurst CE, Bode JW. High-throughput synthesis provides data for predicting molecular properties and reaction success. Science Advances 2023; 9:eadj2314. [PMID: 37889964; PMCID: PMC10610918; DOI: 10.1126/sciadv.adj2314] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/14/2023] [Accepted: 09/26/2023] [Indexed: 10/29/2023]
Abstract
The generation of attractive scaffolds for drug discovery efforts requires the expeditious synthesis of diverse analogues from readily available building blocks. This endeavor necessitates a trade-off between diversity and ease of access and is further complicated by uncertainty about the synthesizability and pharmacokinetic properties of the resulting compounds. Here, we document a platform that leverages photocatalytic N-heterocycle synthesis, high-throughput experimentation, automated purification, and physicochemical assays on 1152 discrete reactions. Together, the data generated allow rational predictions of the synthesizability of stereochemically diverse C-substituted N-saturated heterocycles with deep learning and reveal unexpected trends on the relationship between structure and properties. This study exemplifies how organic chemists can exploit state-of-the-art technologies to markedly increase throughput and confidence in the preparation of drug-like molecules.
Affiliation(s)
- Julian Götz
  - Laboratory of Organic Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, 8093 Zürich, Switzerland
- Moritz K. Jackl
  - Laboratory of Organic Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, 8093 Zürich, Switzerland
- Chalupat Jindakun
  - Laboratory of Organic Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, 8093 Zürich, Switzerland
- Alexander N. Marziale
  - Global Discovery Chemistry, Novartis Institutes for Biomedical Research, Novartis Pharma AG, 4056 Basel, Switzerland
- Jérôme André
  - Global Discovery Chemistry, Novartis Institutes for Biomedical Research, Novartis Pharma AG, 4056 Basel, Switzerland
- Daniel J. Gosling
  - Global Discovery Chemistry, Novartis Institutes for Biomedical Research, Novartis Pharma AG, 4056 Basel, Switzerland
- Clayton Springer
  - Global Discovery Chemistry, Novartis Institutes for Biomedical Research, Novartis Pharma AG, Cambridge, MA 02139, USA
- Marco Palmieri
  - Global Discovery Chemistry, Novartis Institutes for Biomedical Research, Novartis Pharma AG, 4056 Basel, Switzerland
- Marcel Reck
  - Global Discovery Chemistry, Novartis Institutes for Biomedical Research, Novartis Pharma AG, 4056 Basel, Switzerland
- Alexandre Luneau
  - Global Discovery Chemistry, Novartis Institutes for Biomedical Research, Novartis Pharma AG, 4056 Basel, Switzerland
- Cara E. Brocklehurst
  - Global Discovery Chemistry, Novartis Institutes for Biomedical Research, Novartis Pharma AG, 4056 Basel, Switzerland
- Jeffrey W. Bode
  - Laboratory of Organic Chemistry, Department of Chemistry and Applied Biosciences, ETH Zürich, 8093 Zürich, Switzerland
11
Beckers M, Sturm N, Sirockin F, Fechner N, Stiefl N. Prediction of Small-Molecule Developability Using Large-Scale In Silico ADMET Models. J Med Chem 2023; 66:14047-14060. [PMID: 37815201; DOI: 10.1021/acs.jmedchem.3c01083] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/11/2023]
Abstract
Early in silico assessment of the potential of a series of compounds to deliver a drug is one of the major challenges in computer-assisted drug design. The goal is to identify the right chemical series of compounds out of a large chemical space to then subsequently prioritize the molecules with the highest potential to become a drug. Although multiple approaches to assess compounds have been developed over decades, the quality of these predictors is often not good enough and compounds that agree with the respective estimates are not necessarily druglike. Here, we report a novel deep learning approach that leverages large-scale predictions of ∼100 ADMET assays to assess the potential of a compound to become a relevant drug candidate. The resulting score, which we termed bPK score, substantially outperforms previous approaches and showed strong discriminative performance on data sets where previous approaches did not.
Affiliation(s)
- Maximilian Beckers
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Postfach, 4002 Basel, Switzerland
- Noé Sturm
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Postfach, 4002 Basel, Switzerland
- Finton Sirockin
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Postfach, 4002 Basel, Switzerland
- Nikolas Fechner
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Postfach, 4002 Basel, Switzerland
- Nikolaus Stiefl
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Postfach, 4002 Basel, Switzerland
12
Zhu W, Liu X, Li Q, Gao F, Liu T, Chen X, Zhang M, Aliper A, Ren F, Ding X, Zhavoronkov A. Discovery of novel and selective SIK2 inhibitors by the application of AlphaFold structures and generative models. Bioorg Med Chem 2023; 91:117414. [PMID: 37467565; DOI: 10.1016/j.bmc.2023.117414] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 06/08/2023] [Revised: 07/06/2023] [Accepted: 07/11/2023] [Indexed: 07/21/2023]
Abstract
Salt-inducible kinase 2 (SIK2) has been recognized as a potential target for anti-inflammation and anti-cancer therapy. In this paper, based on the binding pose of the reported compound (GLPG-3970, 3) with AlphaFold protein structure, a series of hinge cores were generated via AI-generative models (Chemistry42). After the molecular docking, synthesis, and biological evaluation, a hit molecule (7f) targeting SIK2 was obtained with a novel scaffold. Further SAR exploration led to the discovery of compound 8g with superior potency against SIK2 compared with the reported inhibitors. Furthermore, 8g also demonstrated excellent selectivity over other AMPK kinases, favorable in vitro ADMET profiles and decent cellular activities. This work provides an alternative approach to the discovery of novel and selective kinase inhibitors.
Affiliation(s)
- Wei Zhu
  - Insilico Medicine Shanghai Ltd., Shanghai 201203, China
- Xiaosong Liu
  - Insilico Medicine Shanghai Ltd., Shanghai 201203, China
- Qi Li
  - Insilico Medicine Shanghai Ltd., Shanghai 201203, China
- Feng Gao
  - Insilico Medicine Shanghai Ltd., Shanghai 201203, China
- Tingting Liu
  - Insilico Medicine Shanghai Ltd., Shanghai 201203, China
- Xiaojing Chen
  - Insilico Medicine Shanghai Ltd., Shanghai 201203, China
- Man Zhang
  - Insilico Medicine Shanghai Ltd., Shanghai 201203, China
- Alex Aliper
  - Insilico Medicine AI Limited, Masdar City, Abu Dhabi 145748, UAE
- Feng Ren
  - Insilico Medicine Shanghai Ltd., Shanghai 201203, China
- Xiao Ding
  - Insilico Medicine Shanghai Ltd., Shanghai 201203, China.
- Alex Zhavoronkov
  - Insilico Medicine Shanghai Ltd., Shanghai 201203, China; Insilico Medicine AI Limited, Masdar City, Abu Dhabi 145748, UAE.
13
|
Lanini J, Santarossa G, Sirockin F, Lewis R, Fechner N, Misztela H, Lewis S, Maziarz K, Stanley M, Segler M, Stiefl N, Schneider N. PREFER: A New Predictive Modeling Framework for Molecular Discovery. J Chem Inf Model 2023; 63:4497-4504. [PMID: 37487018 DOI: 10.1021/acs.jcim.3c00523] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/26/2023]
Abstract
Machine-learning and deep-learning models have been extensively used in cheminformatics to predict molecular properties, to reduce the need for direct measurements, and to accelerate compound prioritization. However, different setups and frameworks and the large number of molecular representations make it difficult to properly evaluate, reproduce, and compare them. Here we present a new PREdictive modeling FramEwoRk for molecular discovery (PREFER), written in Python (version 3.7.7) and based on AutoSklearn (version 0.14.7), that allows comparison between different molecular representations and common machine-learning models. We provide an overview of the design of our framework and show exemplary use cases and results of several representation-model combinations on diverse data sets, both public and in-house. Finally, we discuss the use of PREFER on small data sets. The code of the framework is freely available on GitHub.
Affiliation(s)
- Jessica Lanini
- Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland
- Gianluca Santarossa
- Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland
- Finton Sirockin
- Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland
- Richard Lewis
- Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland
- Nikolas Fechner
- Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland
- Sarah Lewis
- Microsoft Research AI4Science, Cambridge CB1 2FB, U.K.
- Megan Stanley
- Microsoft Research AI4Science, Cambridge CB1 2FB, U.K.
- Marwin Segler
- Microsoft Research AI4Science, Cambridge CB1 2FB, U.K.
- Nikolaus Stiefl
- Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland
- Nadine Schneider
- Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland
|
14
|
Quancard J, Vulpetti A, Bach A, Cox B, Guéret SM, Hartung IV, Koolman HF, Laufer S, Messinger J, Sbardella G, Craft R. The European Federation for Medicinal Chemistry and Chemical Biology (EFMC) Best Practice Initiative: Hit Generation. ChemMedChem 2023; 18:e202300002. [PMID: 36892096 DOI: 10.1002/cmdc.202300002] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 01/03/2023] [Revised: 02/14/2023] [Indexed: 03/10/2023]
Abstract
Hit generation is a crucial step in drug discovery that will determine the speed and chance of success of identifying drug candidates. Many strategies are now available to identify chemical starting points, or hits, and each biological target warrants a tailored approach. In this set of best practices, we detail the essential approaches for target centric hit generation and the opportunities and challenges they come with. We then provide guidance on how to validate hits to ensure medicinal chemistry is only performed on compounds and scaffolds that engage the target of interest and have the desired mode of action. Finally, we discuss the design of integrated hit generation strategies that combine several approaches to maximize the chance of identifying high quality starting points to ensure a successful drug discovery campaign.
Affiliation(s)
- Jean Quancard
- Global Discovery Chemistry, Novartis Institute for Biomedical Research, Novartis Pharma AG, Novartis Campus, 4056, Basel, Switzerland
- Anna Vulpetti
- Global Discovery Chemistry, Novartis Institute for Biomedical Research, Novartis Pharma AG, Novartis Campus, 4056, Basel, Switzerland
- Anders Bach
- Department of Drug Design and Pharmacology, Faculty of Health and Medical Sciences, University of Copenhagen, Universitetsparken 2, 2100, Copenhagen, Denmark
- Brian Cox
- School of Life Sciences, University of Sussex, Brighton, BN1 9RH, UK
- Stéphanie M Guéret
- Medicinal Chemistry, Research and Early Development Cardiovascular, Renal and Metabolism, BioPharmaceuticals R&D, AstraZeneca, 43183, Gothenburg, Sweden
- Ingo V Hartung
- Medicinal Chemistry, Global R&D, Merck Healthcare KGaA, Frankfurter Straße 250, 64293, Darmstadt, Germany
- Hannes F Koolman
- Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Straße 65, 88397, Biberach an der Riss, Germany
- Stefan Laufer
- Pharmaceutical & Medicinal Chemistry, Institute of Pharmacy & Biochemistry, Tübingen Center for Academic Drug Discovery, Auf der Morgenstelle 8, 72070, Tübingen, Germany
- Josef Messinger
- Medicine Design, Orionpharma, Orionintie 1, 02101, Espoo, Finland
- Gianluca Sbardella
- Department of Pharmacy, Epigenetic Med Chem Lab, University of Salerno, Via Giovanni Paolo II 132, 84084, Fisciano (SA), Italy
- Russell Craft
- Medicinal Chemistry, Symeres, Kadijk 3, 9747 AT, Groningen, The Netherlands
|
15
|
Di Lascio E, Gerebtzoff G, Rodríguez-Pérez R. Systematic Evaluation of Local and Global Machine Learning Models for the Prediction of ADME Properties. Mol Pharm 2023; 20:1758-1767. [PMID: 36745394 DOI: 10.1021/acs.molpharmaceut.2c00962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023]
Abstract
Machine learning (ML) has become an indispensable tool to predict absorption, distribution, metabolism, and excretion (ADME) properties in pharmaceutical research. ML algorithms are trained on molecular structures and corresponding ADME assay data to develop quantitative structure-property relationship (QSPR) models. Traditional QSPR models were trained on compound sets of limited size. With the advent of more complex ML algorithms and data availability, training sets have become larger and more diverse. The most common training approaches consist of training a model either with a small set of similar compounds, namely, compounds designed for the same drug discovery project or chemical series (local model approach), or with a larger set of diverse compounds (global model approach). Global models are built with all experimental data available for an assay, combining compound data from different projects and disease areas. Despite the ML progress made so far, the choice of the appropriate data composition for building ML models is still unclear. Herein, a systematic evaluation of local and global ML models was performed for 10 different experimental assays and 112 drug discovery projects. Results show a consistently superior performance of global models for ADME property predictions. Diagnostic analyses were also carried out to investigate the influence of training set size, structural diversity, and data shift on the relative performance of local and global ML models. Training set size and structural diversity did not have an impact on the relative performance of the methods. Instead, data shift helped to identify the projects with larger performance differences between local and global models. Results presented in this work can be leveraged to improve ML-based ADME property predictions and thus decision-making in drug discovery projects.
Affiliation(s)
- Elena Di Lascio
- Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
- Grégori Gerebtzoff
- Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
|
16
|
Rodríguez-Pérez R, Trunzer M, Schneider N, Faller B, Gerebtzoff G. Multispecies Machine Learning Predictions of In Vitro Intrinsic Clearance with Uncertainty Quantification Analyses. Mol Pharm 2023; 20:383-394. [PMID: 36437712 DOI: 10.1021/acs.molpharmaceut.2c00680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
In pharmaceutical research, compounds are optimized for metabolic stability to avoid a too fast elimination of the drug. Intrinsic clearance (CLint) measured in liver microsomes or hepatocytes is an important parameter during lead optimization. In this work, machine learning models were developed to relate the compound structure to microsomal metabolic stability and predict CLint for new compounds. A multitask (MT) learning architecture was introduced to model the CLint of six species simultaneously, giving as a result a multispecies machine learning model. MT graph neural network (MT-GNN) regression was identified as the top-performing method, and an ensemble of 10 MT-GNN models was evaluated prospectively. Geometric mean fold errors were consistently smaller than 2-fold. Moreover, high precision values were obtained in the prediction of "high" (>300 μL/min/mg) and "low" (<100 μL/min/mg) CLint compounds. Precision values ranged from 80 to 94% for low CLint predictions and from 75 to 97% for high CLint predictions, depending on the species. Uncertainty on experimental values and model predictions was systematically quantified. Experimental variability (aleatoric uncertainty) of all historical Novartis in vitro clearance experiments was analyzed. Interestingly, MT-GNN models' performance approached assays' experimental variability. Moreover, uncertainty estimation in predictions (epistemic uncertainty) enabled identifying predictions associated with lower and higher error. Taken together, our manuscript combines a multispecies deep learning model and large-scale uncertainty analyses to improve CLint predictions and facilitate early informed decisions for compound prioritization.
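The two evaluation measures named in this abstract, geometric mean fold error and per-category precision, can be illustrated with a minimal Python sketch. It assumes the commonly used definition GMFE = 10^mean(|log10(predicted/observed)|) and reuses the "high" (>300 μL/min/mg) and "low" (<100 μL/min/mg) thresholds quoted above. The function names and the "medium" label for values in between are hypothetical and are not taken from the paper.

```python
import math


def geometric_mean_fold_error(predicted, observed):
    """GMFE = 10 ** mean(|log10(pred / obs)|); 1.0 means perfect agreement.

    Assumes all values are strictly positive (e.g. CLint in uL/min/mg).
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    log_errors = [abs(math.log10(p / o)) for p, o in zip(predicted, observed)]
    return 10 ** (sum(log_errors) / len(log_errors))


def clint_category(value, low=100.0, high=300.0):
    """Bin a CLint value with the thresholds quoted in the abstract.

    The 'medium' label is an illustrative assumption for the middle range.
    """
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "medium"


def category_precision(predicted, observed, category):
    """Fraction of compounds predicted in `category` that were measured in it."""
    hits = [
        clint_category(o) == category
        for p, o in zip(predicted, observed)
        if clint_category(p) == category
    ]
    return sum(hits) / len(hits) if hits else float("nan")


# Made-up example values, for illustration only.
example_predicted = [50.0, 200.0, 400.0, 80.0]
example_observed = [100.0, 100.0, 200.0, 40.0]
example_gmfe = geometric_mean_fold_error(example_predicted, example_observed)
example_low_precision = category_precision(example_predicted, example_observed, "low")
```

A GMFE below 2 corresponds to the "smaller than 2-fold" statement in the abstract; precision is computed separately for each predicted category.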
Affiliation(s)
- Markus Trunzer
- Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
- Nadine Schneider
- Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
- Bernard Faller
- Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
- Grégori Gerebtzoff
- Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
|
17
|
Lai A, Schaub J, Steinbeck C, Schymanski EL. An algorithm to classify homologous series within compound datasets. J Cheminform 2022; 14:85. [PMID: 36510332 PMCID: PMC9746203 DOI: 10.1186/s13321-022-00663-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Accepted: 11/27/2022] [Indexed: 12/15/2022] Open
Abstract
Homologous series are groups of related compounds that share the same core structure attached to a motif that repeats to different degrees. Compounds forming homologous series are of interest in multiple domains, including natural products, environmental chemistry, and drug design. However, many homologous compounds remain unannotated as such in compound datasets, which poses obstacles to understanding chemical diversity and their analytical identification via database matching. To overcome these challenges, an algorithm to detect homologous series within compound datasets was developed and implemented using the RDKit. The algorithm takes a list of molecules as SMILES strings and a monomer (i.e., repeating unit) encoded as SMARTS as its main inputs. In an iterative process, substructure matching of repeating units, molecule fragmentation, and core detection lead to homologous series classification through grouping of identical cores. Three open compound datasets from environmental chemistry (NORMAN Suspect List Exchange, NORMAN-SLE), exposomics (PubChemLite for Exposomics), and natural products (the COlleCtion of Open NatUral producTs, COCONUT) were subject to homologous series classification using the algorithm. Over 2000, 12,000, and 5000 series with CH2 repeating units were classified in the NORMAN-SLE, PubChemLite, and COCONUT respectively. Validation of classified series was performed using published homologous series and structure categories, including a comparison with a similar existing method for categorising PFAS compounds. The OngLai algorithm and its implementation for classifying homologues are openly available at: https://github.com/adelenelai/onglai-classify-homologues .
Affiliation(s)
- Adelene Lai
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg (GRID grid.16008.3f, ISNI 0000 0001 2295 9843); Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessing Strasse 8, 07743 Jena, Germany (GRID grid.9613.d, ISNI 0000 0001 1939 2794)
- Jonas Schaub
- Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessing Strasse 8, 07743 Jena, Germany (GRID grid.9613.d, ISNI 0000 0001 1939 2794)
- Christoph Steinbeck
- Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessing Strasse 8, 07743 Jena, Germany (GRID grid.9613.d, ISNI 0000 0001 1939 2794)
- Emma L. Schymanski
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg (GRID grid.16008.3f, ISNI 0000 0001 2295 9843)
|
18
|
Beckers M, Fechner N, Stiefl N. 25 Years of Small-Molecule Optimization at Novartis: A Retrospective Analysis of Chemical Series Evolution. J Chem Inf Model 2022; 62:6002-6021. [PMID: 36351293 DOI: 10.1021/acs.jcim.2c00785] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 11/11/2022]
Abstract
In the drug development process, optimization of properties and biological activities of small molecules is an important task to obtain drug candidates with optimal efficacy when first applied in subsequent clinical studies. However, despite its importance, large-scale investigations of the optimization process in early drug discovery are lacking, likely due to the absence of historical records of different chemical series used in past projects. Here, we report a retrospective reconstruction of ∼3000 chemical series from the Novartis compound database, which allows us to characterize the general properties of chemical series as well as the time evolution of structural properties, ADMET properties, and target activities. Our data-driven approach allows us to substantiate common MedChem knowledge. We find that size, fraction of sp3-hybridized carbon atoms (Fsp3), and the density of stereocenters tend to increase during optimization, while the aromaticity of the compounds decreases. On the ADMET side, solubility tends to increase and permeability decreases, while safety-related properties tend to improve. Importantly, while ligand efficiency decreases due to molecular growth over time, target activities and lipophilic efficiency tend to improve. This emphasizes the heavy-atom count and log D as important parameters to monitor, especially as we further show that the decrease in permeability can be explained with the increase in molecular size. We highlight overlaps, shortcomings, and differences of the computationally reconstructed chemical series compared to the series used in recent internal drug discovery projects and investigate the relation to historical projects.
Affiliation(s)
- Maximilian Beckers
- Novartis Institutes for BioMedical Research, Novartis Pharma AG, Postfach, 4002 Basel, Switzerland
- Nikolas Fechner
- Novartis Institutes for BioMedical Research, Novartis Pharma AG, Postfach, 4002 Basel, Switzerland
- Nikolaus Stiefl
- Novartis Institutes for BioMedical Research, Novartis Pharma AG, Postfach, 4002 Basel, Switzerland
|
19
|
Janin YL. On drug discovery against infectious diseases and academic medicinal chemistry contributions. Beilstein J Org Chem 2022; 18:1355-1378. [PMID: 36247982 PMCID: PMC9531561 DOI: 10.3762/bjoc.18.141] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2022] [Accepted: 09/21/2022] [Indexed: 11/23/2022] Open
Abstract
This perspective is an attempt to document the problems that medicinal chemists are facing in drug discovery. It also tries to identify relevant/possible research areas in which academics can have an impact and which should thus be the subject of grant calls. Accordingly, it describes how hit discovery happens, how compounds to be screened are selected from available chemicals and the possible reasons for the recurrent paucity of useful/exploitable results reported. This is followed by the successful hit to lead stories leading to recent and original antibacterials which are, or are about to be, used in human medicine. Then, illustrated considerations and suggestions are made on the possible inputs of academic medicinal chemists. This starts with the observation that discovering a “good” hit in the course of a screening campaign still relies on a lot of luck (which is within the reach of academics), that the hit to lead process requires a lot of chemistry and that if public–private partnerships can be important throughout these stages, they are absolute requirements for clinical trials. Concerning suggestions to improve the current hit success rate, one academic input in organic chemistry would be to identify new and pertinent chemical space, design synthetic accesses to reach these and prepare the corresponding chemical libraries. Concerning hit to lead programs on a given target, if no new hits are available, previously reported leads along with new structural data can be pertinent starting points to design, prepare and assay original analogues. In conclusion, this text is an actual plea illustrating that, in many countries, academic research in medicinal chemistry should be better funded, especially in the therapeutic areas neglected by the industry. At the least, such funds would provide the incentive to secure series of hopefully relevant chemical entities, which often appear to be lacking when considering the results of academic as well as industrial screening campaigns.
Affiliation(s)
- Yves L Janin
- Structure et Instabilité des Génomes (StrInG), Muséum National d'Histoire Naturelle, INSERM, CNRS, Alliance Sorbonne Université, 75005 Paris, France
|
20
|
He J, Nittinger E, Tyrchan C, Czechtizky W, Patronov A, Bjerrum EJ, Engkvist O. Transformer-based molecular optimization beyond matched molecular pairs. J Cheminform 2022; 14:18. [PMID: 35346368 PMCID: PMC8962145 DOI: 10.1186/s13321-022-00599-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 11/24/2021] [Accepted: 03/11/2022] [Indexed: 11/11/2022] Open
Abstract
Molecular optimization aims to improve the drug profile of a starting molecule. It is a fundamental problem in drug discovery but challenging due to (i) the requirement of simultaneous optimization of multiple properties and (ii) the large chemical space to explore. Recently, deep learning methods have been proposed to solve this task by mimicking the chemist’s intuition in terms of matched molecular pairs (MMPs). Although MMPs is a widely used strategy by medicinal chemists, it offers limited capability in terms of exploring the space of structural modifications, therefore does not cover the complete space of solutions. Often more general transformations beyond the nature of MMPs are feasible and/or necessary, e.g. simultaneous modifications of the starting molecule at different places including the core scaffold. This study aims to provide a general methodology that offers more general structural modifications beyond MMPs. In particular, the same Transformer architecture is trained on different datasets. These datasets consist of a set of molecular pairs which reflect different types of transformations. Beyond MMP transformation, datasets reflecting general structural changes are constructed from ChEMBL based on two approaches: Tanimoto similarity (allows for multiple modifications) and scaffold matching (allows for multiple modifications but keep the scaffold constant) respectively. We investigate how the model behavior can be altered by tailoring the dataset while using the same model architecture. Our results show that the models trained on differently prepared datasets transform a given starting molecule in a way that it reflects the nature of the dataset used for training the model. These models could complement each other and unlock the capability for the chemists to pursue different options for improving a starting molecule.
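The two pair-construction criteria mentioned in this abstract, Tanimoto similarity and scaffold matching, can be sketched in plain Python. This is only an illustration under stated assumptions: fingerprints are represented as sets of "on" bit indices, scaffolds as precomputed strings, and the 0.5 similarity threshold, the function names, and the toy data are hypothetical rather than taken from the paper.

```python
from itertools import combinations


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 0.0


def similarity_pairs(fingerprints, threshold=0.5):
    """Pair molecules whose fingerprint similarity reaches the threshold.

    `fingerprints` maps a molecule id to its set of on-bit indices.
    The threshold value is an assumption chosen for illustration.
    """
    names = sorted(fingerprints)
    return [
        (a, b)
        for a, b in combinations(names, 2)
        if tanimoto(fingerprints[a], fingerprints[b]) >= threshold
    ]


def scaffold_pairs(scaffolds):
    """Pair molecules that share the same (precomputed) scaffold string."""
    groups = {}
    for name in sorted(scaffolds):
        groups.setdefault(scaffolds[name], []).append(name)
    return [pair for members in groups.values() for pair in combinations(members, 2)]


# Toy data, for illustration only.
toy_fingerprints = {"mol_a": {1, 2, 3, 4}, "mol_b": {1, 2, 3, 5}, "mol_c": {7, 8}}
toy_scaffolds = {"mol_a": "scaffold_1", "mol_b": "scaffold_1", "mol_c": "scaffold_2"}
toy_similarity_pairs = similarity_pairs(toy_fingerprints)
toy_scaffold_pairs = scaffold_pairs(toy_scaffolds)
```

In practice the fingerprints and scaffolds would come from a cheminformatics toolkit; the sketch only shows how the choice of pairing rule determines which molecule pairs end up in a training set.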
|
21
|
Ertl P, Gerebtzoff G, Lewis RA, Muenkler H, Schneider N, Sirockin F, Stiefl N, Tosco P. Chemical reactivity prediction: current methods and different application areas. Mol Inform 2021; 41:e2100277. [PMID: 34964302 DOI: 10.1002/minf.202100277] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/14/2021] [Accepted: 12/28/2021] [Indexed: 11/10/2022]
Abstract
The ability to predict chemical reactivity of a molecule is highly desirable in drug discovery, both ex vivo (synthetic route planning, formulation, stability) and in vivo: metabolic reactions determine pharmacodynamics, pharmacokinetics and potential toxic effects, and early assessment of liabilities is vital to reduce attrition rates in later stages of development. Quantum mechanics offer a precise description of the interactions between electrons and orbitals in the breaking and forming of new bonds. Modern algorithms and faster computers have allowed the study of more complex systems in a punctual and accurate fashion, and answers for chemical questions around stability and reactivity can now be provided. Through machine learning, predictive models can be built out of descriptors derived from quantum mechanics and cheminformatics, even in the absence of experimental data to train on. In this article, current progress on computational reactivity prediction is reviewed: applications to problems in drug design, such as modelling of metabolism and covalent inhibition, are highlighted and unmet challenges are posed.
Affiliation(s)
- Richard A Lewis
- Computer-Aided Drug Design, Eli Lilly and Company Limited, Windlesham, Switzerland
- Hagen Muenkler
- Novartis Institutes for BioMedical Research Inc, Switzerland
- Paolo Tosco
- Novartis Institutes for BioMedical Research Inc, Switzerland
|
22
|
Knez D, Hrast M, Frlan R, Pišlar A, Žakelj S, Kos J, Gobec S. Indoles and 1-(3-(benzyloxy)benzyl)piperazines: Reversible and selective monoamine oxidase B inhibitors identified by screening an in-house compound library. Bioorg Chem 2021; 119:105581. [PMID: 34990933 DOI: 10.1016/j.bioorg.2021.105581] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2021] [Revised: 12/06/2021] [Accepted: 12/20/2021] [Indexed: 11/02/2022]
Abstract
The therapeutic indications for monoamine oxidase A and B (MAO-A and MAO-B) inhibitors that have emerged from biological studies on animal and cellular models of neurological and oncological diseases have focused drug discovery projects upon identifying reversible MAO inhibitors. Screening of our in-house academic compound library identified two hit compounds that inhibit MAO-B with IC50 values in the micromolar range. Two series of indole (23 analogues) and 1-(3-(benzyloxy)benzyl)piperazine (16 analogues) MAO-B inhibitors were derived from the hits and screened for their structure-activity relationships. Both series yielded low micromolar selective inhibitors of human MAO-B, namely indole 2 (IC50 = 12.63 ± 1.21 µM) and piperazine 39 (IC50 = 19.25 ± 4.89 µM), which is comparable to the selective MAO-B inhibitor isatin (IC50 = 6.10 ± 2.81 µM), yet less potent in comparison to safinamide (IC50 = 0.029 ± 0.002 µM). Selective MAO-B inhibitors 2, 14, 38 and 39 exhibited favourable permeation of the blood-brain barrier and low cytotoxicity in the human neuroblastoma cell line SH-SY5Y.
Affiliation(s)
- Damijan Knez
- University of Ljubljana, Faculty of Pharmacy, Aškerčeva 7, 1000 Ljubljana, Slovenia
- Martina Hrast
- University of Ljubljana, Faculty of Pharmacy, Aškerčeva 7, 1000 Ljubljana, Slovenia
- Rok Frlan
- University of Ljubljana, Faculty of Pharmacy, Aškerčeva 7, 1000 Ljubljana, Slovenia
- Anja Pišlar
- University of Ljubljana, Faculty of Pharmacy, Aškerčeva 7, 1000 Ljubljana, Slovenia
- Simon Žakelj
- University of Ljubljana, Faculty of Pharmacy, Aškerčeva 7, 1000 Ljubljana, Slovenia
- Janko Kos
- University of Ljubljana, Faculty of Pharmacy, Aškerčeva 7, 1000 Ljubljana, Slovenia; Department of Biotechnology, Jožef Stefan Institute, Jamova 39, 1000 Ljubljana, Slovenia
- Stanislav Gobec
- University of Ljubljana, Faculty of Pharmacy, Aškerčeva 7, 1000 Ljubljana, Slovenia
|
23
|
Simm J, Humbeck L, Zalewski A, Sturm N, Heyndrickx W, Moreau Y, Beck B, Schuffenhauer A. Splitting chemical structure data sets for federated privacy-preserving machine learning. J Cheminform 2021; 13:96. [PMID: 34876230 PMCID: PMC8650276 DOI: 10.1186/s13321-021-00576-2] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 07/28/2021] [Accepted: 11/22/2021] [Indexed: 11/10/2022] Open
Abstract
With the increase in applications of machine learning methods in drug design and related fields, the challenge of designing sound test sets becomes more and more prominent. The goal of this challenge is to have a realistic split of chemical structures (compounds) between training, validation and test set such that the performance on the test set is meaningful to infer the performance in a prospective application. This challenge is interesting and relevant in its own right, but is even more complex in a federated machine learning approach where multiple partners jointly train a model under privacy-preserving conditions where chemical structures must not be shared between the different participating parties. In this work we discuss three methods which provide a splitting of a data set and are applicable in a federated privacy-preserving setting, namely: (a) locality-sensitive hashing (LSH), (b) sphere exclusion clustering, (c) scaffold-based binning (scaffold network). For evaluation of these splitting methods we consider the following quality criteria (compared to random splitting): bias in prediction performance, classification label and data imbalance, similarity distance between the test and training set compounds. The main findings of the paper are (a) both sphere exclusion clustering and scaffold-based binning result in high quality splitting of the data sets, (b) in terms of compute costs sphere exclusion clustering is very expensive in a federated privacy-preserving setting.
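The idea behind scaffold-based binning can be illustrated with a deliberately simplified Python sketch. It is not the scaffold-network procedure evaluated in the paper: it assumes each compound already has a scaffold string, and it assigns folds by hashing that string with a salt that all partners are assumed to agree on. The function names, the salt, and the fold count are hypothetical.

```python
import hashlib


def fold_for_scaffold(scaffold, n_folds=5, salt="agreed-salt"):
    """Map a scaffold string to a fold index using a salted SHA-256 hash.

    The mapping depends only on the scaffold and the salt, so any party that
    knows the salt could compute it locally without sharing the structure.
    """
    digest = hashlib.sha256((salt + scaffold).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_folds


def split_by_scaffold(compound_scaffolds, n_folds=5, salt="agreed-salt"):
    """Group compound ids into folds so that each scaffold stays in one fold.

    `compound_scaffolds` maps a compound id to its precomputed scaffold string.
    """
    folds = {index: [] for index in range(n_folds)}
    for compound_id, scaffold in compound_scaffolds.items():
        folds[fold_for_scaffold(scaffold, n_folds, salt)].append(compound_id)
    return folds


# Toy data, for illustration only.
toy_compounds = {
    "cpd_1": "c1ccccc1",
    "cpd_2": "c1ccccc1",
    "cpd_3": "c1ccncc1",
    "cpd_4": "C1CCCCC1",
}
toy_folds = split_by_scaffold(toy_compounds)
```

Because compounds that share a scaffold always land in the same fold, close analogues are never split between training and test data, which is the property a scaffold-based split is meant to provide. Fold sizes are not balanced by this simple scheme.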
Affiliation(s)
- Jaak Simm
- KU Leuven, ESAT-STADIUS, Kasteelpark Arenberg 10, 3001, Heverlee, Belgium
- Lina Humbeck
- Medicinal Chemistry Department, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Str. 65, 88397, Biberach an der Riss, Germany
- Adam Zalewski
- Amgen Research (Munich) GmbH, Staffelseestraße 2, 81477, Munich, Germany
- Noe Sturm
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002, Basel, Switzerland
- Wouter Heyndrickx
- Janssen Pharmaceutica N.V., Janssen Pharmaceutica, Turnhoutseweg 30, 2340, Beerse, Belgium
- Yves Moreau
- KU Leuven, ESAT-STADIUS, Kasteelpark Arenberg 10, 3001, Heverlee, Belgium
- Bernd Beck
- Medicinal Chemistry Department, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Str. 65, 88397, Biberach an der Riss, Germany
- Ansgar Schuffenhauer
- Novartis Institutes for BioMedical Research, Novartis Campus, CH-4002, Basel, Switzerland
|
24
|
Meyer A, Baeschlin D, Brocklehurst CE, Duckely M, Gallou F, Lovelle LE, Parmentier M, Schlama T, Snajdrova R, Auberson YP. Fostering Research Synergies between Chemists in Swiss Academia and at Novartis. Chimia (Aarau) 2021; 75:936-942. [PMID: 34798915 DOI: 10.2533/chimia.2021.936] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022] Open
Abstract
We present a short overview of the way Novartis chemists interact and collaborate with the academic chemistry community in Switzerland. This article exemplifies a number of collaborations, and illustrates opportunities to foster research synergies between academic and industrial researchers. It also describes established programs available to academic groups, providing them access to Novartis resources and expertise.
Collapse
Affiliation(s)
- Arndt Meyer: Global Discovery Chemistry, Novartis Institutes for BioMedical Research, CH-4002 Basel, Switzerland
- Daniel Baeschlin: Global Discovery Chemistry, Novartis Institutes for BioMedical Research, CH-4002 Basel, Switzerland
- Cara E Brocklehurst: Global Discovery Chemistry, Novartis Institutes for BioMedical Research, CH-4002 Basel, Switzerland
- Myriam Duckely: Global Discovery Chemistry, Novartis Institutes for BioMedical Research, CH-4002 Basel, Switzerland
- Fabrice Gallou: Global Discovery Chemistry, Novartis Institutes for BioMedical Research, CH-4002 Basel, Switzerland
- Lucie E Lovelle: Global Discovery Chemistry, Novartis Institutes for BioMedical Research, CH-4002 Basel, Switzerland
- Michael Parmentier: Global Discovery Chemistry, Novartis Institutes for BioMedical Research, CH-4002 Basel, Switzerland
- Thierry Schlama: Global Discovery Chemistry, Novartis Institutes for BioMedical Research, CH-4002 Basel, Switzerland
- Radka Snajdrova: Global Discovery Chemistry, Novartis Institutes for BioMedical Research, CH-4002 Basel, Switzerland
- Yves P Auberson: Global Discovery Chemistry, Novartis Institutes for BioMedical Research, CH-4002 Basel, Switzerland
|
25
|
Silvestri IP, Colbon PJJ. The Growing Importance of Chirality in 3D Chemical Space Exploration and Modern Drug Discovery Approaches for Hit-ID: Topical Innovations. ACS Med Chem Lett 2021; 12:1220-1229. [PMID: 34413951 PMCID: PMC8366003 DOI: 10.1021/acsmedchemlett.1c00251] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 04/27/2021] [Accepted: 07/02/2021] [Indexed: 12/19/2022] Open
Abstract
Modern-day drug discovery is now blessed with a wide range of high-throughput hit identification (hit-ID) strategies that have been successfully validated in recent years, with particular success coming from high-throughput screening, fragment-based lead discovery, and DNA-encoded library screening. As screening efficiency and throughput increases, this enables the viable exploration of increasingly complex three-dimensional (3D) chemical structure space, with a realistic chance of identifying highly specific hit ligands with increased target specificity and reduced attrition rates in preclinical and clinical development. This minireview will explore the impact of an improved design of multifunctionalized, sp3-rich, stereodefined scaffolds on the (virtual) exploration of 3D chemical space and the specific requirements for different hit-ID technologies.
Affiliation(s)
- Ilaria Proietti Silvestri: Department of Chemistry, University of Liverpool, Liverpool ChiroChem, Ltd., Crown Street, Liverpool L69 7ZD, United Kingdom
- Paul J. J. Colbon: Department of Chemistry, University of Liverpool, Liverpool ChiroChem, Ltd., Crown Street, Liverpool L69 7ZD, United Kingdom
|
26
|
Mathai N, Stork C, Kirchmair J. BonMOLière: Small-Sized Libraries of Readily Purchasable Compounds, Optimized to Produce Genuine Hits in Biological Screens across the Protein Space. Int J Mol Sci 2021; 22:ijms22157773. [PMID: 34360558 PMCID: PMC8346018 DOI: 10.3390/ijms22157773] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/24/2021] [Revised: 07/13/2021] [Accepted: 07/15/2021] [Indexed: 12/21/2022] Open
Abstract
Experimental screening of large sets of compounds against macromolecular targets is a key strategy to identify novel bioactivities. However, large-scale screening requires substantial experimental resources and is time-consuming and challenging. Therefore, small to medium-sized compound libraries with a high chance of producing genuine hits on an arbitrary protein of interest would be of great value to fields related to early drug discovery, in particular biochemical and cell research. Here, we present a computational approach that incorporates drug-likeness, predicted bioactivities, biological space coverage, and target novelty, to generate optimized compound libraries with maximized chances of producing genuine hits for a wide range of proteins. The computational approach evaluates drug-likeness with a set of established rules, predicts bioactivities with a validated, similarity-based approach, and optimizes the composition of small sets of compounds towards maximum target coverage and novelty. We found that, in comparison to the random selection of compounds for a library, our approach generates substantially improved compound sets. Quantified as the "fitness" of compound libraries, the calculated improvements ranged from +60% (for a library of 15,000 compounds) to +184% (for a library of 1000 compounds). The best of the optimized compound libraries prepared in this work are available for download as a dataset bundle ("BonMOLière").
Affiliation(s)
- Neann Mathai: Computational Biology Unit (CBU) and Department of Chemistry, University of Bergen, N-5020 Bergen, Norway
- Conrad Stork: Center for Bioinformatics (ZBH), Department of Informatics, Universität Hamburg, 20146 Hamburg, Germany
- Johannes Kirchmair (correspondence): Computational Biology Unit (CBU) and Department of Chemistry, University of Bergen, N-5020 Bergen, Norway; Division of Pharmaceutical Chemistry, Department of Pharmaceutical Sciences, University of Vienna, 1090 Vienna, Austria
|
27
|
Leeson PD, Bento AP, Gaulton A, Hersey A, Manners EJ, Radoux CJ, Leach AR. Target-Based Evaluation of "Drug-Like" Properties and Ligand Efficiencies. J Med Chem 2021; 64:7210-7230. [PMID: 33983732 PMCID: PMC7610969 DOI: 10.1021/acs.jmedchem.1c00416] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Indexed: 12/13/2022]
Abstract
Physicochemical descriptors commonly used to define "drug-likeness" and ligand efficiency measures are assessed for their ability to differentiate marketed drugs from compounds reported to bind to their efficacious target or targets. Using ChEMBL version 26, a data set of 643 drugs acting on 271 targets was assembled, comprising 1104 drug-target pairs having ≥100 published compounds per target. Taking into account changes in their physicochemical properties over time, drugs are analyzed according to their target class, therapy area, and route of administration. Recent drugs, approved in 2010-2020, display no overall differences in molecular weight, lipophilicity, hydrogen bonding, or polar surface area from their target comparator compounds. Drugs are differentiated from target comparators by higher potency, ligand efficiency (LE), lipophilic ligand efficiency (LLE), and lower carboaromaticity. Overall, 96% of drugs have LE or LLE values, or both, greater than the median values of their target comparator compounds.
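The abstract above compares drugs and comparator compounds by LE and LLE without stating the formulas. A minimal sketch follows, assuming the commonly used definitions LE ≈ 1.37 × pActivity / heavy-atom count (kcal/mol per heavy atom, near room temperature) and LLE = pActivity − logP; the function names and the example values are illustrative and are not taken from the paper.

```python
def ligand_efficiency(p_activity: float, heavy_atoms: int) -> float:
    """Approximate LE in kcal/mol per heavy atom.

    Assumes the common approximation LE = 1.37 * pActivity / N_heavy,
    where 1.37 kcal/mol is roughly 2.303 * R * T near room temperature.
    """
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    return 1.37 * p_activity / heavy_atoms


def lipophilic_ligand_efficiency(p_activity: float, log_p: float) -> float:
    """LLE = pActivity - logP (assumed standard definition)."""
    return p_activity - log_p


# Hypothetical compound: pIC50 = 8.0, 25 heavy atoms, logP = 3.0
le = ligand_efficiency(8.0, 25)
lle = lipophilic_ligand_efficiency(8.0, 3.0)
print(f"LE = {le:.3f} kcal/mol per heavy atom, LLE = {lle:.1f}")
```

Under these definitions, a compound gains LE by reaching the same potency with fewer heavy atoms, and gains LLE by reaching it at lower lipophilicity.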
Affiliation(s)
- Paul D Leeson: Paul Leeson Consulting Ltd, The Malt House, Main Street, Congerstone, Nuneaton, Warwickshire CV13 6LZ, United Kingdom
- A Patricia Bento: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Anna Gaulton: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Anne Hersey: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Emma J Manners: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Chris J Radoux: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
- Andrew R Leach: European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, United Kingdom
|
28
|
Graff DE, Shakhnovich EI, Coley CW. Accelerating high-throughput virtual screening through molecular pool-based active learning. Chem Sci 2021; 12:7866-7881. [PMID: 34168840 PMCID: PMC8188596 DOI: 10.1039/d0sc06805e] [Citation(s) in RCA: 86] [Impact Index Per Article: 28.7] [Received: 12/13/2020] [Accepted: 04/26/2021] [Indexed: 12/13/2022] Open
Abstract
Structure-based virtual screening is an important tool in early stage drug discovery that scores the interactions between a target protein and candidate ligands. As virtual libraries continue to grow (in excess of 10⁸ molecules), so too do the resources necessary to conduct exhaustive virtual screening campaigns on these libraries. However, Bayesian optimization techniques, previously employed in other scientific discovery problems, can aid in their exploration: a surrogate structure-property relationship model trained on the predicted affinities of a subset of the library can be applied to the remaining library members, allowing the least promising compounds to be excluded from evaluation. In this study, we explore the application of these techniques to computational docking datasets and assess the impact of surrogate model architecture, acquisition function, and acquisition batch size on optimization performance. We observe significant reductions in computational costs; for example, using a directed-message passing neural network we can identify 94.8% or 89.3% of the top-50 000 ligands in a 100M member library after testing only 2.4% of candidate ligands using an upper confidence bound or greedy acquisition strategy, respectively. Such model-guided searches mitigate the increasing computational costs of screening increasingly large virtual libraries and can accelerate high-throughput virtual screening campaigns with applications beyond docking.
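The pool-based loop described in this abstract can be illustrated with a toy sketch. It is not the authors' implementation: the surrogate here is a small bootstrap ensemble of k-nearest-neighbour regressors (swapped in for the directed-message passing neural network so the example needs only the standard library), the pool is a set of random 2-D points standing in for molecular descriptors, and the objective is a synthetic function where higher is better (a docking score would be negated first). All names (`active_search`, `knn_predict`, `beta`, batch sizes) are hypothetical.

```python
import math
import random


def knn_predict(x, train_X, train_y, k=3):
    """Mean objective of the k training points nearest to x."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(x, train_X[i]))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def active_search(pool, score_fn, init_size=20, batch_size=10, rounds=5,
                  strategy="ucb", beta=2.0, n_models=5, seed=0):
    """Pool-based active learning with a greedy or UCB acquisition function.

    Returns a dict mapping each evaluated pool index to its objective value.
    """
    rng = random.Random(seed)
    # Start from a random subset of the pool.
    scores = {i: score_fn(i) for i in rng.sample(range(len(pool)), init_size)}
    for _ in range(rounds):
        idx = list(scores)
        n = len(idx)
        # Bootstrap-resampled training sets give a crude uncertainty estimate.
        bootstraps = []
        for _ in range(n_models):
            sample = [idx[rng.randrange(n)] for _ in range(n)]
            bootstraps.append(([pool[i] for i in sample],
                               [scores[i] for i in sample]))
        utilities = []
        for i in range(len(pool)):
            if i in scores:
                continue  # already evaluated
            preds = [knn_predict(pool[i], Xb, yb) for Xb, yb in bootstraps]
            mean = sum(preds) / len(preds)
            std = math.sqrt(sum((p - mean) ** 2 for p in preds) / len(preds))
            # Greedy uses the predicted mean; UCB adds an exploration bonus.
            utility = mean if strategy == "greedy" else mean + beta * std
            utilities.append((utility, i))
        utilities.sort(reverse=True)
        for _, i in utilities[:batch_size]:
            scores[i] = score_fn(i)  # "dock" only the selected batch
    return scores


# Synthetic pool and objective (assumptions for illustration only).
rng = random.Random(1)
pool = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(300)]
objective = [-(x - 0.5) ** 2 - (y + 0.2) ** 2 for x, y in pool]

scores = active_search(pool, lambda i: objective[i], strategy="ucb")
top = set(sorted(range(len(pool)), key=lambda i: -objective[i])[:15])
print(f"evaluated {len(scores)}/{len(pool)}, "
      f"recovered {len(top & set(scores))} of the top 15")
```

Passing `strategy="greedy"` drops the uncertainty term, which is the second acquisition strategy the abstract compares. The fraction of top-ranked pool members recovered per evaluation is the kind of metric the abstract reports, although the numbers from this toy setup say nothing about real docking libraries.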
Affiliation(s)
- David E Graff: Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA
- Eugene I Shakhnovich: Department of Chemistry and Chemical Biology, Harvard University, Cambridge, MA, USA
- Connor W Coley: Department of Chemical Engineering, MIT, Cambridge, MA, USA
|