1
Beckers M, Sirockin F, Fechner N, Stiefl N. Balancing Molecular Size, Activity, Permeability, and Other Properties: Drug Candidates in the Context of Their Chemical Structure Optimization. J Chem Inf Model 2024; 64:6636-6647. [PMID: 39137447 DOI: 10.1021/acs.jcim.4c00898] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2024]
Abstract
Chemical structure optimization is a vital part of early drug discovery projects. Starting with compounds that show activity on the target of interest, the chemical structures are subsequently optimized toward a development candidate (DC) molecule with the best chances of clinical success. However, the DCs in the context of such optimization programs, as well as detailed characterization of major limiting factors, have not been investigated in detail so far. Here, we report an analysis of the historical DC molecules at Novartis since 2005 in the context of their optimization projects. Mapping the DCs into their respective chemical optimization series, we find that these tend to be synthesized rather early in a substantial number of cases. Further analysis of structural properties, ADMET, and potency-related readouts revealed that DC compounds tend to be generally significantly smaller, more permeable, and have higher ligand efficiency than other compounds sent to in vivo PK studies, which we also show for compounds from the same chemical series. Although this might seem obvious to most practitioners in medicinal chemistry, for all of these properties, we could show that they tend to evolve in an undesired direction during structure optimization. This highlights the difficulty of successfully translating our knowledge to medicinal chemistry optimizations.
Affiliation(s)
- Maximilian Beckers
  - Biomedical Research, Novartis Pharma AG, Postfach, Basel 4002, Switzerland
- Finton Sirockin
  - Biomedical Research, Novartis Pharma AG, Postfach, Basel 4002, Switzerland
- Nikolas Fechner
  - Biomedical Research, Novartis Pharma AG, Postfach, Basel 4002, Switzerland
- Nikolaus Stiefl
  - Biomedical Research, Novartis Pharma AG, Postfach, Basel 4002, Switzerland
2
Lai A, Schaub J, Steinbeck C, Schymanski EL. An algorithm to classify homologous series within compound datasets. J Cheminform 2022; 14:85. [PMID: 36510332 PMCID: PMC9746203 DOI: 10.1186/s13321-022-00663-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2022] [Accepted: 11/27/2022] [Indexed: 12/15/2022]
Abstract
Homologous series are groups of related compounds that share the same core structure attached to a motif that repeats to different degrees. Compounds forming homologous series are of interest in multiple domains, including natural products, environmental chemistry, and drug design. However, many homologous compounds remain unannotated as such in compound datasets, which poses obstacles to understanding chemical diversity and their analytical identification via database matching. To overcome these challenges, an algorithm to detect homologous series within compound datasets was developed and implemented using the RDKit. The algorithm takes a list of molecules as SMILES strings and a monomer (i.e., repeating unit) encoded as SMARTS as its main inputs. In an iterative process, substructure matching of repeating units, molecule fragmentation, and core detection lead to homologous series classification through grouping of identical cores. Three open compound datasets from environmental chemistry (NORMAN Suspect List Exchange, NORMAN-SLE), exposomics (PubChemLite for Exposomics), and natural products (the COlleCtion of Open NatUral producTs, COCONUT) were subjected to homologous series classification using the algorithm. Over 2000, 12,000, and 5000 series with CH2 repeating units were classified in the NORMAN-SLE, PubChemLite, and COCONUT datasets, respectively. Validation of classified series was performed using published homologous series and structure categories, including a comparison with a similar existing method for categorising PFAS compounds. The OngLai algorithm and its implementation for classifying homologues are openly available at: https://github.com/adelenelai/onglai-classify-homologues
Affiliation(s)
- Adelene Lai
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg
  - Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessing Strasse 8, 07743 Jena, Germany
- Jonas Schaub
  - Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessing Strasse 8, 07743 Jena, Germany
- Christoph Steinbeck
  - Institute for Inorganic and Analytical Chemistry, Friedrich Schiller University Jena, Lessing Strasse 8, 07743 Jena, Germany
- Emma L. Schymanski
  - Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6 Avenue du Swing, 4367 Belvaux, Luxembourg
3
Beckers M, Fechner N, Stiefl N. 25 Years of Small-Molecule Optimization at Novartis: A Retrospective Analysis of Chemical Series Evolution. J Chem Inf Model 2022; 62:6002-6021. [PMID: 36351293 DOI: 10.1021/acs.jcim.2c00785] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 11/11/2022]
Abstract
In the drug development process, optimization of properties and biological activities of small molecules is an important task to obtain drug candidates with optimal efficacy when first applied in subsequent clinical studies. However, despite its importance, large-scale investigations of the optimization process in early drug discovery are lacking, likely due to the absence of historical records of different chemical series used in past projects. Here, we report a retrospective reconstruction of ∼3000 chemical series from the Novartis compound database, which allows us to characterize the general properties of chemical series as well as the time evolution of structural properties, ADMET properties, and target activities. Our data-driven approach allows us to substantiate common MedChem knowledge. We find that size, fraction of sp3-hybridized carbon atoms (Fsp3), and the density of stereocenters tend to increase during optimization, while the aromaticity of the compounds decreases. On the ADMET side, solubility tends to increase and permeability decreases, while safety-related properties tend to improve. Importantly, while ligand efficiency decreases due to molecular growth over time, target activities and lipophilic efficiency tend to improve. This emphasizes the heavy-atom count and log D as important parameters to monitor, especially as we further show that the decrease in permeability can be explained with the increase in molecular size. We highlight overlaps, shortcomings, and differences of the computationally reconstructed chemical series compared to the series used in recent internal drug discovery projects and investigate the relation to historical projects.
Affiliation(s)
- Maximilian Beckers
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Postfach, 4002 Basel, Switzerland
- Nikolas Fechner
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Postfach, 4002 Basel, Switzerland
- Nikolaus Stiefl
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Postfach, 4002 Basel, Switzerland
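
The abstract of entry 3 reports that ligand efficiency tends to fall during optimization while target activity and lipophilic efficiency improve. The following is a minimal sketch of how that can happen arithmetically. It assumes the commonly used definitions LE ≈ 1.37 × pIC50 / heavy-atom count (kcal/mol per heavy atom, near room temperature) and LipE = pIC50 − log D; the abstract does not state which definitions the authors used, and the two compounds and all function names are hypothetical.

```python
# Assumed conversion factor: RT * ln(10) in kcal/mol near 298 K.
# It turns a pIC50 value into an approximate binding free energy.
KCAL_PER_LOG_UNIT = 1.37


def ligand_efficiency(pic50: float, heavy_atoms: int) -> float:
    """Approximate binding energy per heavy atom (assumed LE definition)."""
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    return KCAL_PER_LOG_UNIT * pic50 / heavy_atoms


def lipophilic_efficiency(pic50: float, log_d: float) -> float:
    """Potency corrected for lipophilicity (assumed LipE definition)."""
    return pic50 - log_d


# Hypothetical compounds: (label, pIC50, heavy-atom count, log D).
series = [
    ("early hit", 6.0, 20, 2.5),
    ("optimized analog", 8.0, 35, 3.0),
]

for label, pic50, heavy_atoms, log_d in series:
    le = ligand_efficiency(pic50, heavy_atoms)
    lipe = lipophilic_efficiency(pic50, log_d)
    print(f"{label}: pIC50={pic50:.1f}, LE={le:.3f}, LipE={lipe:.1f}")
```

In this made-up pair, potency and LipE both rise while LE drops, because the heavy-atom count grows faster than pIC50. This is the kind of pattern that would make heavy-atom count and log D worth monitoring.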
4
Schaub J, Zander J, Zielesny A, Steinbeck C. Scaffold Generator: a Java library implementing molecular scaffold functionalities in the Chemistry Development Kit (CDK). J Cheminform 2022; 14:79. [PMID: 36357931 PMCID: PMC9650898 DOI: 10.1186/s13321-022-00656-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 02/24/2022] [Accepted: 10/30/2022] [Indexed: 11/12/2022]
Abstract
The concept of molecular scaffolds as defining core structures of organic molecules is utilised in many areas of chemistry and cheminformatics, e.g. drug design, chemical classification, or the analysis of high-throughput screening data. Here, we present Scaffold Generator, a comprehensive open library for the generation, handling, and display of molecular scaffolds, scaffold trees and networks. The new library is based on the Chemistry Development Kit (CDK) and highly customisable through multiple settings, e.g. five different structural framework definitions are available. For display of scaffold hierarchies, the open GraphStream Java library is utilised. Performance snapshots with natural products (NP) from the COCONUT (COlleCtion of Open Natural prodUcTs) database and drug molecules from DrugBank are reported. The generation of a scaffold network from more than 450,000 NP can be achieved within a single day.
Affiliation(s)
- Jonas Schaub
  - Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Lessing Strasse 8, 07743 Jena, Germany
- Julian Zander
  - Institute for Bioinformatics and Chemoinformatics, Westphalian University of Applied Sciences, August-Schmidt-Ring 10, 45665 Recklinghausen, Germany
- Achim Zielesny
  - Institute for Bioinformatics and Chemoinformatics, Westphalian University of Applied Sciences, August-Schmidt-Ring 10, 45665 Recklinghausen, Germany
- Christoph Steinbeck
  - Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Lessing Strasse 8, 07743 Jena, Germany
5
Congenericity of Claimed Compounds in Patent Applications. Molecules 2021; 26:5253. [PMID: 34500686 PMCID: PMC8433967 DOI: 10.3390/molecules26175253] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 07/23/2021] [Revised: 08/17/2021] [Accepted: 08/18/2021] [Indexed: 12/04/2022]
Abstract
A method is presented to quantitatively analyze the degree of congenericity of claimed compounds in patent applications. The approach successfully differentiates patents exemplified with highly congeneric compounds of a structurally compact and well-defined chemical series from patents containing a more diverse set of compounds around a more vaguely described patent claim. An application to 750 common patents available in SureChEMBL, SureChEMBLccs and ChEMBL is presented, and the congenericity of patent compounds in those different sources is discussed.
6
Members of our early career panel highlight key research articles on the theme of computer-aided drug design. Future Drug Discovery 2020. [DOI: 10.4155/fdd-2020-0026] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/17/2022]
7
Kruger F, Stiefl N, Landrum GA. rdScaffoldNetwork: The Scaffold Network Implementation in RDKit. J Chem Inf Model 2020; 60:3331-3335. [PMID: 32584031 DOI: 10.1021/acs.jcim.0c00296] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 01/03/2023]
Abstract
We present an implementation of the scaffold network in the open source cheminformatics toolkit RDKit. Scaffold networks have been introduced in the literature as a powerful method to navigate and analyze large screening data sets in medicinal chemistry. Such a network can be created by iteratively applying predefined fragmentation rules to the investigated set of small molecules and by linking the produced fragments according to their descendence. This procedure results in a network graph, where the nodes correspond to the fragments and the edges correspond to the operations producing one fragment from another. In extension to the scaffold network implementations suggested in the literature, the presented implementation in RDKit allows an enhanced flexibility in terms of customizing the fragmentation rules and enables the inclusion of atom- and bond-generic scaffolds into the network. The output, providing node and edge information on the network, enables a simple and elegant navigation through the network, laying the basis to organize and better understand the data set being investigated.
Affiliation(s)
- Franziska Kruger
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland
- Nikolaus Stiefl
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4002 Basel, Switzerland