1
Folmsbee D, Koes DR, Hutchison GR. Systematic Comparison of Experimental Crystallographic Geometries and Gas-Phase Computed Conformers for Torsion Preferences. J Chem Inf Model 2023; 63:7401-7411. [PMID: 38000780 PMCID: PMC10716907 DOI: 10.1021/acs.jcim.3c01278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 11/07/2023] [Accepted: 11/13/2023] [Indexed: 11/26/2023]
Abstract
We performed exhaustive torsion sampling on more than 3 million compounds using the GFN2-xTB method and performed a comparison of experimental crystallographic and gas-phase conformers. Many conformer sampling methods derive torsional angle distributions from experimental crystallographic data, limiting the torsion preferences to molecules that must be stable, synthetically accessible, and able to be crystallized. In this work, we evaluate the differences in torsional preferences of experimental crystallographic geometries and gas-phase computed conformers from a broad selection of compounds to determine whether torsional angle distributions obtained from semiempirical methods are suitable priors for conformer sampling. We find that differences in torsion preferences can be mostly attributed to a lack of available experimental crystallographic data with small deviations derived from gas-phase geometry differences. GFN2 demonstrates the ability to provide accurate and reliable torsional preferences that can provide a basis for new methods free from the limitations of experimental data collection. We provide Gaussian-based fits and sampling distributions suitable for torsion sampling and propose an alternative to the widely used "experimental torsion and knowledge distance geometry" (ETKDG) method using quantum torsion-derived distance geometry (QTDG) methods.
Affiliation(s)
- Dakota L. Folmsbee
  - Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
  - Department of Anesthesiology & Perioperative Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
- David R. Koes
  - Department of Computational & Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Geoffrey R. Hutchison
  - Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
  - Department of Chemical & Petroleum Engineering, University of Pittsburgh, 3700 O’Hara Street, Pittsburgh, Pennsylvania 15261, United States
2
Seidel T, Permann C, Wieder O, Kohlbacher SM, Langer T. High-Quality Conformer Generation with CONFORGE: Algorithm and Performance Assessment. J Chem Inf Model 2023; 63:5549-5570. [PMID: 37624145 PMCID: PMC10498443 DOI: 10.1021/acs.jcim.3c00563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Indexed: 08/26/2023]
Abstract
Knowledge of the putative bound-state conformation of a molecule is an essential prerequisite for the successful application of many computer-aided drug design methods that aim to assess or predict its capability to bind to a particular target receptor. An established approach to predict bioactive conformers in the absence of receptor structure information is to sample the low-energy conformational space of the investigated molecules and derive representative conformer ensembles that can be expected to comprise members closely resembling possible bound-state ligand conformations. The high relevance of such conformer generation functionality led to the development of a wide panel of dedicated commercial and open-source software tools throughout the last decades. Several published benchmarking studies have shown that open-source tools usually lag behind their commercial competitors in many key aspects. In this work, we introduce the open-source conformer ensemble generator CONFORGE, which aims at delivering state-of-the-art performance for all types of organic molecules in drug-like chemical space. The ability of CONFORGE and several well-known commercial and open-source conformer ensemble generators to reproduce experimental 3D structures as well as their computational efficiency and robustness has been assessed thoroughly for both typical drug-like molecules and macrocyclic structures. For small molecules, CONFORGE clearly outperformed all other tested open-source conformer generators and performed at least equally well as the evaluated commercial generators in terms of both processing speed and accuracy. In the case of macrocyclic structures, CONFORGE achieved the best average accuracy among all benchmarked generators, with RDKit's generator coming close in second place.
Affiliation(s)
- Thomas Seidel
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Christian Permann
  - NeGeMac Research Platform, Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Oliver Wieder
  - Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Stefan M. Kohlbacher
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Thierry Langer
  - Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
  - NeGeMac Research Platform, Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
  - Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
3
Bougueroua S, Bricage M, Aboulfath Y, Barth D, Gaigeot MP. Algorithmic Graph Theory, Reinforcement Learning and Game Theory in MD Simulations: From 3D Structures to Topological 2D-Molecular Graphs (2D-MolGraphs) and Vice Versa. Molecules 2023; 28:2892. [PMID: 37049654 PMCID: PMC10096312 DOI: 10.3390/molecules28072892] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/24/2023] [Revised: 03/17/2023] [Accepted: 03/18/2023] [Indexed: 04/14/2023] Open
Abstract
This paper reviews graph-theory-based methods that were recently developed in our group for post-processing molecular dynamics trajectories. We show that the use of algorithmic graph theory not only provides a direct and fast methodology to identify conformers sampled over time but also allows to follow the interconversions between the conformers through graphs of transitions in time. Examples of gas phase molecules and inhomogeneous aqueous solid interfaces are presented to demonstrate the power of topological 2D graphs and their versatility for post-processing molecular dynamics trajectories. An even more complex challenge is to predict 3D structures from topological 2D graphs. Our first attempts to tackle such a challenge are presented with the development of game theory and reinforcement learning methods for predicting the 3D structure of a gas-phase peptide.
Affiliation(s)
- Sana Bougueroua
  - Université Paris-Saclay, University Evry, CY Cergy Paris Université, CNRS, LAMBE UMR8587, 91025 Evry-Courcouronnes, France
- Marie Bricage
  - Université Paris-Saclay, University Versailles Saint Quentin, DAVID, 78000 Versailles, France
- Ylène Aboulfath
  - Université Paris-Saclay, University Versailles Saint Quentin, DAVID, 78000 Versailles, France
- Dominique Barth
  - Université Paris-Saclay, University Versailles Saint Quentin, DAVID, 78000 Versailles, France
- Marie-Pierre Gaigeot
  - Université Paris-Saclay, University Evry, CY Cergy Paris Université, CNRS, LAMBE UMR8587, 91025 Evry-Courcouronnes, France
4
Mansimov E, Mahmood O, Kang S, Cho K. Molecular Geometry Prediction using a Deep Generative Graph Neural Network. Sci Rep 2019; 9:20381. [PMID: 31892716 PMCID: PMC6938476 DOI: 10.1038/s41598-019-56773-5] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 06/10/2019] [Accepted: 12/16/2019] [Indexed: 11/25/2022] Open
Abstract
A molecule's geometry, also known as conformation, is one of a molecule's most important properties, determining the reactions it participates in, the bonds it forms, and the interactions it has with other molecules. Conventional conformation generation methods minimize hand-designed molecular force field energy functions that are often not well correlated with the true energy function of a molecule observed in nature. They generate geometrically diverse sets of conformations, some of which are very similar to the lowest-energy conformations and others of which are very different. In this paper, we propose a conditional deep generative graph neural network that learns an energy function by directly learning to generate molecular conformations that are energetically favorable and more likely to be observed experimentally in data-driven manner. On three large-scale datasets containing small molecules, we show that our method generates a set of conformations that on average is far more likely to be close to the corresponding reference conformations than are those obtained from conventional force field methods. Our method maintains geometrical diversity by generating conformations that are not too similar to each other, and is also computationally faster. We also show that our method can be used to provide initial coordinates for conventional force field methods. On one of the evaluated datasets we show that this combination allows us to combine the best of both methods, yielding generated conformations that are on average close to reference conformations with some very similar to reference conformations.
Affiliation(s)
- Elman Mansimov
  - Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, 60 5th Avenue, New York, New York, 10011, United States
- Omar Mahmood
  - Center for Data Science, New York University, 60 5th Avenue, New York, New York, 10011, United States
- Seokho Kang
  - Department of Systems Management Engineering, Sungkyunkwan University, 2066 Seobu-ro, Jangan-gu, Suwon, 16419, Republic of Korea
- Kyunghyun Cho
  - Department of Computer Science, Courant Institute of Mathematical Sciences, New York University, 60 5th Avenue, New York, New York, 10011, United States
  - Center for Data Science, New York University, 60 5th Avenue, New York, New York, 10011, United States
  - Facebook AI Research, 770 Broadway, New York, New York, 10003, United States
  - CIFAR Azrieli Global Scholar, Canadian Institute for Advanced Research, 661 University Avenue, Toronto, ON, M5G 1M1, Canada
5
Foscato M, Venkatraman V, Jensen VR. DENOPTIM: Software for Computational de Novo Design of Organic and Inorganic Molecules. J Chem Inf Model 2019; 59:4077-4082. [DOI: 10.1021/acs.jcim.9b00516] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 12/28/2022]
Affiliation(s)
- Marco Foscato
  - Department of Chemistry, University of Bergen, Allégaten 41, N-5007 Bergen, Norway
- Vishwesh Venkatraman
  - Department of Chemistry, Norwegian University of Science and Technology, N-7491 Trondheim, Norway
- Vidar R. Jensen
  - Department of Chemistry, University of Bergen, Allégaten 41, N-5007 Bergen, Norway
6
Vogiatzis KD, Polynski MV, Kirkland JK, Townsend J, Hashemi A, Liu C, Pidko EA. Computational Approach to Molecular Catalysis by 3d Transition Metals: Challenges and Opportunities. Chem Rev 2019; 119:2453-2523. [PMID: 30376310 PMCID: PMC6396130 DOI: 10.1021/acs.chemrev.8b00361] [Citation(s) in RCA: 222] [Impact Index Per Article: 44.4] [Received: 06/07/2018] [Indexed: 12/28/2022]
Abstract
Computational chemistry provides a versatile toolbox for studying mechanistic details of catalytic reactions and holds promise to deliver practical strategies to enable the rational in silico catalyst design. The versatile reactivity and nontrivial electronic structure effects, common for systems based on 3d transition metals, introduce additional complexity that may represent a particular challenge to the standard computational strategies. In this review, we discuss the challenges and capabilities of modern electronic structure methods for studying the reaction mechanisms promoted by 3d transition metal molecular catalysts. Particular focus will be placed on the ways of addressing the multiconfigurational problem in electronic structure calculations and the role of expert bias in the practical utilization of the available methods. The development of density functionals designed to address transition metals is also discussed. Special emphasis is placed on the methods that account for solvation effects and the multicomponent nature of practical catalytic systems. This is followed by an overview of recent computational studies addressing the mechanistic complexity of catalytic processes by molecular catalysts based on 3d metals. Cases that involve noninnocent ligands, multicomponent reaction systems, metal-ligand and metal-metal cooperativity, as well as modeling complex catalytic systems such as metal-organic frameworks are presented. Conventionally, computational studies on catalytic mechanisms are heavily dependent on the chemical intuition and expert input of the researcher. Recent developments in advanced automated methods for reaction path analysis hold promise for eliminating such human-bias from computational catalysis studies. A brief overview of these approaches is presented in the final section of the review. The paper is closed with general concluding remarks.
Affiliation(s)
- Justin K. Kirkland
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996, United States
- Jacob Townsend
  - Department of Chemistry, University of Tennessee, Knoxville, Tennessee 37996, United States
- Ali Hashemi
  - Inorganic Systems Engineering group, Department of Chemical Engineering, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, The Netherlands
- Chong Liu
  - Inorganic Systems Engineering group, Department of Chemical Engineering, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, The Netherlands
- Evgeny A. Pidko
  - TheoMAT group, ITMO University, Lomonosova 9, St. Petersburg 191002, Russia
  - Inorganic Systems Engineering group, Department of Chemical Engineering, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, The Netherlands
7
Collins CR, Gordon GJ, von Lilienfeld OA, Yaron DJ. Constant size descriptors for accurate machine learning models of molecular properties. J Chem Phys 2018; 148:241718. [DOI: 10.1063/1.5020441] [Citation(s) in RCA: 73] [Impact Index Per Article: 12.2] [Indexed: 01/07/2023] Open
Affiliation(s)
- Christopher R. Collins
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- Geoffrey J. Gordon
  - Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- O. Anatole von Lilienfeld
  - Department of Chemistry, Institute of Physical Chemistry and National Center for Computational Design and Discovery of Novel Materials (MARVEL), University of Basel, 4056 Basel, Switzerland
- David J. Yaron
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
8
Sadowski P, Fooshee D, Subrahmanya N, Baldi P. Synergies Between Quantum Mechanics and Machine Learning in Reaction Prediction. J Chem Inf Model 2016; 56:2125-2128. [PMID: 27749058 DOI: 10.1021/acs.jcim.6b00351] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Indexed: 11/28/2022]
Abstract
Machine learning (ML) and quantum mechanical (QM) methods can be used in two-way synergy to build chemical reaction expert systems. The proposed ML approach identifies electron sources and sinks among reactants and then ranks all source-sink pairs. This addresses a bottleneck of QM calculations by providing a prioritized list of mechanistic reaction steps. QM modeling can then be used to compute the transition states and activation energies of the top-ranked reactions, providing additional or improved examples of ranked source-sink pairs. Retraining the ML model closes the loop, producing more accurate predictions from a larger training set. The approach is demonstrated in detail using a small set of organic radical reactions.
Affiliation(s)
- Peter Sadowski
  - University of California, Irvine, Department of Computer Science, Irvine, California 92697, United States
- David Fooshee
  - University of California, Irvine, Department of Computer Science, Irvine, California 92697, United States
- Niranjan Subrahmanya
  - ExxonMobil Research and Engineering, Annandale, New Jersey 08801, United States
- Pierre Baldi
  - University of California, Irvine, Department of Computer Science, Irvine, California 92697, United States
9
Riniker S, Landrum GA. Better Informed Distance Geometry: Using What We Know To Improve Conformation Generation. J Chem Inf Model 2015; 55:2562-74. [DOI: 10.1021/acs.jcim.5b00654] [Citation(s) in RCA: 200] [Impact Index Per Article: 22.2] [Indexed: 11/30/2022]
Affiliation(s)
- Sereina Riniker
  - Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Gregory A. Landrum
  - Novartis Institutes for BioMedical Research, Novartis Pharma AG, Novartis Campus, 4056 Basel, Switzerland
10
Supady A, Blum V, Baldauf C. First-Principles Molecular Structure Search with a Genetic Algorithm. J Chem Inf Model 2015; 55:2338-48. [DOI: 10.1021/acs.jcim.5b00243] [Citation(s) in RCA: 68] [Impact Index Per Article: 7.6] [Indexed: 11/28/2022]
Affiliation(s)
- Adriana Supady
  - Fritz-Haber-Institut der Max-Planck-Gesellschaft, 14195 Berlin, Germany
- Volker Blum
  - Fritz-Haber-Institut der Max-Planck-Gesellschaft, 14195 Berlin, Germany
  - Department of Mechanical Engineering & Materials Science, Duke University, Durham, North Carolina 27708, United States
- Carsten Baldauf
  - Fritz-Haber-Institut der Max-Planck-Gesellschaft, 14195 Berlin, Germany
11
Bruno IJ, Groom CR. A crystallographic perspective on sharing data and knowledge. J Comput Aided Mol Des 2014; 28:1015-22. [PMID: 25091065 PMCID: PMC4196029 DOI: 10.1007/s10822-014-9780-9] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 04/15/2014] [Accepted: 07/17/2014] [Indexed: 11/24/2022]
Abstract
The crystallographic community is in many ways an exemplar of the benefits and practices of sharing data. Since the inception of the technique, virtually every published crystal structure has been made available to others. This has been achieved through the establishment of several specialist data centres, including the Cambridge Crystallographic Data Centre, which produces the Cambridge Structural Database. Containing curated structures of small organic molecules, some containing a metal, the database has been produced for almost 50 years. This has required the development of complex informatics tools and an environment allowing expert human curation. As importantly, a financial model has evolved which has, to date, ensured the sustainability of the resource. However, the opportunities afforded by technological changes and changing attitudes to sharing data make it an opportune moment to review current practices.
Affiliation(s)
- Ian J Bruno
  - The Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge, CB2 1EZ, UK
12
Foscato M, Venkatraman V, Occhipinti G, Alsberg BK, Jensen VR. Automated Building of Organometallic Complexes from 3D Fragments. J Chem Inf Model 2014; 54:1919-31. [DOI: 10.1021/ci5003153] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.3] [Indexed: 01/15/2023]
Affiliation(s)
- Marco Foscato
  - Department of Chemistry, University of Bergen, Allégaten 41, N-5007 Bergen, Norway
- Vishwesh Venkatraman
  - Department of Chemistry, Norwegian University of Science and Technology, Høgskoleringen 1, N-7491 Trondheim, Norway
- Giovanni Occhipinti
  - Department of Chemistry, University of Bergen, Allégaten 41, N-5007 Bergen, Norway
- Bjørn K. Alsberg
  - Department of Chemistry, Norwegian University of Science and Technology, Høgskoleringen 1, N-7491 Trondheim, Norway
- Vidar R. Jensen
  - Department of Chemistry, University of Bergen, Allégaten 41, N-5007 Bergen, Norway