1
Okada H, Maeda S. On Accelerating Substrate Optimization Using Computational Gibbs Energy Barriers: A Numerical Consideration Utilizing a Computational Data Set. ACS Omega 2024; 9:7123-7131. [PMID: 38371820 PMCID: PMC10870292 DOI: 10.1021/acsomega.3c09066] [Received: 11/14/2023] [Revised: 01/05/2024] [Accepted: 01/16/2024] [Indexed: 02/20/2024]
Abstract
Substrate optimization is a time- and resource-consuming step in organic synthesis. Recent advances in chemo- and materials-informatics provide systematic and efficient procedures utilizing tools such as Bayesian optimization (BO). This study explores the possibility of further reducing the number of required experiments by utilizing computational Gibbs energy barriers. To thoroughly validate the impact of using computational Gibbs energy barriers in BO-assisted substrate optimization, this study employs a computational Gibbs energy barrier data set from the literature and performs an extensive numerical investigation, treating the Gibbs energy barriers as virtual experimental results and those with added systematic and random noise as virtual computational results. The investigation shows that even computational reactivity data affected by noise of as much as 20 kJ/mol help reduce the number of required experiments.
Affiliation(s)
- Hiroaki Okada
  - Graduate School of Chemical Sciences and Engineering, Hokkaido University, Sapporo, Hokkaido 060-8628, Japan
- Satoshi Maeda
  - Department of Chemistry, Graduate School of Science, Hokkaido University, Sapporo, Hokkaido 060-0810, Japan
  - Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Sapporo, Hokkaido 001-0021, Japan
  - ERATO Maeda Artificial Intelligence for Chemical Reaction Design and Discovery Project, Hokkaido University, Sapporo, Hokkaido 060-0810, Japan
  - Research and Services Division of Materials Data and Integrated System (MaDIS), National Institute for Materials Science (NIMS), Tsukuba, Ibaraki 305-0044, Japan
2
Colliandre L, Muller C. Bayesian Optimization in Drug Discovery. Methods Mol Biol 2024; 2716:101-136. [PMID: 37702937 DOI: 10.1007/978-1-0716-3449-3_5] [Indexed: 09/14/2023]
Abstract
Drug discovery deals with the search for initial hits and their optimization toward a targeted clinical profile. Throughout the discovery pipeline, the candidate profile will evolve, but the optimization will mainly remain a trial-and-error approach. Many in silico methods have been developed to improve and speed up this pipeline. Bayesian optimization (BO) is a well-known method for determining the global optimum of a function. In the last decade, BO has gained popularity in the early drug design phase. This chapter starts with the concept of black box optimization applied to drug design and presents some approaches to tackle it. Then it focuses on BO and explains its principle and all the algorithmic building blocks needed to implement it. This explanation aims to be accessible to people involved in drug discovery projects. A strong emphasis is placed on solutions that deal with the specific constraints of drug discovery. Finally, a large set of practical applications of BO is highlighted.
3
Folmsbee D, Koes DR, Hutchison GR. Systematic Comparison of Experimental Crystallographic Geometries and Gas-Phase Computed Conformers for Torsion Preferences. J Chem Inf Model 2023; 63:7401-7411. [PMID: 38000780 PMCID: PMC10716907 DOI: 10.1021/acs.jcim.3c01278] [Received: 08/11/2023] [Revised: 11/07/2023] [Accepted: 11/13/2023] [Indexed: 11/26/2023]
Abstract
We performed exhaustive torsion sampling on more than 3 million compounds using the GFN2-xTB method and performed a comparison of experimental crystallographic and gas-phase conformers. Many conformer sampling methods derive torsional angle distributions from experimental crystallographic data, limiting the torsion preferences to molecules that must be stable, synthetically accessible, and able to be crystallized. In this work, we evaluate the differences in torsional preferences of experimental crystallographic geometries and gas-phase computed conformers from a broad selection of compounds to determine whether torsional angle distributions obtained from semiempirical methods are suitable priors for conformer sampling. We find that differences in torsion preferences can be mostly attributed to a lack of available experimental crystallographic data with small deviations derived from gas-phase geometry differences. GFN2 demonstrates the ability to provide accurate and reliable torsional preferences that can provide a basis for new methods free from the limitations of experimental data collection. We provide Gaussian-based fits and sampling distributions suitable for torsion sampling and propose an alternative to the widely used "experimental torsion and knowledge distance geometry" (ETKDG) method using quantum torsion-derived distance geometry (QTDG) methods.
Affiliation(s)
- Dakota L. Folmsbee
  - Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
  - Department of Anesthesiology & Perioperative Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
- David R. Koes
  - Department of Computational & Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Geoffrey R. Hutchison
  - Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
  - Department of Chemical & Petroleum Engineering, University of Pittsburgh, 3700 O’Hara Street, Pittsburgh, Pennsylvania 15261, United States
4
McNutt A, Bisiriyu F, Song S, Vyas A, Hutchison GR, Koes DR. Conformer Generation for Structure-Based Drug Design: How Many and How Good? J Chem Inf Model 2023; 63:6598-6607. [PMID: 37903507 PMCID: PMC10647020 DOI: 10.1021/acs.jcim.3c01245] [Received: 08/07/2023] [Revised: 10/18/2023] [Accepted: 10/19/2023] [Indexed: 11/01/2023]
Abstract
Conformer generation, the assignment of realistic 3D coordinates to a small molecule, is fundamental to structure-based drug design. Conformational ensembles are required for rigid-body matching algorithms, such as shape-based or pharmacophore approaches, and even methods that treat the ligand flexibly, such as docking, are dependent on the quality of the provided conformations due to not sampling all degrees of freedom (e.g., only sampling torsions). Here, we empirically elucidate some general principles about the size, diversity, and quality of the conformational ensembles needed to get the best performance in common structure-based drug discovery tasks. In many cases, our findings may parallel "common knowledge" well-known to practitioners of the field. Nonetheless, we feel that it is valuable to quantify these conformational effects while reproducing and expanding upon previous studies. Specifically, we investigate the performance of a state-of-the-art generative deep learning approach versus a more classical geometry-based approach, the effect of energy minimization as a postprocessing step, the effect of ensemble size (maximum number of conformers), and construction (filtering by root-mean-square deviation for diversity) and how these choices influence the ability to recapitulate bioactive conformations and perform pharmacophore screening and molecular docking.
Affiliation(s)
- Andrew T. McNutt
  - Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, United States
- Fatimah Bisiriyu
  - The Neighborhood Academy, Pittsburgh, Pennsylvania 15206, United States
- Sophia Song
  - Upper St. Clair High School, Pittsburgh, Pennsylvania 15241, United States
- Ananya Vyas
  - Taylor Allderdice High School, Pittsburgh, Pennsylvania 15217, United States
- Geoffrey R. Hutchison
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, United States
  - Department of Chemical and Petroleum Engineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, United States
- David Ryan Koes
  - Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, United States
5
Andreadi N, Zankov D, Karpov K, Mitrofanov A. Tree Parzen estimator for global geometry optimization: A benchmark and database of experimental gas-phase structures of organic molecules. J Comput Chem 2022; 43:1434-1441. [PMID: 35678223 DOI: 10.1002/jcc.26947] [Received: 02/07/2022] [Revised: 05/23/2022] [Accepted: 05/25/2022] [Indexed: 11/07/2022]
Abstract
Finding global and local minima on the potential energy surface is a key task for most studies in computational chemistry. Given a set of possible conformations for chemical structures and their corresponding energies, one can judge their chemical activity, understand the mechanisms of reactions, describe the formation of metal-ligand and ligand-protein complexes, and so forth. Although interest in minima search algorithms in computational chemistry arose long ago (during the formation of the field), new methods are still emerging. These methods make it possible to perform conformational analysis and geometry optimization faster, more accurately, or for more specific tasks. This article presents the application of a novel global geometry optimization approach based on the Tree Parzen Estimator method. For benchmarking, a database of small organic molecule geometries in the global minimum conformation was created, as well as a software package to perform the tests.
Affiliation(s)
- Nikolai Andreadi
  - Department of Chemistry, Lomonosov Moscow State University, Moscow, Russia
- Dmitry Zankov
  - Science Data Software, LLC, Rockville, Maryland, USA
- Kirill Karpov
  - Department of Chemistry, Lomonosov Moscow State University, Moscow, Russia
  - Science Data Software, LLC, Rockville, Maryland, USA
- Artem Mitrofanov
  - Department of Chemistry, Lomonosov Moscow State University, Moscow, Russia
  - Science Data Software, LLC, Rockville, Maryland, USA
6
Steiner M, Reiher M. Autonomous Reaction Network Exploration in Homogeneous and Heterogeneous Catalysis. Top Catal 2022; 65:6-39. [PMID: 35185305 PMCID: PMC8816766 DOI: 10.1007/s11244-021-01543-9] [Accepted: 11/17/2021] [Indexed: 12/11/2022]
Abstract
Autonomous computations that rely on automated reaction network elucidation algorithms may pave the way to put computational catalysis on a par with experimental research in the field. Several advantages of this approach are key to catalysis: (i) Automation allows one to consider orders of magnitude more structures in a systematic and open-ended fashion than would be accessible by manual inspection. Eventually, full resolution in terms of structural varieties and conformations as well as with respect to the type and number of potentially important elementary reaction steps (including decomposition reactions that determine turnover numbers) may be achieved. (ii) Fast electronic structure methods with uncertainty quantification warrant high efficiency and reliability in order to not only deliver results quickly, but also to allow for predictive work. (iii) A high degree of autonomy reduces the amount of manual human work, processing errors, and human bias. Although inherently unbiased, the approach is still steerable with respect to specific regions of an emerging network and with respect to the addition of new reactant species. This allows for a high fidelity of the formalization of some catalytic process and for surprising in silico discoveries. In this work, we first review the state of the art in computational catalysis to embed autonomous explorations into the general field from which it draws its ingredients. We then elaborate on the specific conceptual issues that arise in the context of autonomous computational procedures, some of which we discuss using an example catalytic system.
Affiliation(s)
- Miguel Steiner
  - Laboratory of Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Markus Reiher
  - Laboratory of Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
7
Ferro-Costas D, Mosquera-Lois I, Fernández-Ramos A. TorsiFlex: an automatic generator of torsional conformers. Application to the twenty proteinogenic amino acids. J Cheminform 2021; 13:100. [PMID: 34952644 PMCID: PMC8710030 DOI: 10.1186/s13321-021-00578-0] [Received: 04/15/2021] [Accepted: 12/08/2021] [Indexed: 11/10/2022]
Abstract
In this work, we introduce TorsiFlex, user-friendly software written in Python 3 and designed to find all the torsional conformers of flexible acyclic molecules in an automatic fashion. For the mapping of the torsional potential energy surface, the algorithm implemented in TorsiFlex combines two searching strategies: preconditioned and stochastic. The former is a type of systematic search based on chemical knowledge and should be carried out before the stochastic (random) search. The algorithm applies several validation tests to accelerate the exploration of the torsional space. For instance, the optimized structures are stored and this information is used to prevent revisiting these points and their surroundings in future iterations. TorsiFlex operates with a dual-level strategy by which the initial search is carried out at an inexpensive electronic structure level of theory and the located conformers are reoptimized at a higher level. Additionally, the program takes advantage of conformational enantiomerism, when possible. As a case study, and in order to exemplify the effectiveness and capabilities of this program, we have employed TorsiFlex to locate the conformers of the twenty proteinogenic amino acids in their neutral canonical form. TorsiFlex has produced roughly twice as many conformers as the most complete work to date.
Affiliation(s)
- David Ferro-Costas
  - Centro Singular de Investigación en Química Biolóxica e Materiais Moleculares (CIQUS), Universidade de Santiago de Compostela, 15782, Santiago de Compostela, Spain
- Irea Mosquera-Lois
  - Centro Singular de Investigación en Química Biolóxica e Materiais Moleculares (CIQUS), Universidade de Santiago de Compostela, 15782, Santiago de Compostela, Spain
- Antonio Fernández-Ramos
  - Centro Singular de Investigación en Química Biolóxica e Materiais Moleculares (CIQUS), Universidade de Santiago de Compostela, 15782, Santiago de Compostela, Spain
8
Immel S, Köck M, Reggelin M. NMR-Based Configurational Assignments of Natural Products: Gibbs Sampling and Bayesian Inference Using Floating Chirality Distance Geometry Calculations. Mar Drugs 2021; 20:14. [PMID: 35049868 PMCID: PMC8781118 DOI: 10.3390/md20010014] [Received: 11/24/2021] [Revised: 12/12/2021] [Accepted: 12/20/2021] [Indexed: 02/07/2023]
Abstract
Floating chirality restrained distance geometry (fc-rDG) calculations are used to directly evolve structures from NMR data such as NOE-derived intramolecular distances or anisotropic residual dipolar couplings (RDCs). In contrast to evaluating pre-calculated structures against NMR restraints, multiple configurations (diastereomers) and conformations are generated automatically within the experimental limits. In this report, we show that the "unphysical" rDG pseudo energies defined from NMR violations bear statistical significance, which allows assigning probabilities to configurational assignments made that are fully compatible with the method of Bayesian inference. These "diastereomeric differentiabilities" then even become almost independent of the actual values of the force constants used to model the restraints originating from NOE or RDC data.
Affiliation(s)
- Stefan Immel
  - Clemens-Schöpf-Institut für Organische Chemie und Biochemie, Technische Universität Darmstadt, Alarich-Weiss-Straße 4, 64287 Darmstadt, Germany
- Matthias Köck
  - Alfred-Wegener-Institut für Polar-und Meeresforschung in der Helmholtz-Gemeinschaft, Am Handelshafen 12, 27570 Bremerhaven, Germany
- Michael Reggelin
  - Clemens-Schöpf-Institut für Organische Chemie und Biochemie, Technische Universität Darmstadt, Alarich-Weiss-Straße 4, 64287 Darmstadt, Germany
9
Chan L, Morris GM, Hutchison GR. Understanding Conformational Entropy in Small Molecules. J Chem Theory Comput 2021; 17:2099-2106. [DOI: 10.1021/acs.jctc.0c01213] [Indexed: 12/16/2022]
Affiliation(s)
- Lucian Chan
  - Department of Statistics, University of Oxford, 24-29 St Giles’, Oxford OX1 3LB, U.K.
- Garrett M. Morris
  - Department of Statistics, University of Oxford, 24-29 St Giles’, Oxford OX1 3LB, U.K.
- Geoffrey R. Hutchison
  - Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
  - Department of Chemical and Petroleum Engineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
10
Nielson FF, Colby SM, Thomas DG, Renslow RS, Metz TO. Exploring the Impacts of Conformer Selection Methods on Ion Mobility Collision Cross Section Predictions. Anal Chem 2021; 93:3830-3838. [PMID: 33606495 DOI: 10.1021/acs.analchem.0c04341] [Indexed: 11/28/2022]
Abstract
The prediction of structure-dependent molecular properties, such as collision cross sections as measured using ion mobility spectrometry, is crucially dependent on the selection of the correct population of molecular conformers. Here, we report an in-depth evaluation of multiple conformation selection techniques, including simple averaging, Boltzmann weighting, lowest energy selection, low energy threshold reductions, and similarity reduction. Generating 50 000 conformers each for 18 molecules, we used the In Silico Chemical Library Engine (ISiCLE) to calculate the collision cross sections for the entire data set. First, we employed Monte Carlo simulations to understand the variability between conformer structures as generated using simulated annealing. Then we applied Monte Carlo simulations to the aforementioned conformer selection techniques as applied to the simulated molecular property: the ion mobility collision cross section. Based on our analyses, we found Boltzmann weighting to be a good trade-off between precision and theoretical accuracy. Combining multiple techniques revealed that energy thresholds and root-mean-squared deviation-based similarity reductions can save considerable computational expense while maintaining property prediction accuracy. Molecular dynamics conformer generation tools like AMBER can continue to generate new lowest energy conformers even after tens of thousands of generations, decreasing precision between runs. This reduced precision can be ameliorated and theoretical accuracy increased by running density functional theory geometry optimization on carefully selected conformers.
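The Boltzmann weighting mentioned in this abstract can be illustrated with a minimal sketch. This is the textbook formula, not the ISiCLE implementation; the function names, the kJ/mol units, and the default temperature of 298.15 K are assumptions made for illustration only.

```python
import math

# Gas constant in kJ/(mol*K); energies are assumed to be in kJ/mol.
R_KJ_PER_MOL_K = 8.314462618e-3


def boltzmann_weights(energies_kj_mol, temperature_k=298.15):
    """Return normalized Boltzmann weights for a list of conformer energies."""
    # Shift by the minimum energy so the exponentials stay numerically stable.
    e_min = min(energies_kj_mol)
    factors = [
        math.exp(-(e - e_min) / (R_KJ_PER_MOL_K * temperature_k))
        for e in energies_kj_mol
    ]
    total = sum(factors)
    return [f / total for f in factors]


def boltzmann_average(values, energies_kj_mol, temperature_k=298.15):
    """Boltzmann-weighted average of a per-conformer property (e.g. a CCS)."""
    weights = boltzmann_weights(energies_kj_mol, temperature_k)
    return sum(w * v for w, v in zip(weights, values))
```

Lower-energy conformers receive larger weights, so the averaged property is dominated by the most populated structures rather than by a single lowest-energy conformer or an unweighted mean.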
Affiliation(s)
- Felicity F Nielson
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington, United States
- Sean M Colby
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington, United States
- Dennis G Thomas
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington, United States
- Ryan S Renslow
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington, United States
- Thomas O Metz
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington, United States