1
Margraf JT. Neural graph distance embedding for molecular geometry generation. J Comput Chem 2024; 45:1784-1790. [PMID: 38655845 DOI: 10.1002/jcc.27349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 03/05/2024] [Accepted: 03/08/2024] [Indexed: 04/26/2024]
Abstract
This article introduces neural graph distance embedding (nGDE), a method for generating 3D molecular geometries. Leveraging a graph neural network trained on the OE62 dataset of molecular geometries, nGDE predicts interatomic distances based on molecular graphs. These distances are then used in multidimensional scaling to produce 3D geometries, which are subsequently refined with standard bioorganic force fields. The machine learning-based graph distance introduced herein is found to be an improvement over the conventional shortest path distances used in graph drawing. Comparative analysis with a state-of-the-art distance geometry method demonstrates nGDE's competitive performance, particularly its robustness in handling polycyclic molecules, which are a challenge for existing methods.
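The step from interatomic distances to 3D coordinates can be illustrated with classical (Torgerson) multidimensional scaling. This is a minimal sketch under stated assumptions: the abstract does not say which MDS variant nGDE uses, the function name `classical_mds` is hypothetical, and the distance matrix here comes from random toy coordinates rather than from a trained network.

```python
import numpy as np

def classical_mds(dist, dim=3):
    """Embed points in `dim` dimensions from a symmetric distance matrix."""
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Double-centre the squared distances to obtain a Gram matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # Keep the `dim` largest eigenpairs; clip small negative eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:dim]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)

# Toy input (assumption): exact distances from five random 3D points.
rng = np.random.default_rng(0)
coords = rng.normal(size=(5, 3))
dist = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)

embedded = classical_mds(dist)
recovered = np.linalg.norm(embedded[:, None] - embedded[None, :], axis=-1)
```

With exact distances the embedding reproduces the input distance matrix, but only up to rotation, translation, and reflection, so chirality is not determined by the distances alone. Predicted distances would in general be only approximately embeddable, which is one reason a subsequent force-field refinement is useful.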
Affiliation(s)
- Johannes T Margraf
  - Bavarian Center for Battery Technology (BayBatt), University of Bayreuth, Bayreuth, Germany
2
Laplaza R, Wodrich MD, Corminboeuf C. Overcoming the Pitfalls of Computing Reaction Selectivity from Ensembles of Transition States. J Phys Chem Lett 2024:7363-7370. [PMID: 38990895 DOI: 10.1021/acs.jpclett.4c01657] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/13/2024]
Abstract
The prediction of reaction selectivity is a challenging task for computational chemistry, not only because many molecules adopt multiple conformations but also due to the exponential relationship between effective activation energies and rate constants. To account for molecular flexibility, an increasing number of methods exist that generate conformational ensembles of transition state (TS) structures. Typically, these TS ensembles are Boltzmann weighted and used to compute selectivity assuming Curtin-Hammett conditions. This strategy, however, can lead to erroneous predictions if the appropriate filtering of the conformer ensembles is not conducted. Here, we demonstrate how any possible selectivity can be obtained by processing the same sets of TS ensembles for a model reaction. To address the burdensome filtering task in a consistent and automated way, we introduce marc, a tool for the modular analysis of representative conformers that aids in avoiding human errors while minimizing the number of reoptimization computations needed to obtain correct reaction selectivity.
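The Boltzmann-weighting step described above can be sketched as follows. This is an illustrative calculation under Curtin-Hammett assumptions, not the marc tool itself: the function names, the energies, and the duplicated-conformer scenario are all hypothetical, and free energies are assumed to be in kcal/mol.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)

def effective_barrier(energies, temperature=298.15):
    """Boltzmann-averaged free energy of a TS ensemble (kcal/mol)."""
    rt = R_KCAL * temperature
    g_min = min(energies)
    # Shift by the minimum for numerical stability before exponentiating.
    total = sum(math.exp(-(g - g_min) / rt) for g in energies)
    return g_min - rt * math.log(total)

def selectivity(ts_a, ts_b, temperature=298.15):
    """Fraction of product A from two competing TS ensembles."""
    rt = R_KCAL * temperature
    ddg = effective_barrier(ts_b, temperature) - effective_barrier(ts_a, temperature)
    ratio = math.exp(ddg / rt)  # rate ratio k_A / k_B
    return ratio / (1.0 + ratio)

# Hypothetical energies: two pathways with identical barriers.
balanced = selectivity([10.0], [10.0])
# Same chemistry, but pathway A's ensemble contains a duplicate conformer.
skewed = selectivity([10.0, 10.0], [10.0])
```

In this toy case a single unfiltered duplicate moves the predicted fraction of A from one half to two thirds, which is one simple way ensemble processing could change the computed selectivity.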
Affiliation(s)
- Ruben Laplaza
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
  - National Center for Competence in Research-Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Matthew D Wodrich
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
  - National Center for Competence in Research-Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Clemence Corminboeuf
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland
  - National Center for Competence in Research-Catalysis (NCCR-Catalysis), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
3
King NJ, LeBlanc ID, Brown A. A variant on the CREST iMTD algorithm for noncovalent clusters of flexible molecules. J Comput Chem 2024. [PMID: 38944673 DOI: 10.1002/jcc.27458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Revised: 05/15/2024] [Accepted: 06/12/2024] [Indexed: 07/01/2024]
Abstract
Conformational ensemble generation and the search for the global minimum conformation are important problems in computational chemistry. In this work, a variant on the conformer-rotamer ensemble sampling tool (CREST) iterative metadynamics (iMTD) algorithm designed for determining structural ensembles and energetics of noncovalent clusters of flexible molecules is presented. We term this new algorithm a low-energy diversity-enhanced variant on CREST, or LEDE-CREST. As with CREST, the energies are evaluated using the semiempirical GFN2-xTB extended tight binding approach. The utility of the algorithm is highlighted by generating ensembles for a variety of noncovalent clusters of flexible or rigid monomers using both CREST and LEDE-CREST.
Affiliation(s)
- Nathanael J King
  - Department of Chemistry, University of Alberta, Edmonton, Canada
- Ian D LeBlanc
  - Department of Computer Science, Grant MacEwan University, Edmonton, Canada
- Alex Brown
  - Department of Chemistry, University of Alberta, Edmonton, Canada
4
Ginex T, Vázquez J, Estarellas C, Luque FJ. Quantum mechanical-based strategies in drug discovery: Finding the pace to new challenges in drug design. Curr Opin Struct Biol 2024; 87:102870. [PMID: 38914031 DOI: 10.1016/j.sbi.2024.102870] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2024] [Revised: 06/02/2024] [Accepted: 06/04/2024] [Indexed: 06/26/2024]
Abstract
The expansion of the chemical space to tangible libraries containing billions of synthesizable molecules opens exciting opportunities for drug discovery, but also challenges the power of computer-aided drug design to prioritize the best candidates. This directly affects quantum mechanics (QM) methods, which provide chemically accurate properties but are limited to small systems. Preserving accuracy while optimizing the computational cost is at the heart of many efforts to develop high-quality, efficient QM-based strategies, reflected in refined algorithms and computational approaches. The design of QM-tailored physics-based force fields and the coupling of QM with machine learning, in conjunction with the computing performance of supercomputing resources, will enhance the ability to use these methods in drug discovery. The challenge is formidable, but we will undoubtedly see impressive advances that will define a new era.
Affiliation(s)
- Tiziana Ginex
  - Pharmacelera, Parc Científic de Barcelona (PCB), Baldiri Reixac 4-8, 08028 Barcelona, Spain
- Javier Vázquez
  - Pharmacelera, Parc Científic de Barcelona (PCB), Baldiri Reixac 4-8, 08028 Barcelona, Spain; Departament de Nutrició, Ciències de l'Alimentació i Gastronomia, Universitat de Barcelona, Institut de Biomedicina (IBUB), 08921 Santa Coloma de Gramenet, Spain; Institut de Biomedicina (IBUB), 08921 Santa Coloma de Gramenet, Spain
- Carolina Estarellas
  - Departament de Nutrició, Ciències de l'Alimentació i Gastronomia, Universitat de Barcelona, Institut de Biomedicina (IBUB), 08921 Santa Coloma de Gramenet, Spain; Institut de Química Teòrica i Computacional (IQTCUB), 08921 Santa Coloma de Gramenet, Spain
- F Javier Luque
  - Departament de Nutrició, Ciències de l'Alimentació i Gastronomia, Universitat de Barcelona, Institut de Biomedicina (IBUB), 08921 Santa Coloma de Gramenet, Spain; Institut de Biomedicina (IBUB), 08921 Santa Coloma de Gramenet, Spain; Institut de Química Teòrica i Computacional (IQTCUB), 08921 Santa Coloma de Gramenet, Spain
5
Mészáros BB, Kubicskó K, Németh DD, Daru J. Emerging Conformational-Analysis Protocols from the RTCONF55-16K Reaction Thermochemistry Conformational Benchmark Set. J Chem Theory Comput 2024. [PMID: 38899777 DOI: 10.1021/acs.jctc.4c00565] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/21/2024]
Abstract
RTCONF55-16K is a new, reactive conformational data set based on cost-efficient methods to assess different conformational analysis protocols. Our reference calculations underpinned the accuracy of the CENSO (Grimme et al. J. Phys. Chem. A, 2021, 125, 4039) procedure and resulted in alternative recipes with different cost-accuracy compromises. Our general-purpose and economical protocols (CENSO-light and zero, respectively) were found to be 10-30 times faster than the original algorithm, adding only 0.4-0.7 kcal/mol absolute error to the relative free energy estimates.
Affiliation(s)
- Bence Balázs Mészáros
  - Hevesy György PhD School of Chemistry, ELTE Eötvös Loránd University, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
  - Department of Organic Chemistry, ELTE Eötvös Loránd University, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
- Károly Kubicskó
  - Hevesy György PhD School of Chemistry, ELTE Eötvös Loránd University, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
  - Department of Organic Chemistry, ELTE Eötvös Loránd University, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
- Dávid Dorián Németh
  - Department of Organic Chemistry, ELTE Eötvös Loránd University, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
- János Daru
  - Department of Organic Chemistry, ELTE Eötvös Loránd University, Pázmány Péter sétány 1/A, 1117 Budapest, Hungary
6
Raush E, Abagyan R, Totrov M. Efficient Generation of Conformer Ensembles Using Internal Coordinates and a Generative Directional Graph Convolution Neural Network. J Chem Theory Comput 2024; 20:4054-4063. [PMID: 38669307 DOI: 10.1021/acs.jctc.4c00280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/28/2024]
Abstract
We present a neural-network-based high-throughput molecular conformer-generation algorithm. A chemical graph-convolutional network is trained to predict low-energy conformers in internal coordinate representation (bond lengths, bond, and torsion angles), starting from two-dimensional (2D) chemical topology. Generative neural network (NN) architecture performs denoising from torsion space, producing conformer ensembles with populations that are well correlated with torsion energy profiles. Short force-field-based energy minimization is applied to refine final conformers. All computation-intensive stages of the algorithm are GPU-optimized. The procedure (termed GINGER) is benchmarked on a commonly used test set of bioactive three-dimensional (3D) conformers from the PDB. We demonstrate highly competitive results in conformer recovery and throughput rates suitable for giga-scale compound library processing. A web server that allows interactive conformer ensemble generation by GINGER and their viewing is made freely available at https://www.molsoft.com/gingerdemo.html.
Affiliation(s)
- Eugene Raush
  - Molsoft L.L.C., 11199 Sorrento Valley Road, S209, San Diego, California 92121, United States
- Ruben Abagyan
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California 92093, United States
- Maxim Totrov
  - Molsoft L.L.C., 11199 Sorrento Valley Road, S209, San Diego, California 92121, United States
7
Ai H, Wu D, Zhou H, Xu J, Gu Q. dMXP: A De Novo Small-Molecule 3D Structure Predictor with Graph Attention Networks. J Chem Inf Model 2024; 64:3744-3755. [PMID: 38662925 DOI: 10.1021/acs.jcim.4c00391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/14/2024]
Abstract
Generating the three-dimensional (3D) structure of small molecules is crucial in both structure- and ligand-based drug design. Structure-based drug design needs bioactive conformations of compounds for lead identification and optimization. Ligand-based drug design techniques, such as 3D shape similarity search, 3D pharmacophore modeling, and 3D-QSAR, all require high-quality small-molecule ligand conformations to obtain reliable results. Although predicting a small molecule's bioactive conformer requires information from the receptor, a crystal structure of the molecule is a reasonable approximation to its bioactive conformer in a specific receptor because the binding pose of a small molecule in its receptor's binding pocket should be energetically close to the crystal structure. This study presents a de novo small-molecule structure predictor (dMXP) with graph attention networks based on crystal data derived from the Cambridge Structural Database (CSD) combined with molecular electrostatic information calculated by density-functional theory (DFT). Two featurization strategies (topological and atomic partial charge features) were employed to explore the relation between these features and the 3D crystal structure of a small molecule. These features were then assembled to construct the holistic 3D crystal structure of a molecule. Molecular graphs were encoded using a graph attention mechanism to deal with inconsistencies in how local substructures contribute to the entire molecular structure. For approximately 80% of dMXP-predicted structures, the root-mean-square deviation (RMSD) from the native binding pose within the receptor is less than 2.0 Å.
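The RMSD criterion quoted above can be made concrete with a short calculation. The sketch below assumes that structures are superimposed before the deviation is measured (the abstract does not state the alignment protocol) and uses the standard Kabsch algorithm for the optimal rotation; the function name `kabsch_rmsd` and the toy coordinates are hypothetical.

```python
import numpy as np

def kabsch_rmsd(p, q):
    """RMSD between two (N, 3) coordinate sets after optimal superposition."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Remove translation by centring both structures.
    p_c = p - p.mean(axis=0)
    q_c = q - q.mean(axis=0)
    # Optimal rotation from the SVD of the covariance matrix.
    u, _, vt = np.linalg.svd(p_c.T @ q_c)
    # Guard against improper rotations (reflections).
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    p_rot = p_c @ rot.T
    return float(np.sqrt(np.mean(np.sum((p_rot - q_c) ** 2, axis=1))))

# Toy data (assumption): a reference structure and a rotated, shifted copy.
rng = np.random.default_rng(1)
reference = rng.normal(size=(6, 3))
theta = 0.7
rot_z = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
moved = reference @ rot_z.T + np.array([1.0, 2.0, 3.0])

rmsd_same = kabsch_rmsd(moved, reference)
within_cutoff = rmsd_same < 2.0  # the 2.0 Å threshold from the abstract
```

A rigidly moved copy gives an RMSD of essentially zero, so any nonzero value reflects genuine conformational differences rather than placement. The sketch also assumes both coordinate sets list atoms in the same order.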
Affiliation(s)
- Haopeng Ai
  - Research Center for Drug Discovery, School of Pharmaceutical Sciences, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou 510006, China
- Deyin Wu
  - Research Center for Drug Discovery, School of Pharmaceutical Sciences, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou 510006, China
- Huihao Zhou
  - Research Center for Drug Discovery, School of Pharmaceutical Sciences, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou 510006, China
- Jun Xu
  - Research Center for Drug Discovery, School of Pharmaceutical Sciences, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou 510006, China
- Qiong Gu
  - Research Center for Drug Discovery, School of Pharmaceutical Sciences, Sun Yat-Sen University, 132 East Circle at University City, Guangzhou 510006, China
8
Williams DC, Inala N. Physics-Informed Generative Model for Drug-like Molecule Conformers. J Chem Inf Model 2024; 64:2988-3007. [PMID: 38486425 DOI: 10.1021/acs.jcim.3c01816] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
We present a diffusion-based generative model for conformer generation. Our model is focused on the reproduction of the bonded structure and is constructed from the associated terms traditionally found in classical force fields to ensure a physically relevant representation. Techniques in deep learning are used to infer atom typing and geometric parameters from a training set. Conformer sampling is achieved by taking advantage of recent advancements in diffusion-based generation. By training on large, synthetic data sets of diverse, drug-like molecules optimized with the semiempirical GFN2-xTB method, high accuracy is achieved for bonded parameters, exceeding that of conventional, knowledge-based methods. Results are also compared to experimental structures from the Protein Databank and the Cambridge Structural Database.
Affiliation(s)
- David C Williams
  - Nobias Therapeutics, Inc., 144 S Whisman Rd, Suite C, Mountain View, California 94041, United States
- Neil Inala
  - Nobias Therapeutics, Inc., 144 S Whisman Rd, Suite C, Mountain View, California 94041, United States
9
Pracht P, Grimme S, Bannwarth C, Bohle F, Ehlert S, Feldmann G, Gorges J, Müller M, Neudecker T, Plett C, Spicher S, Steinbach P, Wesołowski PA, Zeller F. CREST-A program for the exploration of low-energy molecular chemical space. J Chem Phys 2024; 160:114110. [PMID: 38511658 DOI: 10.1063/5.0197592] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2024] [Accepted: 02/29/2024] [Indexed: 03/22/2024] [Open Access]
Abstract
Conformer-rotamer sampling tool (CREST) is an open-source program for the efficient and automated exploration of molecular chemical space. Originally developed in Pracht et al. [Phys. Chem. Chem. Phys. 22, 7169 (2020)] as an automated driver for calculations at the extended tight-binding level (xTB), it offers a variety of molecular- and metadynamics simulations, geometry optimization, and molecular structure analysis capabilities. Implemented algorithms include automated procedures for conformational sampling, explicit solvation studies, the calculation of absolute molecular entropy, and the identification of molecular protonation and deprotonation sites. Calculations are set up to run concurrently, providing efficient single-node parallelization. CREST is designed to require minimal user input and comes with an implementation of the GFNn-xTB Hamiltonians and the GFN-FF force-field. Furthermore, interfaces to any quantum chemistry and force-field software can easily be created. In this article, we present recent developments in the CREST code and show a selection of applications for the most important features of the program. An important novelty is the refactored calculation backend, which provides significant speed-up for sampling of small or medium-sized drug molecules and allows for more sophisticated setups, for example, quantum mechanics/molecular mechanics and minimum energy crossing point calculations.
Affiliation(s)
- Philipp Pracht
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Stefan Grimme
  - Mulliken Center for Theoretical Chemistry, Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstr. 4, 53115 Bonn, Germany
- Christoph Bannwarth
  - Institute for Physical Chemistry, RWTH Aachen University, Melatener Str. 20, 52056 Aachen, Germany
- Fabian Bohle
  - Mulliken Center for Theoretical Chemistry, Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstr. 4, 53115 Bonn, Germany
- Sebastian Ehlert
  - AI4Science, Microsoft Research, Evert van de Beekstraat 354, 1118 CZ Schiphol, The Netherlands
- Gereon Feldmann
  - Institute for Physical Chemistry, RWTH Aachen University, Melatener Str. 20, 52056 Aachen, Germany
- Johannes Gorges
  - Mulliken Center for Theoretical Chemistry, Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstr. 4, 53115 Bonn, Germany
- Marcel Müller
  - Mulliken Center for Theoretical Chemistry, Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstr. 4, 53115 Bonn, Germany
- Tim Neudecker
  - Institute for Physical and Theoretical Chemistry, University of Bremen, 28359 Bremen, Germany
- Christoph Plett
  - Mulliken Center for Theoretical Chemistry, Institute for Physical and Theoretical Chemistry, University of Bonn, Beringstr. 4, 53115 Bonn, Germany
- Pit Steinbach
  - Institute for Physical Chemistry, RWTH Aachen University, Melatener Str. 20, 52056 Aachen, Germany
- Patryk A Wesołowski
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Felix Zeller
  - Institute for Physical and Theoretical Chemistry, University of Bremen, 28359 Bremen, Germany
10
Biyuzan H, Masrour MA, Grandmougin L, Payan F, Douguet D. SENSAAS-Flex: a joint optimization approach for aligning 3D shapes and exploring the molecular conformation space. Bioinformatics 2024; 40:btae105. [PMID: 38383065 PMCID: PMC10918633 DOI: 10.1093/bioinformatics/btae105] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 02/13/2024] [Accepted: 02/20/2024] [Indexed: 02/23/2024] [Open Access]
Abstract
Motivation: Popular shape-based alignment methods handle molecular flexibility by utilizing conformational ensembles to select the most fitted conformer. However, the initial conformer library generation step is computationally intensive and limiting to the overall alignment process. In this work, we describe a method to perform flexible alignment of two molecular shapes by optimizing the 3D conformation. SENSAAS-Flex, an add-on to the SENSAAS tool, is able to proceed from a limited set of initial conformers through an iterative process where additional conformational optimizations are made at the substructure level and constrained by the target shape.
Results: In self- and cross-alignment experiments, SENSAAS-Flex is able to reproduce the crystal structure geometry of ligands of the AstraZeneca Molecule Overlay Test set and PDBbind refined dataset. Our study shows that the point-based representation of molecular surfaces is appropriate in terms of shape constraint to sample the conformational space and perform flexible molecular alignments.
Availability and implementation: The documentation and source code are available at https://chemoinfo.ipmc.cnrs.fr/Sensaas-flex/sensaas-flex-main.tar.gz.
Affiliation(s)
- Hamza Biyuzan
  - Université Côte d’Azur, CNRS UMR7271, I3S, Sophia Antipolis 06900, France
- Lucas Grandmougin
  - Université Côte d’Azur, CNRS UMR7271, I3S, Sophia Antipolis 06900, France
- Frédéric Payan
  - Université Côte d’Azur, CNRS UMR7271, I3S, Sophia Antipolis 06900, France
- Dominique Douguet
  - Université Côte d’Azur, Inserm U1323, CNRS UMR7275, IPMC, Valbonne 06560, France
11
Das S, Merz KM. Molecular Gas-Phase Conformational Ensembles. J Chem Inf Model 2024; 64:749-760. [PMID: 38206321 DOI: 10.1021/acs.jcim.3c01309] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/12/2024]
Abstract
Accurately determining the global minima of a molecular structure is important in diverse scientific fields, including drug design, materials science, and chemical synthesis. Conformational search engines serve as valuable tools for exploring the extensive conformational space of molecules and for identifying energetically favorable conformations. In this study, we present a comparison of Auto3D, CREST, Balloon, and ETKDG (from RDKit), which are freely available conformational search engines, to evaluate their effectiveness in locating global minima. These engines employ distinct methodologies, including machine learning (ML) potential-based, semiempirical, and force field-based approaches. To validate these methods, we propose the use of collisional cross-section (CCS) values obtained from ion mobility-mass spectrometry studies. We hypothesize that experimental gas-phase CCS values can provide experimental evidence that we likely have the global minimum for a given molecule. To facilitate this effort, we used our gas-phase conformation library (GPCL) which currently consists of the full ensembles of 20 small molecules and can be used by the community to validate any conformational search engine. Further members of the GPCL can be readily created for any molecule of interest using our standard workflow used to compute CCS values, expanding the ability of the GPCL in validation exercises. These innovative validation techniques enhance our understanding of the conformational landscape and provide valuable insights into the performance of conformational generation engines. Our findings shed light on the strengths and limitations of each search engine, enabling informed decisions for their utilization in various scientific fields, where accurate molecular structure determination is crucial for understanding biological activity and designing targeted interventions. 
By facilitating the identification of reliable conformations, this study significantly contributes to enhancing the efficiency and accuracy of molecular structure determination, with particular focus on metabolite structure elucidation. The findings of this research also provide valuable insights for developing effective workflows for predicting the structures of unknown compounds with high precision.
Affiliation(s)
- Susanta Das
  - Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
- Kenneth M Merz
  - Department of Chemistry, Michigan State University, 578 S. Shaw Lane, East Lansing, Michigan 48824, United States
12
Gelžinytė E, Öeren M, Segall MD, Csányi G. Transferable Machine Learning Interatomic Potential for Bond Dissociation Energy Prediction of Drug-like Molecules. J Chem Theory Comput 2024; 20:164-177. [PMID: 38108269 PMCID: PMC10782450 DOI: 10.1021/acs.jctc.3c00710] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/28/2023] [Revised: 11/30/2023] [Accepted: 11/30/2023] [Indexed: 12/19/2023]
Abstract
We present a transferable MACE interatomic potential that is applicable to open- and closed-shell drug-like molecules containing hydrogen, carbon, and oxygen atoms. Including an accurate description of radical species extends the scope of possible applications to bond dissociation energy (BDE) prediction, for example, in the context of cytochrome P450 (CYP) metabolism. The transferability of the MACE potential was validated on the COMP6 data set, containing only closed-shell molecules, where it reaches better accuracy than the readily available general ANI-2x potential. MACE achieves similar accuracy on two CYP metabolism-specific data sets, which include open- and closed-shell structures. This model enables us to calculate the aliphatic C-H BDE, which allows us to compare reaction energies of hydrogen abstraction, which is the rate-limiting step of the aliphatic hydroxylation reaction catalyzed by CYPs. On the "CYP 3A4" data set, MACE achieves a BDE RMSE of 1.37 kcal/mol and better prediction of BDE ranks than alternatives: the semiempirical AM1 and GFN2-xTB methods and the ALFABET model that directly predicts bond dissociation enthalpies. Finally, we highlight the smoothness of the MACE potential over paths of sp3 C-H bond elongation and show that a minimal extension is enough for the MACE model to start finding reasonable minimum energy paths of methoxy radical-mediated hydrogen abstraction. Altogether, this work lays the ground for further extensions of scope in terms of chemical elements, (CYP-mediated) reaction classes and modeling the full reaction paths, not only BDEs.
Affiliation(s)
- Elena Gelžinytė
  - Engineering Laboratory, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, U.K.
- Mario Öeren
  - Optibrium Limited, Cambridge Innovation Park, Denny End Road, Cambridge CB25 9GL, U.K.
- Matthew D. Segall
  - Optibrium Limited, Cambridge Innovation Park, Denny End Road, Cambridge CB25 9GL, U.K.
- Gábor Csányi
  - Engineering Laboratory, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, U.K.
13
Poongavanam V, Wieske LHE, Peintner S, Erdélyi M, Kihlberg J. Molecular chameleons in drug discovery. Nat Rev Chem 2024; 8:45-60. [PMID: 38123688 DOI: 10.1038/s41570-023-00563-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 11/16/2023] [Indexed: 12/23/2023]
Abstract
Molecular chameleons possess a flexibility that allows them to dynamically shield or expose polar functionalities in response to the properties of the environment. Although the concept of molecular chameleons was introduced as early as 1970, interest in them has grown considerably since the 2010s, as drug discovery has focused increasingly on new chemical modalities. Such modalities include cyclic peptides, macrocycles and proteolysis-targeting chimeras, all of which reside in a chemical space far from that of traditional small-molecule drugs. Both cell permeability and aqueous solubility are required for the oral absorption of drugs. Engineering these properties, and potent target binding, into the larger new modalities is a more daunting task than for traditional small-molecule drugs. The ability of chameleons to adapt to different environments may be essential for success. In this Review, we provide both general and theoretical insights into the realm of molecular chameleons. We discuss why chameleons have come into fashion and provide a do-it-yourself toolbox for their design; we then provide a glimpse of how advanced in silico methods can support molecular chameleon design.
Affiliation(s)
- Stefan Peintner
  - Department of Chemistry - BMC, Uppsala University, Uppsala, Sweden
- Máté Erdélyi
  - Department of Chemistry - BMC, Uppsala University, Uppsala, Sweden
- Jan Kihlberg
  - Department of Chemistry - BMC, Uppsala University, Uppsala, Sweden
14
Baillif B, Cole J, Giangreco I, McCabe P, Bender A. Applying atomistic neural networks to bias conformer ensembles towards bioactive-like conformations. J Cheminform 2023; 15:124. [PMID: 38129933 PMCID: PMC10740246 DOI: 10.1186/s13321-023-00794-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Accepted: 12/10/2023] [Indexed: 12/23/2023] [Open Access]
Abstract
Identifying bioactive conformations of small molecules is an essential process for virtual screening applications relying on three-dimensional structure such as molecular docking. For most small molecules, conformer generators retrieve at least one bioactive-like conformation, with an atomic root-mean-square deviation (ARMSD) lower than 1 Å, among the set of low-energy conformers generated. However, there is currently no general method to prioritise these likely target-bound conformations in the ensemble. In this work, we trained atomistic neural networks (AtNNs) on 3D information of generated conformers of a curated subset of PDBbind ligands to predict the ARMSD to their closest bioactive conformation, and evaluated the early enrichment of bioactive-like conformations when ranking conformers by AtNN prediction. AtNN ranking was compared with bioactivity-unaware baselines such as ascending Sage force field energy ranking, and a slower bioactivity-based baseline ranking by ascending Torsion Fingerprint Deviation to the Maximum Common Substructure to the most similar molecule in the training set (TFD2SimRefMCS). On test sets from random ligand splits of PDBbind, ranking conformers using ComENet, the AtNN encoding the most 3D information, leads to early enrichment of bioactive-like conformations with a median BEDROC of 0.29 ± 0.02, outperforming the best bioactivity-unaware Sage energy ranking baseline (median BEDROC of 0.18 ± 0.02), and performing on a par with the bioactivity-based TFD2SimRefMCS baseline (median BEDROC of 0.31 ± 0.02). The improved performance of the AtNN and TFD2SimRefMCS baseline is mostly observed on test set ligands that bind proteins similar to proteins observed in the training set. On a more challenging subset of flexible molecules, the bioactivity-unaware baselines showed median BEDROCs up to 0.02, while AtNNs and TFD2SimRefMCS showed median BEDROCs between 0.09 and 0.13. 
When performing rigid ligand re-docking of PDBbind ligands with GOLD using the 1% top-ranked conformers, ComENet ranked conformers showed a higher successful docking rate than bioactivity-unaware baselines, with a rate of 0.48 ± 0.02 compared to CSD probability baseline with a rate of 0.39 ± 0.02. Similarly, on a pharmacophore searching experiment, selecting the 20% top-ranked conformers ranked by ComENet showed higher hit rate compared to baselines. Hence, the approach presented here uses AtNNs successfully to focus conformer ensembles towards bioactive-like conformations, representing an opportunity to reduce computational expense in virtual screening applications on known targets that require input conformations.
Affiliation(s)
- Benoit Baillif
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Rd, Cambridge, CB2 1EW, UK
- Jason Cole
  - Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge, CB2 1EZ, UK
- Ilenia Giangreco
  - Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge, CB2 1EZ, UK
  - Exscientia plc, The Schrödinger Building, Oxford Science Park, Oxford, OX4 4GE, UK
- Patrick McCabe
  - Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge, CB2 1EZ, UK
- Andreas Bender
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Rd, Cambridge, CB2 1EW, UK
|
15
|
Folmsbee D, Koes DR, Hutchison GR. Systematic Comparison of Experimental Crystallographic Geometries and Gas-Phase Computed Conformers for Torsion Preferences. J Chem Inf Model 2023; 63:7401-7411. [PMID: 38000780 PMCID: PMC10716907 DOI: 10.1021/acs.jcim.3c01278] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 11/07/2023] [Accepted: 11/13/2023] [Indexed: 11/26/2023]
Abstract
We performed exhaustive torsion sampling on more than 3 million compounds using the GFN2-xTB method and performed a comparison of experimental crystallographic and gas-phase conformers. Many conformer sampling methods derive torsional angle distributions from experimental crystallographic data, limiting the torsion preferences to molecules that must be stable, synthetically accessible, and able to be crystallized. In this work, we evaluate the differences in torsional preferences of experimental crystallographic geometries and gas-phase computed conformers from a broad selection of compounds to determine whether torsional angle distributions obtained from semiempirical methods are suitable priors for conformer sampling. We find that differences in torsion preferences can be mostly attributed to a lack of available experimental crystallographic data with small deviations derived from gas-phase geometry differences. GFN2 demonstrates the ability to provide accurate and reliable torsional preferences that can provide a basis for new methods free from the limitations of experimental data collection. We provide Gaussian-based fits and sampling distributions suitable for torsion sampling and propose an alternative to the widely used "experimental torsion and knowledge distance geometry" (ETKDG) method using quantum torsion-derived distance geometry (QTDG) methods.
Affiliation(s)
- Dakota L. Folmsbee
  - Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
  - Department of Anesthesiology & Perioperative Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
- David R. Koes
  - Department of Computational & Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Geoffrey R. Hutchison
  - Department of Chemistry, University of Pittsburgh, 219 Parkman Avenue, Pittsburgh, Pennsylvania 15260, United States
  - Department of Chemical & Petroleum Engineering, University of Pittsburgh, 3700 O’Hara Street, Pittsburgh, Pennsylvania 15261, United States
|
16
|
Teng C, Huang D, Donahue E, Bao JL. Exploring torsional conformer space with physical prior mean function-driven meta-Gaussian processes. J Chem Phys 2023; 159:214111. [PMID: 38051097 DOI: 10.1063/5.0176709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Accepted: 11/12/2023] [Indexed: 12/07/2023]
Abstract
We present a novel approach for systematically exploring the conformational space of small molecules with multiple internal torsions. Identifying unique conformers through a systematic conformational search is important for obtaining accurate thermodynamic functions (e.g., free energy), encompassing contributions from the ensemble of all local minima. Traditional geometry optimizers focus on one structure at a time, lacking transferability from the local potential-energy surface (PES) around a specific minimum to optimize other conformers. In this work, we introduce a physics-driven meta-Gaussian processes (meta-GPs) method that not only enables efficient exploration of target PES for locating local minima but, critically, incorporates physical surrogates that can be applied universally across the optimization of all conformers of the same molecule. Meta-GPs construct surrogate PESs based on the optimization history of prior conformers, dynamically selecting the most suitable prior mean function (representing prior knowledge in Bayesian learning) as a function of the optimization progress. We systematically benchmarked the performance of multiple GP variants for brute-force conformational search of amino acids. Our findings highlight the superior performance of meta-GPs in terms of efficiency, comprehensiveness of conformer discovery, and the distribution of conformers compared to conventional non-surrogate optimizers and other non-meta-GPs. Furthermore, we demonstrate that by concurrently optimizing, training GPs on the fly, and learning PESs, meta-GPs exhibit the capacity to generate high-quality PESs in the torsional space without extensive training data. This represents a promising avenue for physics-based transfer learning via meta-GPs with adaptive priors in exploring torsional conformer space.
Affiliation(s)
- Chong Teng
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, USA
- Daniel Huang
  - Department of Computer Science, San Francisco State University, San Francisco, California 94132, USA
- Elizabeth Donahue
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, USA
- Junwei Lucas Bao
  - Department of Chemistry, Boston College, Chestnut Hill, Massachusetts 02467, USA
|
17
|
Stylianakis I, Zervos N, Lii JH, Pantazis DA, Kolocouris A. Conformational energies of reference organic molecules: benchmarking of common efficient computational methods against coupled cluster theory. J Comput Aided Mol Des 2023; 37:607-656. [PMID: 37597063 PMCID: PMC10618395 DOI: 10.1007/s10822-023-00513-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 06/03/2023] [Indexed: 08/21/2023]
Abstract
We selected 145 reference organic molecules that include model fragments used in computer-aided drug design. We calculated 158 conformational energies and barriers using force fields, with wide applicability in commercial and free software and extensive application on the calculation of conformational energies of organic molecules, e.g. the UFF and DREIDING force fields, the Allinger's force fields MM3-96, MM3-00, MM4-8, the MM2-91 clones MMX and MM+, the MMFF94 force field, MM4, ab initio Hartree-Fock (HF) theory with different basis sets, the standard density functional theory B3LYP, the second-order post-HF MP2 theory and the Domain-based Local Pair Natural Orbital Coupled Cluster DLPNO-CCSD(T) theory, with the latter used for accurate reference values. The data set of the organic molecules includes hydrocarbons, haloalkanes, conjugated compounds, and oxygen-, nitrogen-, phosphorus- and sulphur-containing compounds. We reviewed in detail the conformational aspects of these model organic molecules providing the current understanding of the steric and electronic factors that determine the stability of low energy conformers and the literature including previous experimental observations and calculated findings. While progress on the computer hardware allows the calculations of thousands of conformations for later use in drug design projects, this study is an update from previous classical studies that used, as reference values, experimental ones using a variety of methods and different environments. The lowest mean error against the DLPNO-CCSD(T) reference was calculated for MP2 (0.35 kcal mol⁻¹), followed by B3LYP (0.69 kcal mol⁻¹) and the HF theories (0.81-1.0 kcal mol⁻¹).
As regards the force fields, the lowest errors were observed for the Allinger's force fields MM3-00 (1.28 kcal mol⁻¹), MM3-96 (1.40 kcal mol⁻¹) and the Halgren's MMFF94 force field (1.30 kcal mol⁻¹) and then for the MM2-91 clones MMX (1.77 kcal mol⁻¹) and MM+ (2.01 kcal mol⁻¹) and MM4 (2.05 kcal mol⁻¹). The DREIDING (3.63 kcal mol⁻¹) and UFF (3.77 kcal mol⁻¹) force fields have the lowest performance. These model organic molecules we used are often present as fragments in drug-like molecules. The values calculated using DLPNO-CCSD(T) make up a valuable data set for further comparisons and for improved force field parameterization.
Affiliation(s)
- Ioannis Stylianakis
  - Department of Medicinal Chemistry, Faculty of Pharmacy, National and Kapodistrian University of Athens, Panepistimioupolis Zografou, 15771, Athens, Greece
- Nikolaos Zervos
  - Department of Medicinal Chemistry, Faculty of Pharmacy, National and Kapodistrian University of Athens, Panepistimioupolis Zografou, 15771, Athens, Greece
- Jenn-Huei Lii
  - Department of Chemistry, National Changhua University of Education, Changhua City, Taiwan
- Dimitrios A Pantazis
  - Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470, Mülheim an der Ruhr, Germany
- Antonios Kolocouris
  - Department of Medicinal Chemistry, Faculty of Pharmacy, National and Kapodistrian University of Athens, Panepistimioupolis Zografou, 15771, Athens, Greece
  - Laboratory of Medicinal Chemistry, Section of Pharmaceutical Chemistry, Department of Pharmacy, National and Kapodistrian University of Athens, Panepistimiopolis-Zografou, 15771, Athens, Greece
|
18
|
Pattanaik L, Menon A, Settels V, Spiekermann KA, Tan Z, Vermeire FH, Sandfort F, Eiden P, Green WH. ConfSolv: Prediction of Solute Conformer-Free Energies across a Range of Solvents. J Phys Chem B 2023; 127:10151-10170. [PMID: 37966798 DOI: 10.1021/acs.jpcb.3c05904] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2023]
Abstract
Predicting Gibbs free energy of solution is key to understanding the solvent effects on thermodynamics and reaction rates for kinetic modeling. Accurately computing solution free energies requires the enumeration and evaluation of relevant solute conformers in solution. However, even after generation of relevant conformers, determining their free energy of solution requires an expensive workflow consisting of several ab initio computational chemistry calculations. To help address this challenge, we generate a large data set of solution free energies for nearly 44,000 solutes with almost 9 million conformers calculated in 41 different solvents using density functional theory and COSMO-RS and quantify the impact of solute conformers on the solution free energy. We then train a message passing neural network to predict the relative solution free energies of a set of solute conformers, enabling the identification of a small subset of thermodynamically relevant conformers. The model offers substantial computational time savings with predictions usually substantially within 1 kcal/mol of the free energy of the solution calculated by using computational chemical methods.
Affiliation(s)
- Lagnajit Pattanaik
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Angiras Menon
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Volker Settels
  - BASF SE, Scientific Modeling, Group Research, Ludwigshafen am Rhein 67056, Germany
- Kevin A Spiekermann
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Zipei Tan
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Florence H Vermeire
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
  - Department of Chemical Engineering, KU Leuven, Celestijnenlaan 200F, Leuven 3001, Belgium
- Frederik Sandfort
  - BASF SE, Scientific Modeling, Group Research, Ludwigshafen am Rhein 67056, Germany
- Philipp Eiden
  - BASF SE, Scientific Modeling, Group Research, Ludwigshafen am Rhein 67056, Germany
- William H Green
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
|
19
|
Xia S, Chen E, Zhang Y. Integrated Molecular Modeling and Machine Learning for Drug Design. J Chem Theory Comput 2023; 19:7478-7495. [PMID: 37883810 PMCID: PMC10653122 DOI: 10.1021/acs.jctc.3c00814] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/26/2023] [Revised: 10/10/2023] [Accepted: 10/11/2023] [Indexed: 10/28/2023]
Abstract
Modern therapeutic development often involves several stages that are interconnected, and multiple iterations are usually required to bring a new drug to the market. Computational approaches have increasingly become an indispensable part of helping reduce the time and cost of the research and development of new drugs. In this Perspective, we summarize our recent efforts on integrating molecular modeling and machine learning to develop computational tools for modulator design, including a pocket-guided rational design approach based on AlphaSpace to target protein-protein interactions, delta machine learning scoring functions for protein-ligand docking as well as virtual screening, and state-of-the-art deep learning models to predict calculated and experimental molecular properties based on molecular mechanics optimized geometries. Meanwhile, we discuss remaining challenges and promising directions for further development and use a retrospective example of FDA approved kinase inhibitor Erlotinib to demonstrate the use of these newly developed computational tools.
Affiliation(s)
- Song Xia
  - Department of Chemistry, New York University, New York, New York 10003, United States
- Eric Chen
  - Department of Chemistry, New York University, New York, New York 10003, United States
- Yingkai Zhang
  - Department of Chemistry, New York University, New York, New York 10003, United States
  - Simons Center for Computational Physical Chemistry at New York University, New York, New York 10003, United States
  - NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
|
20
|
McNutt A, Bisiriyu F, Song S, Vyas A, Hutchison GR, Koes DR. Conformer Generation for Structure-Based Drug Design: How Many and How Good? J Chem Inf Model 2023; 63:6598-6607. [PMID: 37903507 PMCID: PMC10647020 DOI: 10.1021/acs.jcim.3c01245] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Revised: 10/18/2023] [Accepted: 10/19/2023] [Indexed: 11/01/2023]
Abstract
Conformer generation, the assignment of realistic 3D coordinates to a small molecule, is fundamental to structure-based drug design. Conformational ensembles are required for rigid-body matching algorithms, such as shape-based or pharmacophore approaches, and even methods that treat the ligand flexibly, such as docking, are dependent on the quality of the provided conformations due to not sampling all degrees of freedom (e.g., only sampling torsions). Here, we empirically elucidate some general principles about the size, diversity, and quality of the conformational ensembles needed to get the best performance in common structure-based drug discovery tasks. In many cases, our findings may parallel "common knowledge" well-known to practitioners of the field. Nonetheless, we feel that it is valuable to quantify these conformational effects while reproducing and expanding upon previous studies. Specifically, we investigate the performance of a state-of-the-art generative deep learning approach versus a more classical geometry-based approach, the effect of energy minimization as a postprocessing step, the effect of ensemble size (maximum number of conformers), and construction (filtering by root-mean-square deviation for diversity) and how these choices influence the ability to recapitulate bioactive conformations and perform pharmacophore screening and molecular docking.
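The ensemble construction choices named in this abstract (a maximum number of conformers and filtering by root-mean-square deviation for diversity) can be illustrated with a greedy filter. This is a minimal sketch and not the authors' implementation: the function name `greedy_rmsd_filter`, the 0.5 Å threshold, the precomputed pairwise RMSD matrix, and the example values are all assumptions made for illustration.

```python
from typing import List, Sequence


def greedy_rmsd_filter(
    rmsd: Sequence[Sequence[float]],
    threshold: float,
    max_conformers: int,
) -> List[int]:
    """Return indices of a diverse subset of conformers.

    rmsd[i][j] is the pairwise RMSD between conformers i and j.
    Conformers are assumed to be pre-sorted (for example by ascending
    energy), so the earlier of two near-duplicates is the one kept.
    """
    kept: List[int] = []
    for i in range(len(rmsd)):
        if len(kept) >= max_conformers:
            break
        # Keep conformer i only if it is far enough from every kept one.
        if all(rmsd[i][j] >= threshold for j in kept):
            kept.append(i)
    return kept


# Hypothetical 4-conformer RMSD matrix in Å; the values are made up.
example_rmsd = [
    [0.0, 0.2, 1.1, 1.5],
    [0.2, 0.0, 1.0, 1.4],
    [1.1, 1.0, 0.0, 0.3],
    [1.5, 1.4, 0.3, 0.0],
]
print(greedy_rmsd_filter(example_rmsd, threshold=0.5, max_conformers=10))
```

In this toy matrix, conformers 1 and 3 are dropped as near-duplicates of conformers 0 and 2. Lowering `max_conformers` truncates the ensemble before the diversity check reaches later conformers.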
Affiliation(s)
- Andrew T. McNutt
  - Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, United States
- Fatimah Bisiriyu
  - The Neighborhood Academy, Pittsburgh, Pennsylvania 15206, United States
- Sophia Song
  - Upper St. Clair High School, Pittsburgh, Pennsylvania 15241, United States
- Ananya Vyas
  - Taylor Allderdice High School, Pittsburgh, Pennsylvania 15217, United States
- Geoffrey R. Hutchison
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, United States
  - Department of Chemical and Petroleum Engineering, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, United States
- David Ryan Koes
  - Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, United States
|
21
|
Han F, Hao D, He X, Wang L, Niu T, Wang J. Distribution of Bound Conformations in Conformational Ensembles for X-ray Ligands Predicted by the ANI-2X Machine Learning Potential. J Chem Inf Model 2023; 63:6608-6618. [PMID: 37899502 PMCID: PMC10647024 DOI: 10.1021/acs.jcim.3c01350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 10/11/2023] [Accepted: 10/13/2023] [Indexed: 10/31/2023]
Abstract
In this study, we systematically studied the energy distribution of bioactive conformations of small molecular ligands in their conformational ensembles using ANI-2X, a machine learning potential, in conjunction with one of our recently developed geometry optimization algorithms, known as a conjugate gradient with backtracking line search (CG-BS). We first evaluated the combination of these methods (ANI-2X/CG-BS) using two molecule sets. For the 231-molecule set, ab initio calculations were performed at both the ωB97X/6-31G(d) and B3LYP-D3BJ/DZVP levels for accuracy comparison, while for the 8,992-molecule set, ab initio calculations were carried out at the B3LYP-D3BJ/DZVP level. For each molecule in the two molecular sets, up to 10 conformations were generated, which diminishes the influence of individual outliers on the performance evaluation. Encouraged by the performance of ANI-2x/CG-BS in these evaluations, we calculated the energy distributions using ANI-2x/CG-BS for more than 27,000 ligands in the protein data bank (PDB). Each ligand has at least one conformation bound to a biological molecule, and this ligand conformation is labeled as a bound conformation. Besides the bound conformations, up to 200 conformations were generated using OpenEye's Omega2 software (https://docs.eyesopen.com/applications/omega/) for each conformation. We performed a statistical analysis of how the bound conformation energies are distributed in the ensembles for 17,197 PDB ligands that have their bound conformation energies within the energy ranges of the Omega2-generated conformation ensembles. We found that half of the ligands have their relative conformation energy lower than 2.91 kcal/mol for the bound conformations in comparison with the global conformations, and about 90% of the bound conformations are within 10 kcal/mol above the global conformation energies.
This information is useful to guide the construction of libraries for shape-based virtual screening and to improve the docking algorithm to efficiently sample bound conformations.
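The kind of statistic reported in this abstract (the median relative energy of bound conformations and the fraction lying within 10 kcal/mol of the lowest-energy conformer) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' workflow: the function name `summarize_bound_energies`, the data layout, and all energy values are hypothetical, and the energies are assumed to have been computed already by some potential.

```python
from statistics import median
from typing import Dict, List, Tuple


def summarize_bound_energies(
    ligands: Dict[str, Tuple[List[float], float]],
    cutoff: float = 10.0,
) -> Tuple[float, float]:
    """Summarize bound-conformation energies relative to each ensemble minimum.

    ligands maps a ligand ID to (ensemble_energies, bound_energy), in kcal/mol.
    Only ligands whose bound energy lies inside the energy range of the
    generated ensemble are analyzed, mirroring the selection in the abstract.
    Returns (median relative energy, fraction within `cutoff` of the minimum).
    """
    relative: List[float] = []
    for ensemble, bound in ligands.values():
        lowest, highest = min(ensemble), max(ensemble)
        if lowest <= bound <= highest:
            relative.append(bound - lowest)
    if not relative:
        raise ValueError("no ligand has a bound energy inside its ensemble range")
    fraction = sum(e <= cutoff for e in relative) / len(relative)
    return median(relative), fraction


# Hypothetical energies in kcal/mol; the values are made up.
example = {
    "lig_a": ([0.0, 3.0, 8.0], 2.0),    # relative energy 2.0
    "lig_b": ([1.0, 5.0, 20.0], 13.0),  # relative energy 12.0
    "lig_c": ([0.5, 4.0], 4.0),         # relative energy 3.5
    "lig_d": ([0.0, 2.0], 6.0),         # outside ensemble range, skipped
}
med, frac = summarize_bound_energies(example)
print(f"median = {med:.2f} kcal/mol, fraction within 10 kcal/mol = {frac:.2f}")
```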
Affiliation(s)
- Fengyang Han
  - Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
- Dongxiao Hao
  - School of Electronics and Information Engineering, Ankang University, Ankang 725000, China
- Xibing He
  - Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
- Luxuan Wang
  - Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
- Taoyu Niu
  - Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
- Junmei Wang
  - Department of Pharmaceutical Sciences and Computational Chemical Genomics Screening Center, School of Pharmacy, University of Pittsburgh, Pittsburgh, Pennsylvania 15261, United States
|
22
|
Wang Z, Zhong H, Zhang J, Pan P, Wang D, Liu H, Yao X, Hou T, Kang Y. Small-Molecule Conformer Generators: Evaluation of Traditional Methods and AI Models on High-Quality Data Sets. J Chem Inf Model 2023; 63:6525-6536. [PMID: 37883143 DOI: 10.1021/acs.jcim.3c01519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/27/2023]
Abstract
Small-molecule conformer generation (SMCG) is an extremely important task in both ligand- and structure-based computer-aided drug design, especially during the hit discovery phase. Recently, a multitude of artificial intelligence (AI) models tailored for SMCG have emerged. Despite developers typically furnishing performance evaluation data upon releasing their AI models, a comprehensive and equitable performance comparison between AI models and conventional methods is still lacking. In this study, we curated a new benchmarking data set comprising 3354 high-quality ligand bioactive conformations. Subsequently, we conducted a systematic assessment of the performance of four widely adopted traditional methods (i.e., ConfGenX, Conformator, OMEGA, and RDKit ETKDG) and five AI models (i.e., ConfGF, DMCG, GeoDiff, GeoMol, and torsional diffusion) in the tasks of reproducing bioactive and low-energy conformations of small molecules. In the former task, the AI models have no advantage, particularly with a maximum ensemble size of 1. Even the best-performing AI model GeoMol is still worse than any of the tested traditional methods. Conversely, in the latter task, the torsional diffusion model shows obvious advantages, surpassing the best-performing traditional method ConfGenX by 26.09 and 12.97% on the COV-R and COV-P metrics, respectively. Furthermore, the influence of force field-based fine-tuning on the quality of the generated conformers was also discussed. Finally, a user-friendly Web server called fastSMCG was developed to enable researchers to rapidly and flexibly generate small-molecule conformers using both traditional and AI methods. We anticipate that our work will offer valuable practical assistance to the scientific community in this field.
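The abstract reports COV-R and COV-P without defining them. As these metrics are commonly defined in the conformer-generation literature, COV-R (recall) is the fraction of reference conformers matched by at least one generated conformer within an RMSD threshold, and COV-P (precision) is the fraction of generated conformers that match at least one reference conformer. The sketch below assumes that definition. The function name `coverage`, the strict `<` comparison, the 1.25 Å threshold, and the example matrix are assumptions rather than details taken from the paper.

```python
from typing import List, Sequence, Tuple


def coverage(rmsd: Sequence[Sequence[float]], delta: float) -> Tuple[float, float]:
    """Compute (COV-R, COV-P) from a reference-by-generated RMSD matrix.

    rmsd[i][j] is the RMSD between reference conformer i and generated
    conformer j. A conformer counts as covered when its best match on the
    other side has an RMSD below `delta`.
    """
    n_ref = len(rmsd)
    n_gen = len(rmsd[0])
    # Recall: how many reference conformers are reproduced by some generated one.
    covered_refs = sum(min(row) < delta for row in rmsd)
    # Precision: how many generated conformers lie close to some reference.
    best_per_gen: List[float] = [
        min(rmsd[i][j] for i in range(n_ref)) for j in range(n_gen)
    ]
    covered_gens = sum(best < delta for best in best_per_gen)
    return covered_refs / n_ref, covered_gens / n_gen


# Hypothetical matrix: 2 reference conformers x 3 generated conformers (Å).
example_matrix = [
    [0.3, 1.4, 2.0],
    [1.6, 1.5, 1.9],
]
cov_r, cov_p = coverage(example_matrix, delta=1.25)
print(f"COV-R = {cov_r:.2f}, COV-P = {cov_p:.2f}")
```

In this toy case only the first reference conformer is reproduced, and only the first generated conformer lies near a reference, so recall and precision differ.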
Affiliation(s)
- Zhe Wang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Haiyang Zhong
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Jintu Zhang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Peichen Pan
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Dong Wang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Huanxiang Liu
  - Faculty of Applied Science, Macao Polytechnic University, Macao SAR 999078, China
- Xiaojun Yao
  - State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao SAR 999078, China
- Tingjun Hou
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
- Yu Kang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, China
|
23
|
Ilnicka A, Schneider G. Designing molecules with autoencoder networks. Nature Computational Science 2023; 3:922-933. [PMID: 38177601 DOI: 10.1038/s43588-023-00548-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/01/2023] [Accepted: 10/03/2023] [Indexed: 01/06/2024]
Abstract
Autoencoders are versatile tools in molecular informatics. These unsupervised neural networks serve diverse tasks such as data-driven molecular representation and constructive molecular design. This Review explores their algorithmic foundations and applications in drug discovery, highlighting the most active areas of development and the contributions autoencoder networks have made in advancing this field. We also explore the challenges and prospects concerning the utilization of autoencoders and the various adaptations of this neural network architecture in molecular design.
Affiliation(s)
- Agnieszka Ilnicka
  - Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
- Gisbert Schneider
  - Department of Chemistry and Applied Biosciences, ETH Zurich, Zurich, Switzerland
|
24
|
Talmazan RA, Podewitz M. PyConSolv: A Python Package for Conformer Generation of (Metal-Containing) Systems in Explicit Solvent. J Chem Inf Model 2023; 63:5400-5407. [PMID: 37606893 PMCID: PMC10498442 DOI: 10.1021/acs.jcim.3c00798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Indexed: 08/23/2023]
Abstract
We introduce PyConSolv, a freely available Python package that automates the generation of conformers of metal- and nonmetal-containing complexes in explicit solvent, through classical molecular dynamics simulations. Using a streamlined workflow and interfacing with widely used computational chemistry software, PyConSolv is an all-in-one tool for the generation of conformers in any solvent. Input requirements are minimal; only the geometry of the structure and the desired solvent in xyz (XMOL) format are needed. The package can also account for charged systems, by including arbitrary counterions in the simulation. A bonded model parametrization is performed automatically, utilizing AmberTools, ORCA, and Multiwfn software packages. PyConSolv provides a selection of preparametrized solvents and counterions for use in classical molecular dynamics simulations. We show the applicability of our package on a number of (transition-metal-containing) systems. The software is provided open source and free of charge.
Affiliation(s)
- R. A. Talmazan
  - Institute of Materials Chemistry, TU Wien, Getreidemarkt 9, A-1060 Wien, Austria
- M. Podewitz
  - Institute of Materials Chemistry, TU Wien, Getreidemarkt 9, A-1060 Wien, Austria
|
25
|
Seidel T, Permann C, Wieder O, Kohlbacher SM, Langer T. High-Quality Conformer Generation with CONFORGE: Algorithm and Performance Assessment. J Chem Inf Model 2023; 63:5549-5570. [PMID: 37624145 PMCID: PMC10498443 DOI: 10.1021/acs.jcim.3c00563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/13/2023] [Indexed: 08/26/2023]
Abstract
Knowledge of the putative bound-state conformation of a molecule is an essential prerequisite for the successful application of many computer-aided drug design methods that aim to assess or predict its capability to bind to a particular target receptor. An established approach to predict bioactive conformers in the absence of receptor structure information is to sample the low-energy conformational space of the investigated molecules and derive representative conformer ensembles that can be expected to comprise members closely resembling possible bound-state ligand conformations. The high relevance of such conformer generation functionality led to the development of a wide panel of dedicated commercial and open-source software tools throughout the last decades. Several published benchmarking studies have shown that open-source tools usually lag behind their commercial competitors in many key aspects. In this work, we introduce the open-source conformer ensemble generator CONFORGE, which aims at delivering state-of-the-art performance for all types of organic molecules in drug-like chemical space. The ability of CONFORGE and several well-known commercial and open-source conformer ensemble generators to reproduce experimental 3D structures as well as their computational efficiency and robustness has been assessed thoroughly for both typical drug-like molecules and macrocyclic structures. For small molecules, CONFORGE clearly outperformed all other tested open-source conformer generators and performed at least equally well as the evaluated commercial generators in terms of both processing speed and accuracy. In the case of macrocyclic structures, CONFORGE achieved the best average accuracy among all benchmarked generators, with RDKit's generator coming close in second place.
Collapse
Affiliation(s)
- Thomas Seidel
- Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
| | - Christian Permann
- NeGeMac Research Platform, Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
| | - Oliver Wieder
- Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
| | - Stefan M. Kohlbacher
- Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
| | - Thierry Langer
- Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- NeGeMac Research Platform, Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Christian Doppler Laboratory for Molecular Informatics in the Biosciences, Department of Pharmaceutical Sciences, Division of Pharmaceutical Chemistry, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
| |
|
26
|
Franca TC, Goncalves ADS, Bérubé C, Voyer N, Aubry N, LaPlante SR. Determining the Predominant Conformations of Mortiamides A-D in Solution Using NMR Data and Molecular Modeling Tools. ACS Omega 2023; 8:25832-25838. [PMID: 37521620 PMCID: PMC10373451 DOI: 10.1021/acsomega.3c01206] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Accepted: 06/07/2023] [Indexed: 08/01/2023]
Abstract
Macrocyclic peptidomimetics have been seriously contributing to our arsenal of drugs to combat diseases. The search for nature's discoveries led us to mortiamides A-D (found in a novel fungus from Northern Canada), which is a family of cyclic peptides that clearly have demonstrated impressive pharmaceutical potential. This prompted us to learn more about their solution-state properties as these are central for binding to target molecules. Here, we secured and isolated mortiamide D, and then acquired high-resolution nuclear magnetic resonance (NMR) data to learn more about its structure and dynamics attributes. Sets of two-dimensional NMR experiments provided atomic-level (through-bond and through-space) data to confirm the primary structure, and NMR-driven molecular dynamics (MD) simulations suggested that more than one predominant three-dimensional (3D) structure exist in solution. Further steps of MD simulations are consistent with the finding that the backbones of mortiamides A-C also have at least two prominent macrocyclic shapes, but the side-chain structures and dynamics differed significantly. Knowledge of these solution properties can be exploited for drug design and discovery.
Affiliation(s)
- Tanos C. C. Franca
- INRS − Centre Armand-Frappier Santé Biotechnologie, Université de Québec, 531 Boulevard des Prairies, Laval, Quebec H7V 1B7, Canada
- Laboratory of Molecular Modeling Applied to Chemical and Biological Defense, Military Institute of Engineering, 22290-270 Rio de Janeiro, Brazil
- Department of Chemistry, Faculty of Science, University of Hradec Králové, Rokitanskeho 62, 50003 Hradec Králové, Czech Republic
| | - Arlan da Silva Goncalves
- Department of Chemistry, Federal Institute of Espírito Santo − Unit Vila Velha, 29106-010 Vila Velha, ES, Brazil
- PPGQUI (Graduate Program in Chemistry), Federal University of Espírito Santo, Av. Fernando Ferrari, 514, 29075-910 Vitória, ES, Brazil
| | - Christopher Bérubé
- Departement de Chimie and PROTEO, Faculté des Sciences et de Génie, Université Laval, 1045 Avenue de la Médecine, Québec, Quebec G1V 0A6, Canada
| | - Normand Voyer
- Departement de Chimie and PROTEO, Faculté des Sciences et de Génie, Université Laval, 1045 Avenue de la Médecine, Québec, Quebec G1V 0A6, Canada
| | - Norman Aubry
- NMR consultant of Steven R. LaPlante’s Lab, INRS − Centre Armand-Frappier Santé Biotechnologie, Université de Québec, 531 Boulevard des Prairies, Laval, Quebec H7V 1B7, Canada
| | - Steven R. LaPlante
- INRS − Centre Armand-Frappier Santé Biotechnologie, Université de Québec, 531 Boulevard des Prairies, Laval, Quebec H7V 1B7, Canada
| |
|
27
|
Bhujbal SP, Hah JM. An Intriguing Purview on the Design of Macrocyclic Inhibitors for Unexplored Protein Kinases through Their Binding Site Comparison. Pharmaceuticals (Basel) 2023; 16:1009. [PMID: 37513921 PMCID: PMC10386424 DOI: 10.3390/ph16071009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2023] [Revised: 07/02/2023] [Accepted: 07/13/2023] [Indexed: 07/30/2023] Open
Abstract
Kinases play an important role in regulating various intracellular signaling pathways that control cell proliferation, differentiation, survival, and other cellular processes, and their deregulation causes more than 400 diseases. Consequently, macrocyclization can be considered a noteworthy approach to developing new therapeutic agents for human diseases. Macrocyclization has emerged as an effective drug discovery strategy over the past decade to improve target selectivity and potency of small molecules. Small compounds with linear structures upon macrocyclization can lead to changes in their physicochemical and biological properties by firmly reducing conformational flexibility. A number of distinct protein kinases exhibit similar binding sites. Comparison of protein binding sites provides crucial insights for drug discovery and development. Binding site similarities are helpful in understanding polypharmacology, identifying potential off-targets, and repurposing known drugs. In this review, we focused on comparing the binding sites of those kinases for which macrocyclic inhibitors are available/studied so far. Furthermore, we calculated the volume of the binding site pocket for each targeted kinase and then compared it with the binding site pocket of the kinase for which only acyclic inhibitors were designed to date. Our review and analysis of several explored kinases might be useful in targeting new protein kinases for macrocyclic drug discovery.
Affiliation(s)
- Swapnil P Bhujbal
- College of Pharmacy, Hanyang University, Ansan 426-791, Republic of Korea
- Institute of Pharmaceutical Science and Technology, Hanyang University, Ansan 426-791, Republic of Korea
| | - Jung-Mi Hah
- College of Pharmacy, Hanyang University, Ansan 426-791, Republic of Korea
- Institute of Pharmaceutical Science and Technology, Hanyang University, Ansan 426-791, Republic of Korea
| |
|
28
|
Zhang Z, Wang G, Li R, Ni L, Zhang R, Cheng K, Ren Q, Kong X, Ni S, Tong X, Luo L, Wang D, Lu X, Zheng M, Li X. Tora3D: an autoregressive torsion angle prediction model for molecular 3D conformation generation. J Cheminform 2023; 15:57. [PMID: 37287071 DOI: 10.1186/s13321-023-00726-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/12/2023] [Accepted: 05/20/2023] [Indexed: 06/09/2023] Open
Abstract
Three-dimensional (3D) conformations of a small molecule profoundly affect its binding to the target of interest, the resulting biological effects, and its disposition in living organisms, but it is challenging to accurately characterize the conformational ensemble experimentally. Here, we proposed an autoregressive torsion angle prediction model Tora3D for molecular 3D conformer generation. Rather than directly predicting the conformations in an end-to-end way, Tora3D predicts a set of torsion angles of rotatable bonds by an interpretable autoregressive method and reconstructs the 3D conformations from them, which keeps structural validity during reconstruction. Another advancement of our method over other conformational generation methods is the ability to use energy to guide the conformation generation. In addition, we propose a new message-passing mechanism that applies the Transformer to the graph to solve the difficulty of remote message passing. Tora3D shows superior performance to prior computational models in the trade-off between accuracy and efficiency, and ensures conformational validity, accuracy, and diversity in an interpretable way. Overall, Tora3D can be used for the quick generation of diverse molecular conformations and 3D-based molecular representation, contributing to a wide range of downstream drug design tasks.
Affiliation(s)
- Zimei Zhang
- Division of Life Science and Medicine, University of Science and Technology of China, Hefei, 230026, Anhui, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
| | - Gang Wang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China
| | - Rui Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- School of Pharmacy, China Pharmaceutical University, 639 Longmian Road, Nanjing, 211198, China
| | - Lin Ni
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing, 210023, China
| | - RunZe Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China
| | - Kaiyang Cheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing, 210023, China
| | - Qun Ren
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing, 210023, China
| | - Xiangtai Kong
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China
| | - Shengkun Ni
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China
| | - Xiaochu Tong
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China
| | - Li Luo
- Precision Pharmacy & Drug Development Center, Department of Pharmacy, Tangdu Hospital, Fourth Military Medical University, Xi'an, 710038, China
| | | | - Xiaojie Lu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China
| | - Mingyue Zheng
- Division of Life Science and Medicine, University of Science and Technology of China, Hefei, 230026, Anhui, China
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China
- Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing, 210023, China
| | - Xutong Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No.19A Yuquan Road, Beijing, 100049, China
| |
|
29
|
Baillif B, Cole J, McCabe P, Bender A. Deep generative models for 3D molecular structure. Curr Opin Struct Biol 2023; 80:102566. [DOI: 10.1016/j.sbi.2023.102566] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2022] [Revised: 02/05/2023] [Accepted: 02/15/2023] [Indexed: 03/30/2023]
|
30
|
Wieske LHE, Peintner S, Erdélyi M. Ensemble determination by NMR data deconvolution. Nat Rev Chem 2023:10.1038/s41570-023-00494-x. [PMID: 37169885 DOI: 10.1038/s41570-023-00494-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Accepted: 04/03/2023] [Indexed: 05/13/2023]
Abstract
Nuclear magnetic resonance (NMR) is the spectroscopic technique of choice for determining molecular conformations in solution at atomic resolution. As solution NMR spectra are rich in structural and dynamic information, the way in which the data should be acquired and handled to deliver accurate ensembles is not trivial. This Review provides a guide to the NMR experiment selection and parametrization process, the generation of viable theoretical conformer pools and the deconvolution of time-averaged NMR data into a conformer ensemble that accurately represents a flexible molecule in solution. In addition to reviewing the key elements of solution ensemble determination of flexible mid-sized molecules, the feasibility and pitfalls of data deconvolution are discussed with a comparison of the performance of representative algorithms.
Affiliation(s)
| | - Stefan Peintner
- Department of Chemistry-BMC, Uppsala University, Uppsala, Sweden
| | - Máté Erdélyi
- Department of Chemistry-BMC, Uppsala University, Uppsala, Sweden
| |
|
31
|
Szwabowski GL, Baker DL, Parrill AL. Application of computational methods for class A GPCR Ligand discovery. J Mol Graph Model 2023; 121:108434. [PMID: 36841204 DOI: 10.1016/j.jmgm.2023.108434] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 10/03/2022] [Revised: 02/11/2023] [Accepted: 02/13/2023] [Indexed: 02/22/2023]
Abstract
G protein-coupled receptors (GPCR) are integral membrane proteins of considerable interest as targets for drug development due to their role in transmitting cellular signals in a multitude of biological processes. Of the six classes categorizing GPCR (A, B, C, D, E, and F), class A contains the largest number of therapeutically relevant GPCR. Despite their importance as drug targets, many challenges exist for the discovery of novel class A GPCR ligands serving as drug precursors. Though knowledge of the structural and functional characteristics of GPCR has grown significantly over the past 20 years, a large portion of GPCR lack reported, experimentally determined structures. Furthermore, many GPCR have no known endogenous and/or synthetic ligands, limiting further exploration of their biochemical, cellular, and physiological roles. While many successes in GPCR ligand discovery have resulted from experimental high-throughput screening, computational methods have played an increasingly important role in GPCR ligand identification in the past decade. Here we discuss computational techniques applied to GPCR ligand discovery. This review summarizes class A GPCR structure/function and provides an overview of many obstacles currently faced in GPCR ligand discovery. Furthermore, we discuss applications and recent successes of computational techniques used to predict GPCR structure as well as present a summary of ligand- and structure-based methods used to identify potential GPCR ligands. Finally, we discuss computational hit list generation and refinement and provide comprehensive workflows for GPCR ligand identification.
Affiliation(s)
| | - Daniel L Baker
- Department of Chemistry, The University of Memphis, Memphis, TN, 38152, USA
| | - Abby L Parrill
- Department of Chemistry, The University of Memphis, Memphis, TN, 38152, USA
| |
|
32
|
Fang L, Guo X, Todorović M, Rinke P, Chen X. Exploring the Conformers of an Organic Molecule on a Metal Cluster with Bayesian Optimization. J Chem Inf Model 2023; 63:745-752. [PMID: 36642891 PMCID: PMC9930108 DOI: 10.1021/acs.jcim.2c01120] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 01/17/2023]
Abstract
Finding low-energy conformers of organic molecules is a complex problem due to the flexibilities of the molecules and the high dimensionality of the search space. When such molecules are on nanoclusters, the search complexity is exacerbated by constraints imposed by the presence of the cluster and other surrounding molecules. To address this challenge, we modified our previously developed active learning molecular conformer search method based on Bayesian optimization and density functional theory. Especially, we have developed and tested strategies to avoid steric clashes between a molecule and a cluster. In this work, we chose a cysteine molecule on a well-studied gold-thiolate cluster as a model system to test and demonstrate our method. We found that cysteine conformers in a cluster inherit the hydrogen bond types from isolated conformers. However, the energy rankings and spacings between the conformers are reordered.
Affiliation(s)
- Lincan Fang
- Department of Applied Physics, Aalto University, 00076 AALTO, Finland
| | - Xiaomi Guo
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, 100084 Beijing, China
| | - Milica Todorović
- Department of Mechanical and Materials Engineering, University of Turku, FI-20014 Turku, Finland
| | - Patrick Rinke
- Department of Applied Physics, Aalto University, 00076 AALTO, Finland
| | - Xi Chen
- Department of Applied Physics, Aalto University, 00076 AALTO, Finland
| |
|
33
|
Sethio D, Poongavanam V, Xiong R, Tyagi M, Duy Vo D, Lindh R, Kihlberg J. Simulation Reveals the Chameleonic Behavior of Macrocycles. J Chem Inf Model 2023; 63:138-146. [PMID: 36563083 PMCID: PMC9832480 DOI: 10.1021/acs.jcim.2c01093] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 12/24/2022]
Abstract
Conformational analysis is central to the design of bioactive molecules. It is particularly challenging for macrocycles due to noncovalent transannular interactions, steric interactions, and ring strain that are often coupled. Herein, we simulated the conformations of five macrocycles designed to express a progression of increasing complexity in environment-dependent intramolecular interactions and verified the results against NMR measurements in chloroform and dimethyl sulfoxide. Molecular dynamics using an explicit solvent model, but not the Monte Carlo method with implicit solvation, handled both solvents correctly. Refinement of conformations at the ab initio level was fundamental to reproducing the experimental observations; standard state-of-the-art molecular mechanics force fields were insufficient. Our simulations correctly predicted the intramolecular interactions between side chains and the macrocycle and revealed an unprecedented solvent-induced conformational switch of the macrocyclic ring. Our results provide a platform for the rational, prospective design of molecular chameleons that adapt to the properties of the environment.
|
34
|
Xia S, Zhang D, Zhang Y. Multitask Deep Ensemble Prediction of Molecular Energetics in Solution: From Quantum Mechanics to Experimental Properties. J Chem Theory Comput 2023; 19:10.1021/acs.jctc.2c01024. [PMID: 36607141 PMCID: PMC10323048 DOI: 10.1021/acs.jctc.2c01024] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/07/2023]
Abstract
The past few years have witnessed significant advances in developing machine learning methods for molecular energetics predictions, including calculated electronic energies with high-level quantum mechanical methods and experimental properties, such as solvation free energy and logP. Typically, task-specific machine learning models are developed for distinct prediction tasks. In this work, we present a multitask deep ensemble model, sPhysNet-MT-ens5, which can simultaneously and accurately predict electronic energies of molecules in gas, water, and octanol phases, as well as transfer free energies at both calculated and experimental levels. On the calculated data set Frag20-solv-678k, which is developed in this work and contains 678,916 molecular conformations, up to 20 heavy atoms, and their properties calculated at B3LYP/6-31G* level of theory with continuum solvent models, sPhysNet-MT-ens5 predicts density functional theory (DFT)-level electronic energies directly from force field-optimized geometry within chemical accuracy. On the experimental data sets, sPhysNet-MT-ens5 achieves state-of-the-art performances, which predict both experimental hydration free energy with a RMSE of 0.620 kcal/mol on the FreeSolv data set and experimental logP with a RMSE of 0.393 on the PHYSPROP data set. Furthermore, sPhysNet-MT-ens5 also provides a reasonable estimation of model uncertainty which shows correlations with prediction error. Finally, by analyzing the atomic contributions of its predictions, we find that the developed deep learning model is aware of the chemical environment of each atom by assigning reasonable atomic contributions consistent with our chemical knowledge.
Affiliation(s)
- Song Xia
- Department of Chemistry, New York University, New York, New York 10003, United States
| | - Dongdong Zhang
- Department of Chemistry, New York University, New York, New York 10003, United States
| | - Yingkai Zhang
- Department of Chemistry, New York University, New York, New York 10003, United States
- Simons Center for Computational Physical Chemistry at New York University, New York, New York 10003, United States
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
| |
|
35
|
Bahloul A, Benayahoum A, Bouakkaz S, Bordjiba T, Boudjahem A, Lilya B, Bachari K. The antioxidant activity of N-E-caffeoyl and N-E-feruloyl tyramine conformers and their sulfured analogs contribution: density functional theory studies. Theor Chem Acc 2023. [DOI: 10.1007/s00214-022-02939-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/09/2022]
|
36
|
Zhong X, Liu Z. Conformational Screening of the Catalyst System Containing Transition Metal and Flexible Ligand. Chinese J Org Chem 2023. [DOI: 10.6023/cjoc202207021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/03/2023]
|
37
|
Chang Y, Hawkins BA, Du JJ, Groundwater PW, Hibbs DE, Lai F. A Guide to In Silico Drug Design. Pharmaceutics 2022; 15:pharmaceutics15010049. [PMID: 36678678 PMCID: PMC9867171 DOI: 10.3390/pharmaceutics15010049] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Received: 11/09/2022] [Revised: 12/16/2022] [Accepted: 12/17/2022] [Indexed: 12/28/2022] Open
Abstract
The drug discovery process is a rocky path that is full of challenges, with the result that very few candidates progress from hit compound to a commercially available product, often due to factors, such as poor binding affinity, off-target effects, or physicochemical properties, such as solubility or stability. This process is further complicated by high research and development costs and time requirements. It is thus important to optimise every step of the process in order to maximise the chances of success. As a result of the recent advancements in computer power and technology, computer-aided drug design (CADD) has become an integral part of modern drug discovery to guide and accelerate the process. In this review, we present an overview of the important CADD methods and applications, such as in silico structure prediction, refinement, modelling and target validation, that are commonly used in this area.
Affiliation(s)
- Yiqun Chang
- Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
| | - Bryson A. Hawkins
- Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
| | - Jonathan J. Du
- Department of Biochemistry, Emory University School of Medicine, Atlanta, GA 30322, USA
| | - Paul W. Groundwater
- Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
| | - David E. Hibbs
- Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
| | - Felcia Lai
- Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Camperdown, NSW 2006, Australia
| |
|
38
|
Bhat V, Sornberger P, Pokuri BSS, Duke R, Ganapathysubramanian B, Risko C. Electronic, redox, and optical property prediction of organic π-conjugated molecules through a hierarchy of machine learning approaches. Chem Sci 2022; 14:203-213. [PMID: 36605753 PMCID: PMC9769113 DOI: 10.1039/d2sc04676h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2022] [Accepted: 11/16/2022] [Indexed: 11/18/2022] Open
Abstract
Accelerating the development of π-conjugated molecules for applications such as energy generation and storage, catalysis, sensing, pharmaceuticals, and (semi)conducting technologies requires rapid and accurate evaluation of the electronic, redox, or optical properties. While high-throughput computational screening has proven to be a tremendous aid in this regard, machine learning (ML) and other data-driven methods can further enable orders of magnitude reduction in time while at the same time providing dramatic increases in the chemical space that is explored. However, the lack of benchmark datasets containing the electronic, redox, and optical properties that characterize the diverse, known chemical space of organic π-conjugated molecules limits ML model development. Here, we present a curated dataset containing 25k molecules with density functional theory (DFT) and time-dependent DFT (TDDFT) evaluated properties that include frontier molecular orbitals, ionization energies, relaxation energies, and low-lying optical excitation energies. Using the dataset, we train a hierarchy of ML models, ranging from classical models such as ridge regression to sophisticated graph neural networks, with molecular SMILES representation as input. We observe that graph neural networks augmented with contextual information allow for significantly better predictions across a wide array of properties. Our best-performing models also provide an uncertainty quantification for the predictions. To democratize access to the data and trained models, an interactive web platform has been developed and deployed.
Affiliation(s)
- Vinayak Bhat
- Department of Chemistry and Center for Applied Energy Research, University of Kentucky, Lexington, Kentucky 40506, USA
| | - Parker Sornberger
- Department of Chemistry and Center for Applied Energy Research, University of Kentucky, Lexington, Kentucky 40506, USA
| | | | - Rebekah Duke
- Department of Chemistry and Center for Applied Energy Research, University of Kentucky, Lexington, Kentucky 40506, USA
| | | | - Chad Risko
- Department of Chemistry and Center for Applied Energy Research, University of Kentucky, Lexington, Kentucky 40506, USA
| |
|
39
|
Cordier BA, Sawaya NPD, Guerreschi GG, McWeeney SK. Biology and medicine in the landscape of quantum advantages. J R Soc Interface 2022; 19:20220541. [PMID: 36448288 PMCID: PMC9709576 DOI: 10.1098/rsif.2022.0541] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 12/03/2022] Open
Abstract
Quantum computing holds substantial potential for applications in biology and medicine, spanning from the simulation of biomolecules to machine learning methods for subtyping cancers on the basis of clinical features. This potential is encapsulated by the concept of a quantum advantage, which is contingent on a reduction in the consumption of a computational resource, such as time, space or data. Here, we distill the concept of a quantum advantage into a simple framework to aid researchers in biology and medicine pursuing the development of quantum applications. We then apply this framework to a wide variety of computational problems relevant to these domains in an effort to (i) assess the potential of practical advantages in specific application areas and (ii) identify gaps that may be addressed with novel quantum approaches. In doing so, we provide an extensive survey of the intersection of biology and medicine with the current landscape of quantum algorithms and their potential advantages. While we endeavour to identify specific computational problems that may admit practical advantages throughout this work, the rapid pace of change in the fields of quantum computing, classical algorithms and biological research implies that this intersection will remain highly dynamic for the foreseeable future.
Affiliation(s)
- Benjamin A. Cordier
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR 97202, USA
| | | | | | - Shannon K. McWeeney
- Department of Medical Informatics and Clinical Epidemiology, Oregon Health and Science University, Portland, OR 97202, USA
- Knight Cancer Institute, Oregon Health and Science University, Portland, OR 97202, USA
- Oregon Clinical and Translational Research Institute, Oregon Health and Science University, Portland, OR 97202, USA
| |
|
40
|
Jacobi R, Hernández-Castillo D, Sinambela N, Bösking J, Pannwitz A, González L. Computation of Förster Resonance Energy Transfer in Lipid Bilayer Membranes. J Phys Chem A 2022; 126:8070-8081. [PMID: 36260519 PMCID: PMC9639162 DOI: 10.1021/acs.jpca.2c04524] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/02/2022]
Abstract
Calculations of Förster Resonance Energy Transfer (FRET) often neglect the influence of different chromophore orientations or changes in the spectral overlap. In this work, we present two computational approaches to estimate the energy transfer rate between chromophores embedded in lipid bilayer membranes. In the first approach, we assess the transition dipole moments and the spectral overlap by means of quantum chemical calculations in implicit solvation, and we investigate the alignment and distance between the chromophores in classical molecular dynamics simulations. In the second, all properties are evaluated integrally with hybrid quantum mechanical/molecular mechanics (QM/MM) calculations. Both approaches come with advantages and drawbacks, and despite the fact that they do not agree quantitatively, they provide complementary insights on the different factors that influence the FRET rate. We hope that these models can be used as a basis to optimize energy transfers in nonisotropic media.
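To make the orientation dependence mentioned in this abstract concrete, the following is a minimal Python sketch of the textbook point-dipole Förster expressions, not the workflow of the cited paper. The function names, the argument `r0_iso` (a Förster radius assumed to have been determined for the isotropic average κ² = 2/3), and the example vectors are all illustrative assumptions.

```python
import math


def _unit(v):
    """Return v scaled to unit length."""
    norm = math.sqrt(sum(c * c for c in v))
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return tuple(c / norm for c in v)


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def orientation_factor(mu_donor, mu_acceptor, r_da):
    """kappa^2 for two transition dipoles separated along r_da.

    kappa = mu_D . mu_A - 3 (mu_D . R)(mu_A . R), all as unit vectors.
    """
    d, a, r = _unit(mu_donor), _unit(mu_acceptor), _unit(r_da)
    kappa = _dot(d, a) - 3.0 * _dot(d, r) * _dot(a, r)
    return kappa * kappa


def fret_rate(tau_donor, r0_iso, distance, kappa2):
    """Forster rate k = (1/tau_D) * (kappa2 / (2/3)) * (R0_iso / r)^6.

    r0_iso is assumed to be the Forster radius for kappa^2 = 2/3,
    so the rate is rescaled by the actual orientation factor.
    """
    return (1.0 / tau_donor) * (kappa2 / (2.0 / 3.0)) * (r0_iso / distance) ** 6


# Head-to-tail (collinear) dipoles give the maximum kappa^2 of 4,
# while mutually perpendicular arrangements can give 0.
collinear = orientation_factor((0, 0, 1), (0, 0, 1), (0, 0, 1))
perpendicular = orientation_factor((1, 0, 0), (0, 1, 0), (0, 0, 1))
```

Because κ² ranges from 0 to 4, assuming the isotropic value of 2/3 for chromophores with restricted orientations, as in a membrane, can misestimate the rate by a large factor.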
Affiliation(s)
- Richard Jacobi
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Straße 17, 1090 Vienna, Austria; Doctoral School in Chemistry (DoSChem), University of Vienna, Währinger Straße 42, 1090 Vienna, Austria
- David Hernández-Castillo
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Straße 17, 1090 Vienna, Austria; Doctoral School in Chemistry (DoSChem), University of Vienna, Währinger Straße 42, 1090 Vienna, Austria
- Novitasari Sinambela
- Institute of Inorganic Chemistry I, Ulm University, Albert-Einstein-Allee 11, 89081 Ulm, Germany
- Julian Bösking
- Institute of Inorganic Chemistry I, Ulm University, Albert-Einstein-Allee 11, 89081 Ulm, Germany
- Andrea Pannwitz
- Institute of Inorganic Chemistry I, Ulm University, Albert-Einstein-Allee 11, 89081 Ulm, Germany
- Leticia González
- Institute of Theoretical Chemistry, Faculty of Chemistry, University of Vienna, Währinger Straße 17, 1090 Vienna, Austria; Vienna Research Platform on Accelerating Photoreaction Discovery, University of Vienna, Währinger Straße 17, 1090 Vienna, Austria
41
Pracht P, Bannwarth C. Fast Screening of Minimum Energy Crossing Points with Semiempirical Tight-Binding Methods. J Chem Theory Comput 2022; 18:6370-6385. [PMID: 36121838 DOI: 10.1021/acs.jctc.2c00578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
The investigation of photochemical processes is a highly active field in computational chemistry. One research direction is the automated exploration and identification of minimum energy conical intersection (MECI) geometries. However, due to the immense technical effort required to calculate nonadiabatic potential energy landscapes, the routine application of such computational protocols is severely limited. In this study, we will discuss the prospect of combining adiabatic potential energy surfaces from semiempirical quantum mechanical calculations with specialized confinement potential and metadynamics simulations to identify S0/T1 minimum energy crossing point (MECP) geometries. It is shown that MECPs calculated at the GFN2-xTB level can provide suitable approximations to high-level S0/S1 ab initio conical intersection geometries at a fraction of the computational cost. Reference MECIs of benzene are studied to illustrate the basic concept. An example application of the presented protocol is demonstrated for a set of photoswitch molecules.
Affiliation(s)
- Philipp Pracht
- Institute of Physical Chemistry, RWTH Aachen University, Melatener Str. 20, 52056 Aachen, Germany
- Christoph Bannwarth
- Institute of Physical Chemistry, RWTH Aachen University, Melatener Str. 20, 52056 Aachen, Germany
42
Yonezawa T, Esaki T, Ikeda K. Benchmark of 3D conformer generation and molecular property calculation for medium-sized molecules. Chem-Bio Informatics Journal 2022. [DOI: 10.1273/cbij.22.38] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]
Affiliation(s)
- Tomoki Yonezawa
- Division of Physics for Life Functions, Keio University Faculty of Pharmacy
- Tsuyoshi Esaki
- Data Science and AI Innovation Research Promotion Center, Shiga University
- Kazuyoshi Ikeda
- Division of Physics for Life Functions, Keio University Faculty of Pharmacy
43
Viegas LP. Gas-phase OH-oxidation of 2-butanethiol: Multiconformer transition state theory rate constant with constrained transition state randomization. Chem Phys Lett 2022. [DOI: 10.1016/j.cplett.2022.139829] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/24/2022]
44
Conformation and structural features of diuron and irgarol: insights from quantum chemistry calculations. Comput Theor Chem 2022. [DOI: 10.1016/j.comptc.2022.113844] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
45
Spiekermann KA, Pattanaik L, Green WH. Fast Predictions of Reaction Barrier Heights: Toward Coupled-Cluster Accuracy. J Phys Chem A 2022; 126:3976-3986. [PMID: 35727075 DOI: 10.1021/acs.jpca.2c02614] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Indexed: 12/16/2022]
Abstract
Quantitative estimates of reaction barriers are essential for developing kinetic mechanisms and predicting reaction outcomes. However, the lack of experimental data and the steep scaling of accurate quantum calculations often hinder the ability to obtain reliable kinetic values. Here, we train a directed message passing neural network on nearly 24,000 diverse gas-phase reactions calculated at CCSD(T)-F12a/cc-pVDZ-F12//ωB97X-D3/def2-TZVP. Our model uses 75% fewer parameters than previous studies, an improved reaction representation, and proper data splits to accurately estimate performance on unseen reactions. Using information from only the reactant and product, our model quickly predicts barrier heights with a testing MAE of 2.6 kcal mol⁻¹ relative to the coupled-cluster data, making it more accurate than a good density functional theory calculation. Furthermore, our results show that future modeling efforts to estimate reaction properties would significantly benefit from fine-tuning calibration using a transfer learning technique. We anticipate this model will accelerate and improve kinetic predictions for small molecule chemistry.
Affiliation(s)
- Kevin A Spiekermann
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Lagnajit Pattanaik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- William H Green
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
46
Li J, Maravelias CT, Van Lehn RC. Adaptive Conformer Sampling for Property Prediction Using the Conductor-like Screening Model for Real Solvents. Ind Eng Chem Res 2022. [DOI: 10.1021/acs.iecr.2c01163] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/29/2022]
Affiliation(s)
- Jianping Li
- Department of Chemical and Biological Engineering and DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
- Christos T. Maravelias
- Department of Chemical and Biological Engineering and Andlinger Center for Energy and the Environment, Princeton University, Princeton, New Jersey 08540, United States
- Reid C. Van Lehn
- Department of Chemical and Biological Engineering and DOE Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, Wisconsin 53706, United States
47
Guo X, Fang L, Xu Y, Duan W, Rinke P, Todorović M, Chen X. Molecular Conformer Search with Low-Energy Latent Space. J Chem Theory Comput 2022; 18:4574-4585. [PMID: 35696366 PMCID: PMC9281398 DOI: 10.1021/acs.jctc.2c00290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Identifying low-energy conformers with quantum mechanical accuracy for molecules with many degrees of freedom is challenging. In this work, we use the molecular dihedral angles as features and explore the possibility of performing molecular conformer search in a latent space with a generative model named variational auto-encoder (VAE). We bias the VAE towards low-energy molecular configurations to generate more informative data. In this way, we can effectively build a reliable energy model for the low-energy potential energy surface. After the energy model has been built, we extract local-minimum conformations and refine them with structure optimization. We have tested and benchmarked our low-energy latent-space (LOLS) structure search method on organic molecules with 5–9 searching dimensions. Our results agree with previous studies.
Affiliation(s)
- Xiaomi Guo
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China; Department of Applied Physics, Aalto University, Espoo 00076, Finland
- Lincan Fang
- Department of Applied Physics, Aalto University, Espoo 00076, Finland
- Yong Xu
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China; Frontier Science Center for Quantum Information, Beijing 100084, China; RIKEN Center for Emergent Matter Science (CEMS), Wako, Saitama 351-0198, Japan
- Wenhui Duan
- State Key Laboratory of Low Dimensional Quantum Physics and Department of Physics, Tsinghua University, Beijing 100084, China; Frontier Science Center for Quantum Information, Beijing 100084, China; Institute for Advanced Study, Tsinghua University, Beijing 100084, China
- Patrick Rinke
- Department of Applied Physics, Aalto University, Espoo 00076, Finland
- Milica Todorović
- Department of Mechanical and Materials Engineering, University of Turku, FI-20014 Turku, Finland
- Xi Chen
- Department of Applied Physics, Aalto University, Espoo 00076, Finland
48
Lu Q. Identifying molecular structural features by pattern recognition methods. RSC Adv 2022; 12:17559-17569. [PMID: 35765452 PMCID: PMC9192268 DOI: 10.1039/d2ra00764a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2022] [Accepted: 06/06/2022] [Indexed: 11/21/2022]
Abstract
Identification of molecular structural features is a central part of computational chemistry. It would be beneficial if pattern recognition techniques could be incorporated to facilitate the identification. Currently, the quantification of structural dissimilarity is mainly carried out by root-mean-square-deviation (RMSD) calculations, such as in molecular dynamics simulations. However, the RMSD calculation underperforms for large molecules, showing the so-called “curse of dimensionality” problem. It also requires consistent ordering of atoms in the two structures being compared, which takes nontrivial effort to achieve. In this work, we propose to take advantage of point cloud recognition using convex hulls as the basis to recognize molecular structural features. Two advantages of the method can be highlighted. First, the dimension of the input data structure is largely reduced from the number of atoms of molecules to the number of atoms of convex hulls. Therefore, the dimensionality curse problem is avoided, and the atom ordering process is saved. Second, the construction of convex hulls can be used to define new molecular descriptors, such as the contact area of molecular interactions. These new molecular descriptors have different properties from existing ones, therefore they are expected to exhibit different behaviors for certain machine learning studies. Several illustrative applications have been carried out, which provide promising results for structure–activity studies. Identification of molecular structural features by point clouds and convex hulls.
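The atom-ordering requirement of RMSD noted in this abstract can be illustrated with a small Python sketch. It is a plain coordinate RMSD without superposition, written for illustration only and not taken from the cited paper; the function name and the three-atom toy coordinates are assumptions.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists.

    No alignment (rotation/translation fitting) is performed; atom i of
    coords_a is compared directly with atom i of coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2
        for atom_a, atom_b in zip(coords_a, coords_b)
        for xa, xb in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy three-atom geometry (arbitrary units).
molecule = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]

# Identical geometry, identical atom order: RMSD is zero.
same_order = list(molecule)

# Identical geometry, but the first two atoms are listed in swapped order.
swapped_order = [molecule[1], molecule[0], molecule[2]]

rmsd_same = rmsd(molecule, same_order)
rmsd_swapped = rmsd(molecule, swapped_order)
```

Here `rmsd_swapped` is nonzero even though both lists describe the same set of points, which is why RMSD comparisons need an atom-matching step that order-independent descriptors avoid.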
Affiliation(s)
- Qing Lu
- Beijing National Laboratory for Molecular Sciences, Institute of Chemistry, Chinese Academy of Sciences Beijing 100190 China
49
Das S, Tanemura KA, Dinpazhoh L, Keng M, Schumm C, Leahy L, Asef CK, Rainey M, Edison AS, Fernández FM, Merz KM. In Silico Collision Cross Section Calculations to Aid Metabolite Annotation. Journal of the American Society for Mass Spectrometry 2022; 33:750-759. [PMID: 35378036 PMCID: PMC9277703 DOI: 10.1021/jasms.1c00315] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Indexed: 05/28/2023]
Abstract
The interpretation of ion mobility coupled to mass spectrometry (IM-MS) data to predict unknown structures is challenging and depends on accurate theoretical estimates of the molecular ion collision cross section (CCS) against a buffer gas in a low or atmospheric pressure drift chamber. The sensitivity and reliability of computational prediction of CCS values depend on accurately modeling the molecular state over accessible conformations. In this work, we developed an efficient CCS computational workflow using a machine learning model in conjunction with standard DFT methods and CCS calculations. Furthermore, we have performed Traveling Wave IM-MS (TWIMS) experiments to validate the extant experimental values and assess uncertainties in experimentally measured CCS values. The developed workflow yielded accurate structural predictions and provides unique insights into the likely preferred conformation analyzed using IM-MS experiments. The complete workflow makes the computation of CCS values tractable for a large number of conformationally flexible metabolites with complex molecular structures.
Affiliation(s)
- Susanta Das
- Department of Chemistry, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
- Kiyoto Aramis Tanemura
- Department of Chemistry, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
- Laleh Dinpazhoh
- Department of Chemistry, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
- Mithony Keng
- Department of Chemistry, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
- Christina Schumm
- Department of Chemistry, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
- Lydia Leahy
- Department of Chemistry, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
- Carter K Asef
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Markace Rainey
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Arthur S Edison
- Departments of Genetics and Biochemistry, Institute of Bioinformatics and Complex Carbohydrate Center, University of Georgia, 315 Riverbend Road, Athens, Georgia 30602, United States
- Facundo M Fernández
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Petit Institute for Bioengineering and Bioscience, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Kenneth M Merz
- Department of Chemistry, Michigan State University, 578 South Shaw Lane, East Lansing, Michigan 48824, United States
50
Zhou Y, Jiang Y, Chen SJ. RNA-ligand molecular docking: advances and challenges. Wiley Interdisciplinary Reviews: Computational Molecular Science 2022; 12:e1571. [PMID: 37293430 PMCID: PMC10250017 DOI: 10.1002/wcms.1571] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 03/26/2021] [Accepted: 07/20/2021] [Indexed: 12/16/2022]
Abstract
With rapid advances in computer algorithms and hardware, fast and accurate virtual screening has led to a drastic acceleration in selecting potent small molecules as drug candidates. Computational modeling of RNA-small molecule interactions has become an indispensable tool for RNA-targeted drug discovery. The current models for RNA-ligand binding have mainly focused on the docking-and-scoring method. Accurate docking and scoring should tackle four crucial problems: (1) conformational flexibility of ligand, (2) conformational flexibility of RNA, (3) efficient sampling of binding sites and binding poses, and (4) accurate scoring of different binding modes. Moreover, compared with the problem of protein-ligand docking, predicting ligand binding to RNA, a negatively charged polymer, is further complicated by additional effects such as metal ion effects. Thermodynamic models based on physics-based and knowledge-based scoring functions have shown highly encouraging success in predicting ligand binding poses and binding affinities. Recently, kinetic models for ligand binding have further suggested that including dissociation kinetics (residence time) in ligand docking would result in improved performance in estimating in vivo drug efficacy. More recently, the rise of deep-learning approaches has led to new tools for predicting RNA-small molecule binding. In this review, we present an overview of the recently developed computational methods for RNA-ligand docking and their advantages and disadvantages.
Affiliation(s)
- Yuanzhe Zhou
- Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
- Yangwei Jiang
- Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA
- Shi-Jie Chen
- Department of Physics and Astronomy, Department of Biochemistry, Institute of Data Sciences and Informatics, University of Missouri, Columbia, MO 65211-7010, USA