1
Hwang W, Austin SL, Blondel A, Boittier ED, Boresch S, Buck M, Buckner J, Caflisch A, Chang HT, Cheng X, Choi YK, Chu JW, Crowley MF, Cui Q, Damjanovic A, Deng Y, Devereux M, Ding X, Feig MF, Gao J, Glowacki DR, Gonzales JE, Hamaneh MB, Harder ED, Hayes RL, Huang J, Huang Y, Hudson PS, Im W, Islam SM, Jiang W, Jones MR, Käser S, Kearns FL, Kern NR, Klauda JB, Lazaridis T, Lee J, Lemkul JA, Liu X, Luo Y, MacKerell AD, Major DT, Meuwly M, Nam K, Nilsson L, Ovchinnikov V, Paci E, Park S, Pastor RW, Pittman AR, Post CB, Prasad S, Pu J, Qi Y, Rathinavelan T, Roe DR, Roux B, Rowley CN, Shen J, Simmonett AC, Sodt AJ, Töpfer K, Upadhyay M, van der Vaart A, Vazquez-Salazar LI, Venable RM, Warrensford LC, Woodcock HL, Wu Y, Brooks CL, Brooks BR, Karplus M. CHARMM at 45: Enhancements in Accessibility, Functionality, and Speed. J Phys Chem B 2024; 128:9976-10042. [PMID: 39303207 DOI: 10.1021/acs.jpcb.4c04100] [Indexed: 09/22/2024]
Abstract
Since its inception nearly a half century ago, CHARMM has been playing a central role in computational biochemistry and biophysics. Commensurate with the developments in experimental research and advances in computer hardware, the range of methods and applicability of CHARMM have also grown. This review summarizes major developments that occurred after 2009 when the last review of CHARMM was published. They include the following: new faster simulation engines, accessible user interfaces for convenient workflows, and a vast array of simulation and analysis methods that encompass quantum mechanical, atomistic, and coarse-grained levels, as well as extensive coverage of force fields. In addition to providing the current snapshot of the CHARMM development, this review may serve as a starting point for exploring relevant theories and computational methods for tackling contemporary and emerging problems in biomolecular systems. CHARMM is freely available for academic and nonprofit research at https://academiccharmm.org/program.
Affiliation(s)
- Wonmuk Hwang
  - Department of Biomedical Engineering, Texas A&M University, College Station, Texas 77843, United States
  - Department of Materials Science and Engineering, Texas A&M University, College Station, Texas 77843, United States
  - Department of Physics and Astronomy, Texas A&M University, College Station, Texas 77843, United States
  - Center for AI and Natural Sciences, Korea Institute for Advanced Study, Seoul 02455, Republic of Korea
- Steven L Austin
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- Arnaud Blondel
  - Institut Pasteur, Université Paris Cité, CNRS UMR3825, Structural Bioinformatics Unit, 28 rue du Dr. Roux F-75015 Paris, France
- Eric D Boittier
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Stefan Boresch
  - Faculty of Chemistry, Department of Computational Biological Chemistry, University of Vienna, Wahringerstrasse 17, 1090 Vienna, Austria
- Matthias Buck
  - Department of Physiology and Biophysics, Case Western Reserve University, School of Medicine, Cleveland, Ohio 44106, United States
- Joshua Buckner
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Amedeo Caflisch
  - Department of Biochemistry, University of Zürich, CH-8057 Zürich, Switzerland
- Hao-Ting Chang
  - Institute of Bioinformatics and Systems Biology, National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan, ROC
- Xi Cheng
  - Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
- Yeol Kyo Choi
  - Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Jhih-Wei Chu
  - Institute of Bioinformatics and Systems Biology, Department of Biological Science and Technology, Institute of Molecular Medicine and Bioengineering, and Center for Intelligent Drug Systems and Smart Bio-devices (IDS2B), National Yang Ming Chiao Tung University, Hsinchu 30010, Taiwan, ROC
- Michael F Crowley
  - Renewable Resources and Enabling Sciences Center, National Renewable Energy Laboratory, Golden, Colorado 80401, United States
- Qiang Cui
  - Department of Chemistry, Boston University, 590 Commonwealth Avenue, Boston, Massachusetts 02215, United States
  - Department of Physics, Boston University, 590 Commonwealth Avenue, Boston, Massachusetts 02215, United States
  - Department of Biomedical Engineering, Boston University, 44 Cummington Mall, Boston, Massachusetts 02215, United States
- Ana Damjanovic
  - Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, United States
  - Department of Physics and Astronomy, Johns Hopkins University, Baltimore, Maryland 21218, United States
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Yuqing Deng
  - Shanghai R&D Center, DP Technology, Ltd., Shanghai 201210, China
- Mike Devereux
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Xinqiang Ding
  - Department of Chemistry, Tufts University, Medford, Massachusetts 02155, United States
- Michael F Feig
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Jiali Gao
  - School of Chemical Biology & Biotechnology, Peking University Shenzhen Graduate School, Shenzhen, Guangdong 518055, China
  - Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, Guangdong 518055, China
  - Department of Chemistry and Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455, United States
- David R Glowacki
  - CiTIUS Centro Singular de Investigación en Tecnoloxías Intelixentes da USC, 15705 Santiago de Compostela, Spain
- James E Gonzales
  - Department of Biomedical Engineering, Texas A&M University, College Station, Texas 77843, United States
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Mehdi Bagerhi Hamaneh
  - Department of Physiology and Biophysics, Case Western Reserve University, School of Medicine, Cleveland, Ohio 44106, United States
- Ryan L Hayes
  - Department of Chemical and Biomolecular Engineering, University of California, Irvine, Irvine, California 92697, United States
  - Department of Pharmaceutical Sciences, University of California, Irvine, Irvine, California 92697, United States
- Jing Huang
  - Key Laboratory of Structural Biology of Zhejiang Province, School of Life Sciences, Westlake University, Hangzhou, Zhejiang 310024, China
- Yandong Huang
  - College of Computer Engineering, Jimei University, Xiamen 361021, China
- Phillip S Hudson
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
  - Medicine Design, Pfizer Inc., Cambridge, Massachusetts 02139, United States
- Wonpil Im
  - Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Shahidul M Islam
  - Department of Chemistry, Delaware State University, Dover, Delaware 19901, United States
- Wei Jiang
  - Computational Science Division, Argonne National Laboratory, Argonne, Illinois 60439, United States
- Michael R Jones
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Silvan Käser
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Fiona L Kearns
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- Nathan R Kern
  - Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Jeffery B Klauda
  - Department of Chemical and Biomolecular Engineering, Institute for Physical Science and Technology, Biophysics Program, University of Maryland, College Park, Maryland 20742, United States
- Themis Lazaridis
  - Department of Chemistry, City College of New York, New York, New York 10031, United States
- Jinhyuk Lee
  - Disease Target Structure Research Center, Korea Research Institute of Bioscience and Biotechnology, Daejeon 34141, Republic of Korea
  - Department of Bioinformatics, KRIBB School of Bioscience, University of Science and Technology, Daejeon 34141, Republic of Korea
- Justin A Lemkul
  - Department of Biochemistry, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24061, United States
- Xiaorong Liu
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Yun Luo
  - Department of Biotechnology and Pharmaceutical Sciences, College of Pharmacy, Western University of Health Sciences, Pomona, California 91766, United States
- Alexander D MacKerell
  - Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States
- Dan T Major
  - Department of Chemistry and Institute for Nanotechnology & Advanced Materials, Bar-Ilan University, Ramat-Gan 52900, Israel
- Markus Meuwly
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
  - Department of Chemistry, Brown University, Providence, Rhode Island 02912, United States
- Kwangho Nam
  - Department of Chemistry and Biochemistry, University of Texas at Arlington, Arlington, Texas 76019, United States
- Lennart Nilsson
  - Karolinska Institutet, Department of Biosciences and Nutrition, SE-14183 Huddinge, Sweden
- Victor Ovchinnikov
  - Harvard University, Department of Chemistry and Chemical Biology, Cambridge, Massachusetts 02138, United States
- Emanuele Paci
  - Dipartimento di Fisica e Astronomia, Università di Bologna, Bologna 40127, Italy
- Soohyung Park
  - Department of Biological Sciences, Lehigh University, Bethlehem, Pennsylvania 18015, United States
- Richard W Pastor
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Amanda R Pittman
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- Carol Beth Post
  - Borch Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana 47907, United States
- Samarjeet Prasad
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Jingzhi Pu
  - Department of Chemistry and Chemical Biology, Indiana University Indianapolis, Indianapolis, Indiana 46202, United States
- Yifei Qi
  - School of Pharmacy, Fudan University, Shanghai 201203, China
- Daniel R Roe
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Benoit Roux
  - Department of Chemistry, University of Chicago, Chicago, Illinois 60637, United States
- Jana Shen
  - Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, United States
- Andrew C Simmonett
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Alexander J Sodt
  - Eunice Kennedy Shriver National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland 20892, United States
- Kai Töpfer
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Meenu Upadhyay
  - Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Arjan van der Vaart
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- Richard M Venable
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Luke C Warrensford
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- H Lee Woodcock
  - Department of Chemistry, University of South Florida, Tampa, Florida 33620, United States
- Yujin Wu
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Charles L Brooks
  - Department of Chemistry, University of Michigan, Ann Arbor, Michigan 48109, United States
- Bernard R Brooks
  - Laboratory of Computational Biology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Martin Karplus
  - Harvard University, Department of Chemistry and Chemical Biology, Cambridge, Massachusetts 02138, United States
  - Laboratoire de Chimie Biophysique, ISIS, Université de Strasbourg, 67000 Strasbourg, France
2
Yuan ECY, Kumar A, Guan X, Hermes ED, Rosen AS, Zádor J, Head-Gordon T, Blau SM. Analytical ab initio hessian from a deep learning potential for transition state optimization. Nat Commun 2024; 15:8865. [PMID: 39402016 DOI: 10.1038/s41467-024-52481-5] [Received: 03/28/2024] [Accepted: 09/06/2024] [Indexed: 10/17/2024] [Open Access]
Abstract
Identifying transition states (saddle points on the potential energy surface connecting reactant and product minima) is central to predicting kinetic barriers and understanding chemical reaction mechanisms. In this work, we train a fully differentiable equivariant neural network potential, NewtonNet, on thousands of organic reactions and derive the analytical Hessians. By reducing the computational cost by several orders of magnitude relative to the density functional theory (DFT) ab initio source, we can afford to use the learned Hessians at every step for the saddle point optimizations. We show that the full machine learned (ML) Hessian robustly finds the transition states of 240 unseen organic reactions, even when the quality of the initial guess structures is degraded, while reducing the number of optimization steps to convergence by 2-3× compared to the quasi-Newton DFT and ML methods. All data generation, NewtonNet model, and ML transition state finding methods are available in an automated workflow.
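The idea of using a full Hessian at every optimization step can be illustrated with a minimal sketch. It is not the NewtonNet workflow: the two-dimensional energy surface, the function names, and the starting guess below are all assumptions chosen for illustration, and a plain Newton-Raphson update stands in for whatever saddle point optimizer the authors use.

```python
# Toy 2D potential energy surface (an assumption for illustration only):
#   E(x, y) = (x^2 - 1)^2 + y^2 + 0.3*x*y
# It has two minima near x = +/-1 and a first-order saddle point at the origin.

def gradient(x, y):
    """Analytical gradient of the toy surface."""
    return (4.0 * x**3 - 4.0 * x + 0.3 * y, 2.0 * y + 0.3 * x)

def hessian(x, y):
    """Analytical Hessian of the toy surface as (hxx, hxy, hyy)."""
    return (12.0 * x**2 - 4.0, 0.3, 2.0)

def newton_stationary_point(x, y, tol=1e-10, max_steps=50):
    """Full Newton-Raphson: solve H * step = -g with the exact Hessian each time."""
    for n_steps in range(max_steps + 1):
        gx, gy = gradient(x, y)
        if (gx * gx + gy * gy) ** 0.5 < tol:
            return x, y, n_steps  # n_steps Newton updates were applied
        hxx, hxy, hyy = hessian(x, y)
        det = hxx * hyy - hxy * hxy
        # Closed-form inverse of the symmetric 2x2 Hessian applied to -g.
        dx = -(hyy * gx - hxy * gy) / det
        dy = -(-hxy * gx + hxx * gy) / det
        x, y = x + dx, y + dy
    raise RuntimeError("Newton iteration did not converge")

def is_first_order_saddle(x, y):
    """In 2D, a negative Hessian determinant means exactly one negative eigenvalue."""
    hxx, hxy, hyy = hessian(x, y)
    return hxx * hyy - hxy * hxy < 0.0

# Start from a deliberately displaced guess near the saddle region.
x_ts, y_ts, n_steps = newton_stationary_point(0.3, 0.5)
print(f"stationary point: ({x_ts:.6f}, {y_ts:.6f}) after {n_steps} steps")
print("first-order saddle:", is_first_order_saddle(x_ts, y_ts))
```

Newton-Raphson converges to the nearest stationary point of any kind, so the final Hessian check is what confirms that a saddle point, and not a minimum, was found. Quasi-Newton methods replace the exact Hessian with an approximation built from successive gradients.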
Affiliation(s)
- Eric C-Y Yuan
  - Kenneth S. Pitzer Theory Center and Department of Chemistry, University of California, Berkeley, CA, USA
  - Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Anup Kumar
  - Energy Technologies Area, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Xingyi Guan
  - Kenneth S. Pitzer Theory Center and Department of Chemistry, University of California, Berkeley, CA, USA
  - Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Andrew S Rosen
  - Materials Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Department of Materials Science and Engineering, University of California, Berkeley, CA, USA
- Judit Zádor
  - Combustion Research Facility, Sandia National Laboratories, Livermore, CA, USA
- Teresa Head-Gordon
  - Kenneth S. Pitzer Theory Center and Department of Chemistry, University of California, Berkeley, CA, USA
  - Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
  - Departments of Bioengineering and Chemical and Biomolecular Engineering, University of California, Berkeley, CA, USA
- Samuel M Blau
  - Energy Technologies Area, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
3
Eastman P, Pritchard BP, Chodera JD, Markland TE. Nutmeg and SPICE: Models and Data for Biomolecular Machine Learning. J Chem Theory Comput 2024; 20:8583-8593. [PMID: 39318326 DOI: 10.1021/acs.jctc.4c00794] [Indexed: 09/26/2024]
Abstract
We describe version 2 of the SPICE data set, a collection of quantum chemistry calculations for training machine learning potentials. It expands on the original data set by adding much more sampling of chemical space and more data on noncovalent interactions. We train a set of potential energy functions called Nutmeg on it. They are based on the TensorNet architecture. They use a novel mechanism to improve performance on charged and polar molecules, injecting precomputed partial charges into the model to provide a reference for the large-scale charge distribution. Evaluation of the new models shows that they do an excellent job of reproducing energy differences between conformations even on highly charged molecules or ones that are significantly larger than the molecules in the training set. They also produce stable molecular dynamics trajectories and are fast enough to be useful for routine simulation of small molecules.
Affiliation(s)
- Peter Eastman
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Benjamin P Pritchard
  - Molecular Sciences Software Institute, Virginia Polytechnic Institute and State University, Blacksburg, Virginia 24060, United States
- John D Chodera
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
- Thomas E Markland
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
4
Yu Z, Jackson NE. Chemically Transferable Electronic Coarse Graining for Polythiophenes. J Chem Theory Comput 2024. [PMID: 39370933 DOI: 10.1021/acs.jctc.4c00804] [Indexed: 10/08/2024]
Abstract
Recent advances in machine-learning-based electronic coarse graining (ECG) methods have demonstrated the potential to enable electronic predictions in soft materials at mesoscopic length scales. However, previous ECG models have yet to confront the issue of chemical transferability. In this study, we develop chemically transferable ECG models for polythiophenes using graph neural networks. Our models are trained on a data set that samples over the conformational space of random polythiophene sequences generated with 15 different monomer chemistries and three different degrees of polymerization. We systematically explore the impact of coarse-grained representation on ECG accuracy, highlighting the significance of preserving the C-β coordinates in thiophene. We also find that integrating unique polymer sequences into training enhances the model performance more efficiently than augmenting conformational sampling for sequences already in the training data set. Moreover, our ECG models, developed initially for one property and one level of quantum chemical theory, can be efficiently transferred to related properties and higher levels of theory with minimal additional data. The chemically transferable ECG model introduced in this work will serve as a foundation model for new classes of chemically transferable ECG predictions across chemical space.
Affiliation(s)
- Zheng Yu
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Nicholas E Jackson
  - Department of Chemistry, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
5
Kumar D, Harris AL, Luo YL. Molecular permeation through large pore channels: computational approaches and insights. J Physiol 2024. [PMID: 39373834 DOI: 10.1113/jp285198] [Received: 05/01/2024] [Accepted: 09/06/2024] [Indexed: 10/08/2024] [Open Access]
Abstract
Computational methods such as molecular dynamics (MD) have illuminated how single-atom ions permeate membrane channels and how selectivity among them is achieved. Much less is understood about molecular permeation through eukaryotic channels that mediate the flux of small molecules (e.g. connexins, pannexins, LRRC8s, CALHMs). Here we describe computational methods that have been profitably employed to explore the movements of molecules through wide pores, revealing mechanistic insights, guiding experiments, and suggesting testable hypotheses. This review illustrates MD techniques such as voltage-driven flux, potential of mean force, and mean first-passage-time calculations, as applied to molecular permeation through wide pores. These techniques have enabled detailed and quantitative modeling of molecular interactions and movement of permeants at the atomic level. We highlight novel contributors to the transit of molecules through these wide pathways. In particular, the flexibility and anisotropic nature of permeant molecules, coupled with the dynamics of pore-lining residues, lead to bespoke permeation dynamics. As more eukaryotic large-pore channel structures and functional data become available, these insights and approaches will be important for understanding the physical principles underlying molecular permeation and as guides for experimental design.
Affiliation(s)
- Deepak Kumar
  - Department of Biotechnology and Pharmaceutical Sciences, Western University of Health Sciences, Pomona, CA, USA
- Andrew L Harris
  - Department of Pharmacology, Physiology, and Neuroscience, New Jersey Medical School, Rutgers, The State University of New Jersey, Newark, NJ, USA
- Yun Lyna Luo
  - Department of Biotechnology and Pharmaceutical Sciences, Western University of Health Sciences, Pomona, CA, USA
6
Berenger F, Tsuda K. An ANI-2 enabled open-source protocol to estimate ligand strain after docking. J Comput Chem 2024. [PMID: 39367774 DOI: 10.1002/jcc.27478] [Received: 04/24/2024] [Revised: 07/22/2024] [Accepted: 07/27/2024] [Indexed: 10/07/2024]
Abstract
In protein-ligand docking, the score assigned to a protein-ligand complex is approximate. In particular, the internal energy of the ligand is difficult to compute precisely using a molecular mechanics based force field, introducing significant noise in the rank-ordering of ligands. We propose an open-source protocol (https://github.com/UnixJunkie/MMO), using two quantum mechanics (QM) single point energy calculations, plus a Monte Carlo (MC) based ligand minimization procedure in between, to estimate ligand strain after docking. The MC simulation uses the ANI-2x (QM approximating) force field and is performed in the dihedral space. On some protein targets, strain filtering after docking significantly improves hit rates. We performed a structure-based virtual screening campaign on nine protein targets from the Laboratoire d'Innovation Thérapeutique-PubChem assays dataset using Cambridge Crystallographic Data Centre Genetic Optimization for Ligand Docking (GOLD). Then, docked ligands were submitted to the strain estimation protocol and the impact on hit rate was analyzed. As with docking, the method does not always work. However, if sufficient active and inactive molecules are known for a given protein target, its efficiency can be evaluated.
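The strain estimate described above (energy of the docked pose minus energy after Monte Carlo relaxation in dihedral space) can be sketched as follows. This is only an illustration under stated assumptions: the three-fold cosine energy is a toy stand-in for both the QM single point calculations and the ANI-2x force field, and `torsion_energy`, `mc_minimize`, and the starting angles are hypothetical names and values, not part of the authors' MMO code.

```python
import math
import random

def torsion_energy(dihedrals, barrier=1.0):
    """Toy energy: a three-fold cosine term per rotatable bond (arbitrary units)."""
    return sum(barrier * (1.0 + math.cos(3.0 * phi)) for phi in dihedrals)

def mc_minimize(dihedrals, n_steps=5000, max_step=0.3, kT=0.05, seed=0):
    """Metropolis Monte Carlo in dihedral space; returns the lowest-energy pose seen."""
    rng = random.Random(seed)
    current = list(dihedrals)
    e_current = torsion_energy(current)
    best, e_best = list(current), e_current
    for _ in range(n_steps):
        trial = list(current)
        i = rng.randrange(len(trial))
        trial[i] += rng.uniform(-max_step, max_step)  # perturb one dihedral
        e_trial = torsion_energy(trial)
        delta = e_trial - e_current
        # Metropolis criterion: always accept downhill, sometimes accept uphill.
        if delta <= 0.0 or rng.random() < math.exp(-delta / kT):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best

docked_pose = [0.0, 0.2, 3.0]  # hypothetical dihedral angles (radians) after docking
e_docked = torsion_energy(docked_pose)
relaxed_pose, e_relaxed = mc_minimize(docked_pose)
strain = e_docked - e_relaxed  # energy released by relaxing the ligand alone
print(f"docked: {e_docked:.3f}  relaxed: {e_relaxed:.3f}  strain: {strain:.3f}")
```

Because the search tracks the best pose seen and starts from the docked pose itself, the estimated strain can never be negative. Ligands whose strain exceeds a chosen threshold would then be filtered out before ranking.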
Affiliation(s)
- Francois Berenger
  - Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan
- Koji Tsuda
  - Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa, Japan
7
Nikidis E, Kyriakopoulos N, Tohid R, Kachrimanis K, Kioseoglou J. Harnessing machine learning for efficient large-scale interatomic potential for sildenafil and pharmaceuticals containing H, C, N, O, and S. Nanoscale 2024; 16:18014-18026. [PMID: 39252581 DOI: 10.1039/d4nr00929k] [Indexed: 09/11/2024]
Abstract
In this study, a cutting-edge approach to producing accurate and computationally efficient interatomic potentials using machine learning algorithms is presented. Specifically, the study focuses on the application of Allegro, a novel machine learning algorithm, running on high-performance GPUs for training potentials. The choice of training parameters plays a pivotal role in the quality of the potential functions. To enable this methodology, the "Solvated Protein Fragments" dataset, containing nearly 2.7 million Density Functional Theory (DFT) calculations for many-body intermolecular interactions involving protein fragments and water molecules, encompassing H, C, N, O, and S elements, is considered as the training dataset. The project optimizes computational efficiency by reducing the initial dataset size according to the intended application. To assess the efficacy of the approach, sildenafil citrate, iso-sildenafil, aspirin, ibuprofen, mebendazole and urea, representing all five relevant elements, serve as the test bed. The results of the Allegro-trained potentials demonstrate outstanding performance, benefiting from the combination of an appropriate training dataset and parameter selection. Aided by GPU acceleration, this notably enhances computational efficiency compared with the computationally intensive DFT method. Validation of the produced interatomic potentials is achieved through Allegro's own evaluation mechanism, yielding exceptional accuracy. Further verification is carried out through LAMMPS molecular dynamics simulations. Structural optimization by energy minimization and NPT Molecular Dynamics simulations are performed for each potential, assessing relaxation processes and energy reduction. Additional structures, including urea, ammonia, uracil, oxalic acid, and acetic acid, are tested, highlighting the potential's versatility in describing systems containing the aforementioned elements. Visualization of the results confirms the scientific accuracy of each structure's relaxation. The findings of this study demonstrate strong scaling and the potential for applications in pharmaceutical research, allowing the exploration of larger molecular structures not previously amenable to computational analysis at this level of accuracy. The success of the machine learning approach underscores its potential to revolutionize computational solid-state physics.
Affiliation(s)
- E Nikidis
  - Physics Department, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
  - Center for Interdisciplinary Research & Innovation, Aristotle University of Thessaloniki, Greece
- N Kyriakopoulos
  - Physics Department, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
  - Center for Interdisciplinary Research & Innovation, Aristotle University of Thessaloniki, Greece
- R Tohid
  - Center of Computation and Technology, Louisiana State University, 70803 Baton Rouge, USA
- K Kachrimanis
  - Center for Interdisciplinary Research & Innovation, Aristotle University of Thessaloniki, Greece
  - Pharmaceutical Technology Department, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
- J Kioseoglou
  - Physics Department, Aristotle University of Thessaloniki, 54124 Thessaloniki, Greece
  - Center for Interdisciplinary Research & Innovation, Aristotle University of Thessaloniki, Greece
8
Xu Y, Jin Y, García Sánchez JS, Pérez-Lemus GR, Zubieta Rico PF, Delferro M, de Pablo JJ. A Molecular View of Methane Activation on Ni(111) through Enhanced Sampling and Machine Learning. J Phys Chem Lett 2024; 15:9852-9862. [PMID: 39298736 DOI: 10.1021/acs.jpclett.4c02237] [Indexed: 09/22/2024]
Abstract
A combination of machine learned interatomic potentials (MLIPs) and enhanced sampling simulations is used to investigate the activation of methane on a Ni(111) surface. The work entails the development and iterative refinement of MLIPs, initially trained on a data set constructed via ab initio molecular dynamics simulations, supplemented by adaptive biasing forces, to enrich the sampling of catalytically relevant configurations. Our results reveal that upon incorporation of collective variables that capture the behavior of the reactant molecule, as well as additional frames that describe the dynamic response of the catalytic surface, it is possible to enhance considerably the accuracy of predicted energies and forces. By employing enhanced sampling schemes in the refinement of the MLIP, we systematically explore the potential energy surface, leading to a refined MLIP capable of predicting density functional theory-level energies and forces and replicating key geometric characteristics of the catalytic system. The resulting free energy landscapes at several temperatures provide a detailed view of the thermodynamics and dynamics of methane activation. Specifically, as methane approaches and dissociates on the catalytic surface, the process involves the dynamic interplay of CH4 and the Ni catalyst that includes both enthalpic and entropic contributions. The progression toward the transition state involves a CH4 moiety that is increasingly restrained in its ability to rotate or translate, while the stage following the transition state is characterized by a notable rise of the Ni atom that interacts with the cleaved C-H bond. This leads to an increase in the mobility of the adsorbed species, a feature that becomes more pronounced at higher temperatures.
Affiliation(s)
- Yinan Xu
  - Pritzker School of Molecular Engineering, The University of Chicago, 640 South Ellis Avenue, Chicago, Illinois 60637, United States
- Yezhi Jin
  - Pritzker School of Molecular Engineering, The University of Chicago, 640 South Ellis Avenue, Chicago, Illinois 60637, United States
- Jireh S García Sánchez
  - Pritzker School of Molecular Engineering, The University of Chicago, 640 South Ellis Avenue, Chicago, Illinois 60637, United States
- Gustavo R Pérez-Lemus
  - Pritzker School of Molecular Engineering, The University of Chicago, 640 South Ellis Avenue, Chicago, Illinois 60637, United States
- Pablo F Zubieta Rico
  - Pritzker School of Molecular Engineering, The University of Chicago, 640 South Ellis Avenue, Chicago, Illinois 60637, United States
- Massimiliano Delferro
  - Chemical Sciences and Engineering Division, Argonne National Laboratory, 9700 South Cass Avenue, Lemont, Illinois 60439, United States
- Juan J de Pablo
  - Pritzker School of Molecular Engineering, The University of Chicago, 640 South Ellis Avenue, Chicago, Illinois 60637, United States
  - Materials Science Division, Argonne National Laboratory, 9700 South Cass Avenue, Lemont, Illinois 60439, United States
9
Hou YF, Zhang Q, Dral PO. Surprising Dynamics Phenomena in the Diels-Alder Reaction of C60 Uncovered with AI. J Org Chem 2024. [PMID: 39358911 DOI: 10.1021/acs.joc.4c01763] [Indexed: 10/04/2024]
Abstract
We performed an extensive artificial intelligence-accelerated quasi-classical molecular dynamics investigation of the time-resolved mechanism of the Diels-Alder reaction of fullerene C60 with 2,3-dimethyl-1,3-butadiene. In a substantial fraction (10%) of reactive trajectories, the larger C60 noncovalently attracts the 2,3-dimethyl-1,3-butadiene long before the barrier so that the diene undergoes a series of complex motions including roaming, somersaults, twisting, and twisting somersaults around the fullerene until it aligns itself to pass over the barrier. These complicated processes could be easily missed in typically performed quantum chemical simulations with shorter and fewer trajectories. After the barrier is passed, the bonds take longer to form compared to the simplest prototypical Diels-Alder reaction of ethene with 1,3-butadiene despite high similarities in transition states and barrier widths evaluated with intrinsic reaction coordinate (IRC) calculations. C60 is mainly responsible for these differences as its reaction with 1,3-butadiene is similar to the reaction with 2,3-dimethyl-1,3-butadiene: the only substantial difference being that the extra methyl groups double the probability of the prolonged alignment phase in dynamics. These additional calculations of C60 with 1,3-butadiene could be performed via active learning more easily by reusing the data generated for the other two reactions, showing the potential for larger-scale exploration of the effects of different substrates in the same types of reactions.
Affiliation(s)
- Yi-Fan Hou
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
| | - Quanhao Zhang
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
| | - Pavlo O Dral
- State Key Laboratory of Physical Chemistry of Solid Surfaces, College of Chemistry and Chemical Engineering, Fujian Provincial Key Laboratory of Theoretical and Computational Chemistry, and Innovation Laboratory for Sciences and Technologies of Energy Materials of Fujian Province (IKKEM), Xiamen University, Xiamen, Fujian 361005, China
- Institute of Physics, Faculty of Physics, Astronomy, and Informatics, Nicolaus Copernicus University in Toruń, Ul. Grudziądzka 5, Toruń 87-100, Poland
|
10
|
See TJ, Zhang D, Boley M, Chalmers DK. Graph Neural Network-Based Molecular Property Prediction with Patch Aggregation. J Chem Theory Comput 2024. [PMID: 39356714 DOI: 10.1021/acs.jctc.4c00798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/04/2024]
Abstract
Graph neural networks (GNNs) have emerged as powerful tools for quantum chemical property prediction, leveraging the inherent graph structure of molecular systems. GNNs depend on an edge-to-node aggregation mechanism for combining edge representations into node representations. Unfortunately, existing learnable edge-to-node aggregation methods substantially increase the number of parameters and, thus, the computational cost relative to simple sum aggregation. Worse, as we report here, they often fail to improve predictive accuracy. We therefore propose a novel learnable edge-to-node aggregation mechanism that aims to improve the accuracy and parameter efficiency of GNNs in predicting molecular properties. The new mechanism, called "patch aggregation", is inspired by the Multi-Head Attention and Mixture of Experts machine learning techniques. We have incorporated the patch aggregation method into the specialized, state-of-the-art GNN models SchNet, DimeNet++, SphereNet, TensorNet, and VisNet and show that patch aggregation consistently outperforms existing learnable and nonlearnable aggregation techniques (sum, multilayer perceptron, softmax, and set transformer aggregation) in the prediction of molecular properties such as QM9 thermodynamic properties and MD17 molecular dynamics trajectory energies and forces. We also find that patch aggregation not only improves prediction accuracy but also is parameter-efficient, making it an attractive option for practical applications for which computational resources are limited. Further, we show that patch aggregation can be applied across different GNN models. Overall, patch aggregation is a powerful edge-to-node aggregation mechanism that improves the accuracy of molecular property predictions by GNNs.
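As a point of reference for the baseline named in this abstract, the following is a minimal sketch of plain sum aggregation, in which every edge feature vector is added into the node it points to. It is not the patch aggregation method itself, whose details the abstract does not give; the function name `sum_aggregate` and the edge-list layout are assumptions made for illustration only.

```python
def sum_aggregate(num_nodes, edges, edge_features):
    """Nonlearnable edge-to-node sum aggregation (illustrative sketch).

    edges         : list of (source, target) node index pairs
    edge_features : one feature vector (list of floats) per edge
    Returns one aggregated feature vector per node.
    """
    dim = len(edge_features[0]) if edge_features else 0
    nodes = [[0.0] * dim for _ in range(num_nodes)]
    for (_, target), feature in zip(edges, edge_features):
        # Each edge contributes its features to its target node.
        for k, value in enumerate(feature):
            nodes[target][k] += value
    return nodes


# Hypothetical three-node graph with three directed edges.
example = sum_aggregate(
    3,
    [(0, 1), (2, 1), (1, 0)],
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
)
```

Sum aggregation has no trainable parameters, which is why the abstract treats it as the cost baseline against which learnable schemes are compared.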
Affiliation(s)
- Teng Jiek See
- Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Monash University, 381 Royal Parade, Parkville, VIC 3068, Australia
| | - Daokun Zhang
- School of Computer Science, University of Nottingham Ningbo China, 199 Taikang East Road, Ningbo 315100, China
| | - Mario Boley
- Department of Data Science and AI, Faculty of Information Technology, Monash University, Clayton Campus, Building 63, 25 Exhibition Walk, VIC 3800, Australia
| | - David K Chalmers
- Medicinal Chemistry, Monash Institute of Pharmaceutical Sciences, Monash University, 381 Royal Parade, Parkville, VIC 3068, Australia
|
11
|
Shirani H, Hashemianzadeh SM. Quantum-level machine learning calculations of Levodopa. Comput Biol Chem 2024; 112:108146. [PMID: 39067350 DOI: 10.1016/j.compbiolchem.2024.108146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Revised: 06/20/2024] [Accepted: 07/08/2024] [Indexed: 07/30/2024]
Abstract
Many drug molecules contain functional groups, resulting in a torsional barrier corresponding to rotation around the bond linking the fragments. In medicinal chemistry and pharmaceutical sciences, inclusive of drug design studies, the exact calculation of the potential energy surface (PES) of these molecular torsions is extremely important. Machine learning (ML), including deep learning (DL), is currently one of the most rapidly evolving tools in computer-aided drug discovery and molecular simulations. In this work, we used the ANI-1x neural network potential as a quantum-level ML method to predict the PESs of the L-3,4-dihydroxyphenylalanine (Levodopa) antiparkinsonian drug molecule. The electronic energies and structural parameters calculated by density functional theory (DFT) using the wB97X method and all possible Pople's basis sets indicated that the 6-31G(d) basis set, when used with the wB97X functional, exhibits behavior similar to that of the ANI-1x model. The vibrational frequencies investigation showed a linear correlation between DFT and ML data. All ANI-1x calculations were completed in a very short computing time. From this perspective, we expect the ANI-1x dataset applied in this work to be appreciably efficient and effective in computational structure-based drug design studies.
Affiliation(s)
- Hossein Shirani
- Molecular Simulation Research Laboratory, Department of Chemistry, Iran University of Science and Technology, P.O. Box 16846-13114, Tehran, Iran.
| | - Seyed Majid Hashemianzadeh
- Molecular Simulation Research Laboratory, Department of Chemistry, Iran University of Science and Technology, P.O. Box 16846-13114, Tehran, Iran.
|
12
|
Haghiri S, Viquez Rojas C, Bhat S, Isayev O, Slipchenko L. ANI/EFP: Modeling Long-Range Interactions in ANI Neural Network with Effective Fragment Potentials. J Chem Theory Comput 2024. [PMID: 39352841 DOI: 10.1021/acs.jctc.4c01052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/04/2024]
Abstract
Deep learning Neural Networks (NN) have been developed in the field of molecular modeling for the purpose of circumventing the high computational cost of quantum-mechanical calculations while rivaling their accuracies. Although these networks have found great success, they generally lack the ability to accurately describe long-range interactions, which makes them unusable for extended molecular systems. Herein, we provide a method for partially retraining the deep learning general-use neural network ANI, in which the long-range interactions are represented via atomic electrostatic potentials. The electrostatic potentials, generated with polarizable effective fragment potentials (EFP), are used as an additional input feature for the network. This new ANI/EFP network can predict solute-solvent interaction energies on a trained data set with a kcal/mol accuracy. It also shows promise in predicting the interaction energies of a solute in solvent environments that have not been included in a training data set. The proposed protocol can be taken as an example and further developed, leading to highly accurate and transferable neural network potentials capable of handling long-range interactions and extended molecular systems.
Affiliation(s)
- Shahed Haghiri
- Department of Chemistry, Purdue University, 560 Oval Drive, West Lafayette, Indiana 47907-2084, United States
| | - Claudia Viquez Rojas
- Department of Chemistry, Purdue University, 560 Oval Drive, West Lafayette, Indiana 47907-2084, United States
| | - Sriram Bhat
- Department of Computer Science, The University of Texas at Dallas, 800 W. Campbell Road, Richardson, Texas 75080, United States
| | - Olexandr Isayev
- Department of Chemistry, Carnegie Mellon University, 4400 Fifth Avenue, Pittsburgh, Pennsylvania 15213, United States
| | - Lyudmila Slipchenko
- Department of Chemistry, Purdue University, 560 Oval Drive, West Lafayette, Indiana 47907-2084, United States
|
13
|
Urquhart RJ, van Teijlingen A, Tuttle T. ANI neural network potentials for small molecule pKa prediction. Phys Chem Chem Phys 2024; 26:23934-23943. [PMID: 39235138 DOI: 10.1039/d4cp01982b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/06/2024]
Abstract
The pKa value of a molecule is of interest to chemists across a broad spectrum of fields including pharmacology, environmental chemistry and theoretical chemistry. Determination of pKa values can be accomplished through several experimental methods such as NMR techniques and titration together with computational techniques such as DFT calculations. However, all of these methods remain time consuming and computationally expensive. In this work we develop a method for the rapid calculation of pKa values of small molecules which utilises a combination of neural network potentials, low energy conformer searches and thermodynamic cycles. We show that neural network potentials trained on different phase and charge states can be employed in tandem to predict the full thermodynamic energy cycle of molecules. Focusing here on imidazolium derived carbene species, the method utilised can easily be extended to other functional groups of interest such as amines with further training.
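The thermodynamic-cycle arithmetic mentioned in this abstract can be sketched as follows. This is a generic direct cycle for HA(aq) → A(aq) + H+(aq), not necessarily the exact cycle used by the authors. The function names are hypothetical, all free energies are assumed to be in kcal/mol, the proton solvation free energy is left as a caller-supplied input, and standard-state corrections are omitted.

```python
import math

R_KCAL_PER_MOL_K = 1.98720425864083e-3  # gas constant in kcal/(mol K)


def deprotonation_dg_aq(dg_gas, dg_solv_acid, dg_solv_base, dg_solv_proton):
    """Aqueous deprotonation free energy from a direct thermodynamic cycle.

    dG_aq = dG_gas + dG_solv(A) + dG_solv(H+) - dG_solv(HA)
    (standard-state corrections omitted in this sketch).
    """
    return dg_gas + dg_solv_base + dg_solv_proton - dg_solv_acid


def pka_from_dg(dg_aq, temperature=298.15):
    """Convert an aqueous deprotonation free energy to pKa.

    pKa = dG_aq / (R T ln 10)
    """
    return dg_aq / (R_KCAL_PER_MOL_K * temperature * math.log(10))
```

At 298.15 K the factor R T ln 10 is about 1.364 kcal/mol, so an error of roughly 1.4 kcal/mol in the predicted free energy shifts the pKa by one unit.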
Affiliation(s)
- Ross James Urquhart
- Department of Pure and Applied Chemistry, University of Strathclyde, 295 Cathedral Street, Glasgow, G1 1XL, UK.
| | - Alexander van Teijlingen
- Department of Pure and Applied Chemistry, University of Strathclyde, 295 Cathedral Street, Glasgow, G1 1XL, UK.
| | - Tell Tuttle
- Department of Pure and Applied Chemistry, University of Strathclyde, 295 Cathedral Street, Glasgow, G1 1XL, UK.
|
14
|
DelloStritto M, Klein ML. Understanding Strain and Failure of a Knot in Polyethylene Using Molecular Dynamics with Machine-Learned Potentials. J Phys Chem Lett 2024; 15:9070-9077. [PMID: 39197116 DOI: 10.1021/acs.jpclett.4c01845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/30/2024]
Abstract
A neural network potential (NNP) has been developed by fitting to ab initio electronic structure data on hydrocarbons and is used to study failure of linear and knotted polyethylene (PE) chains. A linear PE chain must be highly strained before breaking as the stress is equally distributed across the chain. In contrast, the stress in a PE chain with a 3₁ (overhand) knot accumulates at the knot's entrance/exit. We find the strain energy is greatest when the bond length and angle are strained simultaneously, and that the knot weakens the chain by increasing the variance of the C-C-C angle, thereby allowing rupture at lower bond strains. We extend our analysis to both 5₁ and 5₂ knots and find that both break at the entrance/exit of a loop. Notably, molecular scale PE knots exhibit many of the same characteristics as knots in a macroscopic rope, with stick-slip phenomena upon tightening and similar points of failure.
Affiliation(s)
- Mark DelloStritto
- Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania 19122, United States
| | - Michael L Klein
- Institute for Computational Molecular Science, Temple University, Philadelphia, Pennsylvania 19122, United States
|
15
|
Tu NTP, Williamson S, Johnson ER, Rowley CN. Modeling Intermolecular Interactions with Exchange-Hole Dipole Moment Dispersion Corrections to Neural Network Potentials. J Phys Chem B 2024; 128:8290-8302. [PMID: 39166778 DOI: 10.1021/acs.jpcb.4c02882] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/23/2024]
Abstract
Neural network potentials (NNPs) are an innovative approach for calculating the potential energy and forces of a chemical system. In principle, these methods are capable of modeling large systems with an accuracy approaching that of a high-level ab initio calculation, but with a much smaller computational cost. Due to their training to density-functional theory (DFT) data and neglect of long-range interactions, some classes of NNPs require an additional term to include London dispersion physics. In this Perspective, we discuss the requirements for a dispersion model for use with an NNP, focusing on the MLXDM (Machine Learned eXchange-Hole Dipole Moment) model developed by our groups. This model is based on the DFT-based XDM dispersion correction, which calculates interatomic dispersion coefficients in terms of atomic moments and polarizabilities, both of which can be approximated effectively using neural networks.
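To illustrate how a dispersion coefficient can be built from atomic moments and polarizabilities, the sketch below uses the leading-order (C6) expression of the XDM model together with Becke-Johnson-style damping. It is a simplified illustration, not the MLXDM implementation: higher-order C8 and C10 terms are left out, the function names are hypothetical, and in MLXDM the moments and polarizabilities would come from neural network predictions rather than being passed in by hand.

```python
def xdm_c6(alpha_i, alpha_j, m2_i, m2_j):
    """Leading-order XDM pair dispersion coefficient.

    alpha_* : atom-in-molecule polarizabilities
    m2_*    : expectation values of the squared exchange-hole dipole moment
    C6_ij = a_i a_j <M^2>_i <M^2>_j / (<M^2>_i a_j + <M^2>_j a_i)
    """
    return (alpha_i * alpha_j * m2_i * m2_j) / (m2_i * alpha_j + m2_j * alpha_i)


def damped_c6_energy(c6, distance, r_vdw):
    """Pairwise C6 dispersion energy with Becke-Johnson-style damping.

    E = -C6 / (r_vdw^6 + R^6); finite as R -> 0, -C6/R^6 at long range.
    """
    return -c6 / (r_vdw**6 + distance**6)


# Hypothetical values in arbitrary consistent units.
c6_example = xdm_c6(1.0, 1.0, 2.0, 2.0)
energy_example = damped_c6_energy(c6_example, 1.0, 1.0)
```

The damping length `r_vdw` keeps the correction from diverging at short range, where the underlying potential already describes the interaction.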
Affiliation(s)
| | - Siri Williamson
- Department of Chemistry, Carleton University, Ottawa, Ontario K1S 5B6, Canada
| | - Erin R Johnson
- Department of Chemistry, Dalhousie University, Halifax, Nova Scotia B3H 4J3, Canada
|
16
|
Hou P, Tian Y, Meng X. Improving Molecular-Dynamics Simulations for Solid-Liquid Interfaces with Machine-Learning Interatomic Potentials. Chemistry 2024; 30:e202401373. [PMID: 38877181 DOI: 10.1002/chem.202401373] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/07/2024] [Revised: 06/13/2024] [Accepted: 06/14/2024] [Indexed: 06/16/2024]
Abstract
Emerging developments in artificial intelligence have opened infinite possibilities for material simulation. Depending on the powerful fitting of machine learning algorithms to first-principles data, machine learning interatomic potentials (MLIPs) can effectively balance the accuracy and efficiency problems in molecular dynamics (MD) simulations, serving as powerful tools in various complex physicochemical systems. Consequently, this brings unprecedented enthusiasm for researchers to apply such novel technology in multiple fields to revisit the major scientific problems that have remained controversial owing to the limitations of previous computational methods. Herein, we introduce the evolution of MLIPs, provide valuable application examples for solid-liquid interfaces, and present current challenges. Driven by solving multitudinous difficulties in terms of the accuracy, efficiency, and versatility of MLIPs, this booming technique, combined with molecular simulation methods, will provide an underlying and valuable understanding of interdisciplinary scientific challenges, including materials, physics, and chemistry.
Affiliation(s)
- Pengfei Hou
- Key Laboratory of Physics and Technology for Advanced Batteries (Ministry of Education), College of Physics, Jilin University, Changchun, 130012, China
- Key Laboratory of Material Simulation Methods and Software of Ministry of Education, College of Physics, Jilin University, Changchun, 130012, China
| | - Yumiao Tian
- Key Laboratory of Physics and Technology for Advanced Batteries (Ministry of Education), College of Physics, Jilin University, Changchun, 130012, China
- Key Laboratory of Material Simulation Methods and Software of Ministry of Education, College of Physics, Jilin University, Changchun, 130012, China
| | - Xing Meng
- Key Laboratory of Physics and Technology for Advanced Batteries (Ministry of Education), College of Physics, Jilin University, Changchun, 130012, China
- Key Laboratory of Material Simulation Methods and Software of Ministry of Education, College of Physics, Jilin University, Changchun, 130012, China
|
17
|
Faraji S, Liu M. Transferable machine learning interatomic potential for carbon hydrogen systems. Phys Chem Chem Phys 2024; 26:22346-22358. [PMID: 39140158 DOI: 10.1039/d4cp02300e] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2024]
Abstract
In this study, we developed a machine learning interatomic potential based on artificial neural networks (ANN) to model carbon-hydrogen (C-H) systems. The ANN potential was trained on a dataset of C-H clusters obtained through density functional theory (DFT) calculations. Through comprehensive evaluations against DFT results, including predictions of geometries and formation energies across 0D-3D systems comprising C and C-H, as well as modeling various chemical processes, the ANN potential demonstrated exceptional accuracy and transferability. Its capability to accurately predict lattice dynamics, crucial for stability assessment in crystal structure prediction, was also verified through phonon dispersion analysis. Notably, its accuracy and computational efficiency in calculating force constants facilitated the exploration of complex energy landscapes, leading to the discovery of a novel C polymorph. These results underscore the robustness and versatility of the ANN potential, highlighting its efficacy in advancing computational materials science by conducting precise atomistic simulations on a wide range of C-H materials.
Affiliation(s)
- Somayeh Faraji
- Department of Chemistry, University of Florida, Gainesville, FL 32611, USA.
| | - Mingjie Liu
- Department of Chemistry, University of Florida, Gainesville, FL 32611, USA.
|
18
|
van Gunsteren WF, Oostenbrink C. Methods for Classical-Mechanical Molecular Simulation in Chemistry: Achievements, Limitations, Perspectives. J Chem Inf Model 2024; 64:6281-6304. [PMID: 39136351 DOI: 10.1021/acs.jcim.4c00823] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/27/2024]
Abstract
More than a half century ago it became feasible to simulate, using classical-mechanical equations of motion, the dynamics of molecular systems on a computer. Since then classical-physical molecular simulation has become an integral part of chemical research. It is widely applied in a variety of branches of chemistry and has significantly contributed to the development of chemical knowledge. It offers understanding and interpretation of experimental results, semiquantitative predictions for measurable and nonmeasurable properties of substances, and allows the calculation of properties of molecular systems under conditions that are experimentally inaccessible. Yet, molecular simulation is built on a number of assumptions, approximations, and simplifications which limit its range of applicability and its accuracy. These concern the potential-energy function used, adequate sampling of the vast statistical-mechanical configurational space of a molecular system and the methods used to compute particular properties of chemical systems from statistical-mechanical ensembles. During the past half century various methodological ideas to improve the efficiency and accuracy of classical-physical molecular simulation have been proposed, investigated, evaluated, implemented in general simulation software or were abandoned. The latter because of fundamental flaws or, while being physically sound, computational inefficiency. Some of these methodological ideas are briefly reviewed and the most effective methods are highlighted. Limitations of classical-physical simulation are discussed and perspectives are sketched.
Affiliation(s)
- Wilfred F van Gunsteren
- Institute for Molecular Physical Science, Swiss Federal Institute of Technology, ETH, CH-8093 Zurich, Switzerland
| | - Chris Oostenbrink
- Institute of Molecular Modelling and Simulation, BOKU University, 1190 Vienna, Austria
- Christian Doppler Laboratory for Molecular Informatics in the Biosciences, BOKU University, Muthgasse 18, 1190 Vienna, Austria
|
19
|
Glick ZL, Metcalf DP, Glick CS, Spronk SA, Koutsoukas A, Cheney DL, Sherrill CD. A physics-aware neural network for protein-ligand interactions with quantum chemical accuracy. Chem Sci 2024; 15:13313-13324. [PMID: 39183910 PMCID: PMC11339967 DOI: 10.1039/d4sc01029a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Accepted: 07/09/2024] [Indexed: 08/27/2024] Open
Abstract
Quantifying intermolecular interactions with quantum chemistry (QC) is useful for many chemical problems, including understanding the nature of protein-ligand interactions. Unfortunately, QC computations on protein-ligand systems are too computationally expensive for most use cases. The flourishing field of machine-learned (ML) potentials is a promising solution, but it is limited by an inability to easily capture long range, non-local interactions. In this work we develop an atomic-pairwise neural network (AP-Net) specialized for modeling intermolecular interactions. This model benefits from a number of physical constraints, including a two-component equivariant message passing neural network architecture that predicts interaction energies via an intermediate prediction of monomer electron densities. The AP-Net model is trained on a comprehensive dataset composed of paired ligand and protein fragments. This model accurately predicts QC-quality interaction energies of protein-ligand systems at a computational cost reduced by orders of magnitude. Applications of the AP-Net model to molecular crystal structure prediction are explored, as well as limitations in modeling highly polarizable systems.
Affiliation(s)
- Zachary L Glick
- School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology Atlanta Georgia 30332-0400 USA
| | - Derek P Metcalf
- School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology Atlanta Georgia 30332-0400 USA
| | - Caroline S Glick
- School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology Atlanta Georgia 30332-0400 USA
| | - Steven A Spronk
- Molecular Structure and Design, Bristol Myers Squibb Company P.O. Box 5400 Princeton New Jersey 08543 USA
| | - Alexios Koutsoukas
- Molecular Structure and Design, Bristol Myers Squibb Company P.O. Box 5400 Princeton New Jersey 08543 USA
| | - Daniel L Cheney
- Molecular Structure and Design, Bristol Myers Squibb Company P.O. Box 5400 Princeton New Jersey 08543 USA
| | - C David Sherrill
- School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology Atlanta Georgia 30332-0400 USA
|
20
|
Gubler M, Finkler JA, Schäfer MR, Behler J, Goedecker S. Accelerating Fourth-Generation Machine Learning Potentials Using Quasi-Linear Scaling Particle Mesh Charge Equilibration. J Chem Theory Comput 2024; 20. [PMID: 39151921 PMCID: PMC11360134 DOI: 10.1021/acs.jctc.4c00334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Revised: 07/16/2024] [Accepted: 07/17/2024] [Indexed: 08/19/2024]
Abstract
Machine learning potentials (MLPs) have revolutionized the field of atomistic simulations by describing atomic interactions with the accuracy of electronic structure methods at a small fraction of the cost. Most current MLPs construct the energy of a system as a sum of atomic energies, which depend on information about the atomic environments provided in the form of predefined or learnable feature vectors. If, in addition, nonlocal phenomena like long-range charge transfer are important, fourth-generation MLPs need to be used, which include a charge equilibration (Qeq) step to take the global structure of the system into account. This Qeq can significantly increase the computational cost and thus can become a computational bottleneck for large systems. In this Article, we present a highly efficient formulation of Qeq that does not require the explicit computation of the Coulomb matrix elements, resulting in a quasi-linear scaling method. Moreover, our approach also allows for the efficient calculation of energy derivatives, which explicitly consider the global structure-dependence of the atomic charges as obtained from Qeq. Due to its generality, the method is not restricted to MLPs and can also be applied within a variety of other force fields.
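For orientation, a conventional charge equilibration step can be sketched as a constrained minimization solved through one dense linear system. This is the explicit Coulomb-matrix route whose cost the article aims to avoid, not the quasi-linear particle mesh formulation it presents. The sketch assumes point charges in atomic units (fourth-generation MLPs typically use Gaussian-smeared charges), and the function names and input values are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def qeq_charges(positions, chi, hardness, total_charge=0.0):
    """Minimize E(q) = sum chi_i q_i + 1/2 sum J_i q_i^2 + sum_{i<j} q_i q_j / r_ij
    subject to sum q_i = total_charge, using a Lagrange multiplier."""
    n = len(positions)
    a = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(n):
        for j in range(n):
            if i == j:
                a[i][j] = hardness[i]
            else:
                # Explicit Coulomb matrix element: the O(N^2) part.
                a[i][j] = 1.0 / math.dist(positions[i], positions[j])
        a[i][n] = 1.0  # multiplier column
        a[n][i] = 1.0  # charge-conservation row
    b = [-x for x in chi] + [total_charge]
    return solve_linear(a, b)[:n]


# Hypothetical diatomic: the second atom is the more electronegative one.
charges = qeq_charges([(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)], [0.0, 0.1], [1.0, 1.0])
```

Building the matrix scales quadratically and the dense solve cubically with the number of atoms, which is the bottleneck for large systems that the abstract refers to.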
Affiliation(s)
- Moritz Gubler
- Department of Physics, University of Basel, Klingelbergstrasse 82, CH-4056 Basel, Switzerland
| | - Jonas A. Finkler
- Department of Physics, University of Basel, Klingelbergstrasse 82, CH-4056 Basel, Switzerland
| | - Moritz R. Schäfer
- Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
| | - Jörg Behler
- Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany
- Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
| | - Stefan Goedecker
- Department of Physics, University of Basel, Klingelbergstrasse 82, CH-4056 Basel, Switzerland
|
21
|
Williams CD, Kalayan J, Burton NA, Bryce RA. Stable and accurate atomistic simulations of flexible molecules using conformationally generalisable machine learned potentials. Chem Sci 2024; 15:12780-12795. [PMID: 39148799 PMCID: PMC11323334 DOI: 10.1039/d4sc01109k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 07/07/2024] [Indexed: 08/17/2024] Open
Abstract
Computational simulation methods based on machine learned potentials (MLPs) promise to revolutionise shape prediction of flexible molecules in solution, but their widespread adoption has been limited by the way in which training data is generated. Here, we present an approach which allows the key conformational degrees of freedom to be properly represented in reference molecular datasets. MLPs trained on these datasets using a global descriptor scheme are generalisable in conformational space, providing quantum chemical accuracy for all conformers. These MLPs are capable of propagating long, stable molecular dynamics trajectories, an attribute that has remained a challenge. We deploy the MLPs in obtaining converged conformational free energy surfaces for flexible molecules via well-tempered metadynamics simulations; this approach provides a hitherto inaccessible route to accurately computing the structural, dynamical and thermodynamical properties of a wide variety of flexible molecular systems. It is further demonstrated that MLPs must be trained on reference datasets with complete coverage of conformational space, including in barrier regions, to achieve stable molecular dynamics trajectories.
Affiliation(s)
- Christopher D Williams
- Division of Pharmacy and Optometry, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester Oxford Road Manchester M13 9PL UK
| | - Jas Kalayan
- Science and Technologies Facilities Council (STFC), Daresbury Laboratory Keckwick Lane, Daresbury Warrington WA4 4AD UK
| | - Neil A Burton
- Department of Chemistry, School of Natural Sciences, Faculty of Science and Engineering, The University of Manchester Oxford Road Manchester M13 9PL UK
| | - Richard A Bryce
- Division of Pharmacy and Optometry, School of Health Sciences, Faculty of Biology, Medicine and Health, The University of Manchester Oxford Road Manchester M13 9PL UK
|
22
|
Grambow CA, Weir H, Cunningham CN, Biancalani T, Chuang KV. CREMP: Conformer-rotamer ensembles of macrocyclic peptides for machine learning. Sci Data 2024; 11:859. [PMID: 39122750 PMCID: PMC11316032 DOI: 10.1038/s41597-024-03698-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Accepted: 07/29/2024] [Indexed: 08/12/2024] Open
Abstract
Computational and machine learning approaches to model the conformational landscape of macrocyclic peptides have the potential to enable rational design and optimization. However, accurate, fast, and scalable methods for modeling macrocycle geometries remain elusive. Recent deep learning approaches have significantly accelerated protein structure prediction and the generation of small-molecule conformational ensembles, yet similar progress has not been made for macrocyclic peptides due to their unique properties. Here, we introduce CREMP, a resource generated for the rapid development and evaluation of machine learning models for macrocyclic peptides. CREMP contains 36,198 unique macrocyclic peptides and their high-quality structural ensembles generated using the Conformer-Rotamer Ensemble Sampling Tool (CREST). Altogether, this new dataset contains nearly 31.3 million unique macrocycle geometries, each annotated with energies derived from semi-empirical extended tight-binding (xTB) DFT calculations. Additionally, we include 3,258 macrocycles with reported passive permeability data to couple conformational ensembles to experiment. We anticipate that this dataset will enable the development of machine learning models that can improve peptide design and optimization for novel therapeutics.
Affiliation(s)
- Colin A Grambow
- Prescient Design, Genentech, 1 DNA Way, South San Francisco, CA, 94080, USA
- Hayley Weir
- Prescient Design, Genentech, 1 DNA Way, South San Francisco, CA, 94080, USA
- Christian N Cunningham
- Department of Peptide Therapeutics, Genentech, 1 DNA Way, South San Francisco, CA, 94080, USA
- Tommaso Biancalani
- Biology Research | Development, Genentech, 1 DNA Way, South San Francisco, CA, 94080, USA
- Kangway V Chuang
- Prescient Design, Genentech, 1 DNA Way, South San Francisco, CA, 94080, USA

23
Cao Y, Balduf T, Beachy MD, Bennett MC, Bochevarov AD, Chien A, Dub PA, Dyall KG, Furness JW, Halls MD, Hughes TF, Jacobson LD, Kwak HS, Levine DS, Mainz DT, Moore KB, Svensson M, Videla PE, Watson MA, Friesner RA. Quantum chemical package Jaguar: A survey of recent developments and unique features. J Chem Phys 2024; 161:052502. [PMID: 39092934 DOI: 10.1063/5.0213317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/09/2024] [Accepted: 07/12/2024] [Indexed: 08/04/2024] Open
Abstract
This paper is dedicated to the quantum chemical package Jaguar, which is commercial software developed and distributed by Schrödinger, Inc. We discuss Jaguar's scientific features that are relevant to chemical research as well as describe those aspects of the program that are pertinent to the user interface, the organization of the computer code, and its maintenance and testing. Among the scientific topics that feature prominently in this paper are the quantum chemical methods grounded in the pseudospectral approach. A number of multistep workflows dependent on Jaguar are covered: prediction of protonation equilibria in aqueous solutions (particularly calculations of tautomeric stability and pKa), reactivity predictions based on automated transition state search, assembly of Boltzmann-averaged spectra such as vibrational and electronic circular dichroism, as well as nuclear magnetic resonance. Discussed also are quantum chemical calculations that are oriented toward materials science applications, in particular, prediction of properties of optoelectronic materials and organic semiconductors, and molecular catalyst design. The topic of treatment of conformations inevitably comes up in real world research projects and is considered as part of all the workflows mentioned above. In addition, we examine the role of machine learning methods in quantum chemical calculations performed by Jaguar, from auxiliary functions that return the approximate calculation runtime in a user interface, to prediction of actual molecular properties. The current work is second in a series of reviews of Jaguar, the first having been published more than ten years ago. Thus, this paper serves as a rare milestone on the path that is being traversed by Jaguar's development in more than thirty years of its existence.
Affiliation(s)
- Yixiang Cao
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Ty Balduf
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Michael D Beachy
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- M Chandler Bennett
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Art D Bochevarov
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Alan Chien
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Pavel A Dub
- Schrödinger, Inc., 9868 Scranton Road, Suite 3200, San Diego, California 92121, USA
- Kenneth G Dyall
- Schrödinger, Inc., 101 SW Main St., Suite 1300, Portland, Oregon 97204, USA
- James W Furness
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Mathew D Halls
- Schrödinger, Inc., 9868 Scranton Road, Suite 3200, San Diego, California 92121, USA
- Thomas F Hughes
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Leif D Jacobson
- Schrödinger, Inc., 101 SW Main St., Suite 1300, Portland, Oregon 97204, USA
- H Shaun Kwak
- Schrödinger, Inc., 101 SW Main St., Suite 1300, Portland, Oregon 97204, USA
- Daniel S Levine
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Daniel T Mainz
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Kevin B Moore
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Mats Svensson
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Pablo E Videla
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Mark A Watson
- Schrödinger, Inc., 1540 Broadway, Floor 24, New York, New York 10036, USA
- Richard A Friesner
- Department of Chemistry, Columbia University, 3000 Broadway, New York, New York 10027, USA

24
Singh AN, Limmer DT. Splitting probabilities as optimal controllers of rare reactive events. J Chem Phys 2024; 161:054113. [PMID: 39101534 DOI: 10.1063/5.0203840] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2024] [Accepted: 07/10/2024] [Indexed: 08/06/2024] Open
Abstract
The committor constitutes the primary quantity of interest within chemical kinetics as it is understood to encode the ideal reaction coordinate for a rare reactive event. We show the generative utility of the committor in that it can be used explicitly to produce a reactive trajectory ensemble that exhibits numerically exact statistics as that of the original transition path ensemble. This is done by relating a time-dependent analog of the committor that solves a generalized bridge problem to the splitting probability that solves a boundary value problem under a bistable assumption. By invoking stochastic optimal control and spectral theory, we derive a general form for the optimal controller of a bridge process that connects two metastable states expressed in terms of the splitting probability. This formalism offers an alternative perspective into the role of the committor and its gradients in that they encode force fields that guarantee reactivity, generating trajectories that are statistically identical to the way that a system would react autonomously.
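As a toy illustration of the boundary value problem mentioned in this abstract (not the authors' method), the sketch below computes the splitting probability for a one-dimensional nearest-neighbour random walk with two absorbing end states. The function name `splitting_probability`, the hop-probability list `p_right`, and the Gauss-Seidel iteration are all assumptions made for this example.

```python
def splitting_probability(p_right, n_sweeps=20000, tol=1e-13):
    """Toy committor on a chain of sites 0..N (hypothetical helper).

    p_right[i] is the assumed probability of hopping from site i to i+1;
    the walker hops to i-1 otherwise. Site 0 plays the role of state A
    and site N the role of state B, both absorbing.
    Returns q, where q[i] is the probability of reaching B before A.
    """
    n = len(p_right) - 1
    q = [0.0] * (n + 1)
    q[n] = 1.0  # boundary conditions: q = 0 in A, q = 1 in B
    for _ in range(n_sweeps):
        change = 0.0
        for i in range(1, n):
            # Interior condition: q is the average of q over the next hop.
            new = p_right[i] * q[i + 1] + (1.0 - p_right[i]) * q[i - 1]
            change = max(change, abs(new - q[i]))
            q[i] = new  # Gauss-Seidel: reuse updated values immediately
        if change < tol:
            break
    return q


if __name__ == "__main__":
    # Unbiased walk on 11 sites: q grows linearly from A to B.
    print(splitting_probability([0.5] * 11))
```

For an unbiased walk the solution is linear in the site index, which gives a quick sanity check on the boundary conditions.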
Affiliation(s)
- Aditya N Singh
- Department of Chemistry, University of California, Berkeley, California 94720, USA
- Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- David T Limmer
- Department of Chemistry, University of California, Berkeley, California 94720, USA
- Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Materials Science Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
- Kavli Energy Nanoscience Institute at Berkeley, Berkeley, California 94720, USA

25
Frank JT, Unke OT, Müller KR, Chmiela S. A Euclidean transformer for fast and stable machine learned force fields. Nat Commun 2024; 15:6539. [PMID: 39107296 PMCID: PMC11303804 DOI: 10.1038/s41467-024-50620-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/25/2023] [Accepted: 07/10/2024] [Indexed: 08/10/2024] Open
Abstract
Recent years have seen vast progress in the development of machine learned force fields (MLFFs) based on ab-initio reference calculations. Despite achieving low test errors, the reliability of MLFFs in molecular dynamics (MD) simulations is facing growing scrutiny due to concerns about instability over extended simulation timescales. Our findings suggest a potential connection between robustness to cumulative inaccuracies and the use of equivariant representations in MLFFs, but the computational cost associated with these representations can limit this advantage in practice. To address this, we propose a transformer architecture called SO3KRATES that combines sparse equivariant representations (Euclidean variables) with a self-attention mechanism that separates invariant and equivariant information, eliminating the need for expensive tensor products. SO3KRATES achieves a unique combination of accuracy, stability, and speed that enables insightful analysis of quantum properties of matter on extended time and system size scales. To showcase this capability, we generate stable MD trajectories for flexible peptides and supra-molecular structures with hundreds of atoms. Furthermore, we investigate the PES topology for medium-sized chainlike molecules (e.g., small peptides) by exploring thousands of minima. Remarkably, SO3KRATES demonstrates the ability to strike a balance between the conflicting demands of stability and the emergence of new minimum-energy conformations beyond the training data, which is crucial for realistic exploration tasks in the field of biochemistry.
Affiliation(s)
- J Thorben Frank
- Machine Learning Group, TU Berlin, Berlin, Germany
- BIFOLD, Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
- Klaus-Robert Müller
- Machine Learning Group, TU Berlin, Berlin, Germany
- BIFOLD, Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
- Google DeepMind, Berlin, Germany
- Department of Artificial Intelligence, Korea University, Seoul, Korea
- Max Planck Institut für Informatik, Saarbrücken, Germany
- Stefan Chmiela
- Machine Learning Group, TU Berlin, Berlin, Germany
- BIFOLD, Berlin Institute for the Foundations of Learning and Data, Berlin, Germany

26
Kingsbury CJ, Senge MO. Quantifying near-symmetric molecular distortion using symmetry-coordinate structural decomposition. Chem Sci 2024:d4sc01670j. [PMID: 39129773 PMCID: PMC11310747 DOI: 10.1039/d4sc01670j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Accepted: 08/01/2024] [Indexed: 08/13/2024] Open
Abstract
We imagine molecules to be perfect, but rigidified units can be designed to bend from their ideal shape, discarding their symmetric elements as they progress through vibrations and larger, more permanent distortions. The shape of molecules is either simulated or measured by crystallography and strongly affects chemical properties but, beyond an image or tabulation of atom-to-atom distances, little is often discussed of the accessed conformation. We have simplified the process of shape quantification across multiple molecular types with a new web-accessible program - SCSD - through which a molecular subunit possessing near-symmetry can be dissected into symmetry coordinates with ease. This parameterization allows a common set of numbers for comparing and understanding molecular shape, and is a simple method for database analysis; this program is available at https://www.kingsbury.id.au/scsd.
Affiliation(s)
- Christopher J Kingsbury
- School of Chemistry, Chair of Organic Chemistry, Trinity College Dublin, The University of Dublin, Trinity Biomedical Sciences Institute 152-160 Pearse Street Dublin D02R590 Ireland
- Mathias O Senge
- School of Chemistry, Chair of Organic Chemistry, Trinity College Dublin, The University of Dublin, Trinity Biomedical Sciences Institute 152-160 Pearse Street Dublin D02R590 Ireland
- Institute for Advanced Study (TUM-IAS), Technical University of Munich Lichtenberg-Str. 2a 85748 Garching Germany

27
Voss J. Machine learning for accuracy in density functional approximations. J Comput Chem 2024; 45:1829-1845. [PMID: 38668453 DOI: 10.1002/jcc.27366] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2023] [Revised: 02/16/2024] [Accepted: 03/25/2024] [Indexed: 07/21/2024]
Abstract
Machine learning techniques have found their way into computational chemistry as indispensable tools to accelerate atomistic simulations and materials design. In addition, machine learning approaches hold the potential to boost the predictive power of computationally efficient electronic structure methods, such as density functional theory, to chemical accuracy and to correct for fundamental errors in density functional approaches. Here, recent progress in applying machine learning to improve the accuracy of density functional and related approximations is reviewed. Promises and challenges in devising machine learning models transferable between different chemistries and materials classes are discussed with the help of examples applying promising models to systems far outside their training sets.
Affiliation(s)
- Johannes Voss
- SUNCAT Center for Interface Science and Catalysis, SLAC National Accelerator Laboratory, Menlo Park, California, USA

28
Pyzer-Knapp EO, Curioni A. Advancing biomolecular simulation through exascale HPC, AI and quantum computing. Curr Opin Struct Biol 2024; 87:102826. [PMID: 38733863 DOI: 10.1016/j.sbi.2024.102826] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 04/11/2024] [Accepted: 04/11/2024] [Indexed: 05/13/2024]
Abstract
Biomolecular simulation can act as both a digital microscope and a crystal ball; offering the potential for a deeper understanding of experimental observations whilst also presenting a forward-looking avenue for the in silico design and evaluation of hitherto unsynthesized compounds. Indeed, as the intricacy of our scientific inquiries has grown, so too has the computational prowess we seek to deploy in our pursuit of answers. As we enter the Exascale era, this mini-review surveys the computational landscape from both the point of view of the development of new and ever more powerful systems, and the simulations that are run on them. Moreover, as we stand on the cusp of a transformative phase in computational biology, this article offers a contemplative glance into the future, speculating on the profound implications of artificial intelligence and quantum computing for large-scale biomolecular simulations.

29
Plé T, Adjoua O, Lagardère L, Piquemal JP. FeNNol: An efficient and flexible library for building force-field-enhanced neural network potentials. J Chem Phys 2024; 161:042502. [PMID: 39051830 DOI: 10.1063/5.0217688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/06/2024] [Accepted: 06/28/2024] [Indexed: 07/27/2024] Open
Abstract
Neural network interatomic potentials (NNPs) have recently proven to be powerful tools to accurately model complex molecular systems while bypassing the high numerical cost of ab initio molecular dynamics simulations. In recent years, numerous advances in model architectures as well as the development of hybrid models combining machine-learning (ML) with more traditional, physically motivated, force-field interactions have considerably increased the design space of ML potentials. In this paper, we present FeNNol, a new library for building, training, and running force-field-enhanced neural network potentials. It provides a flexible and modular system for building hybrid models, allowing us to easily combine state-of-the-art embeddings with ML-parameterized physical interaction terms without the need for explicit programming. Furthermore, FeNNol leverages the automatic differentiation and just-in-time compilation features of the Jax Python library to enable fast evaluation of NNPs, shrinking the performance gap between ML potentials and standard force-fields. This is demonstrated with the popular ANI-2x model reaching simulation speeds nearly on par with the AMOEBA polarizable force-field on commodity GPUs (graphics processing units). We hope that FeNNol will facilitate the development and application of new hybrid NNP architectures for a wide range of molecular simulation problems.
Affiliation(s)
- Thomas Plé
- Sorbonne Université, LCT, UMR 7616 CNRS, 75005 Paris, France
- Olivier Adjoua
- Sorbonne Université, LCT, UMR 7616 CNRS, 75005 Paris, France
- Louis Lagardère
- Sorbonne Université, LCT, UMR 7616 CNRS, 75005 Paris, France

30
Martire S, Decherchi S, Cavalli A. OBIWAN: An Element-Wise Scalable Feed-Forward Neural Network Potential. J Chem Theory Comput 2024; 20:6287-6302. [PMID: 38978155 DOI: 10.1021/acs.jctc.4c00342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Estimating the potential energy of a molecular system at a quantum level of theory is a task of paramount importance in computational chemistry. The often employed density functional theory approach allows one to accomplish this task, yet most often at significant computational costs. This prompted the community to develop so-called machine learning potentials to achieve near-quantum accuracy at molecular mechanics computational cost. In this paper, we introduce OBIWAN, a feed-forward neural network that bears some relevant structural properties that also led to the definition of a new kind of general-purpose neural network layer. Its featurization process scales efficiently with newly added atomic species. This allows one to seamlessly add new atom types without requiring to change the topology of the network. Also, this allows one to train on new data sets leveraging a previously trained OBIWAN, hence converging very quickly. This avoids training from scratch and renders the approach more compliant with a green computing perspective.
Affiliation(s)
- Stefano Martire
- Department of Pharmacy and Biotechnology, University of Bologna, Via Belmeloro 6, Bologna 40126, Italy
- Computational and Chemical Biology, Fondazione Istituto Italiano di Tecnologia, Via Morego 30, Genoa 16163, Italy
- Sergio Decherchi
- Data Science and Computation Facility, Fondazione Istituto Italiano di Tecnologia, Via Morego 30, Genoa 16163, Italy
- Andrea Cavalli
- Computational and Chemical Biology, Fondazione Istituto Italiano di Tecnologia, Via Morego 30, Genoa 16163, Italy
- Centre Européen de Calcul Atomique et Moléculaire, Ecole Polytechnique Fédérale de Lausanne, Avenue de Forel 3, Lausanne 1015, Switzerland

31
Migliaro I, Cundari TR. Integrated Study on Methane Activation: Exploring Main Group Frustrated Lewis Pairs through Density Functional Theory, Machine Learning, and Machine-Learned Force Fields. J Chem Theory Comput 2024; 20:6388-6401. [PMID: 38941286 DOI: 10.1021/acs.jctc.4c00354] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/30/2024]
Abstract
Frustrated Lewis Pairs (FLP) are an important advance in metal-free catalysis due to their ability to activate a variety of small molecules. Many studies have focused on a very limited sample of Lewis acids and bases. Herein, we disclose an automated exploration algorithm using density functional methods, artificial neural networks (ANNs), and a molecule builder that incentivizes the exploration of favorable FLP space for the activation of methane via two mechanisms: deprotonation and hydride abstraction. The exploration algorithm creates FLPs with different Lewis acids (LA), Lewis bases (LB), and their substituents (LA/LB), which proved successful in quickly converging in the favorable chemical space, suggesting chemically sound structures, and generating thousands of potential candidates for methane activating FLPs. By modeling thousands of reactions, an FLP database of methane activation was created, allowing one to data mine properties, e.g., adduct bond length, highest occupied molecular orbital-lowest-unoccupied molecular orbital (HOMO-LUMO) gap, global electrophilicity index, favored Lewis acids/bases/substituents, and substituent steric volume. These properties not only successfully narrow the FLP chemical space but also provide meaningful insight into the chemical nature of competent methane activators. The machine learning discovery strategy disclosed here is general enough to be applicable to many chemical optimization tasks. This study also investigates the efficacy of a Machine-Learned Force Field (MLFF) in predicting the formation energies of Frustrated Lewis Pairs (FLPs). Our model, exhibiting a test error of ±10 kcal/mol, highlighted impressive computational efficiency by enabling the calculation of all possible FLP permutations within our chemical space. The MLFF demonstrated proficiency in predicting energies, providing a significant acceleration compared to quantum mechanics methods. 
However, challenges emerged in accurately capturing forces, necessitating recourse to classical force fields for reliable structure relaxation. The present study sheds light on the MLFF's potential as a tool for rapid energy predictions, emphasizing the need for further refinement to enhance its accuracy, particularly in force predictions, to expand its utility in chemical simulations.
Affiliation(s)
- Ignacio Migliaro
- Department of Chemistry, Center of Advanced Scientific Computing and Modeling, University of North Texas, Denton, Texas 76203, United States
- Thomas R Cundari
- Department of Chemistry, Center of Advanced Scientific Computing and Modeling, University of North Texas, Denton, Texas 76203, United States

32
Slootman E, Poltavsky I, Shinde R, Cocomello J, Moroni S, Tkatchenko A, Filippi C. Accurate Quantum Monte Carlo Forces for Machine-Learned Force Fields: Ethanol as a Benchmark. J Chem Theory Comput 2024; 20:6020-6027. [PMID: 39003522 PMCID: PMC11270822 DOI: 10.1021/acs.jctc.4c00498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/15/2024] [Revised: 05/31/2024] [Accepted: 06/03/2024] [Indexed: 07/15/2024]
Abstract
Quantum Monte Carlo (QMC) is a powerful method to calculate accurate energies and forces for molecular systems. In this work, we demonstrate how we can obtain accurate QMC forces for the fluxional ethanol molecule at room temperature by using either multideterminant Jastrow-Slater wave functions in variational Monte Carlo or just a single determinant in diffusion Monte Carlo. The excellent performance of our protocols is assessed against high-level coupled cluster calculations on a diverse set of representative configurations of the system. Finally, we train machine-learning force fields on the QMC forces and compare them to models trained on coupled cluster reference data, showing that a force field based on the diffusion Monte Carlo forces with a single determinant can faithfully reproduce coupled cluster power spectra in molecular dynamics simulations.
Affiliation(s)
- E. Slootman
- MESA+ Institute for Nanotechnology, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
- I. Poltavsky
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- R. Shinde
- MESA+ Institute for Nanotechnology, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
- J. Cocomello
- MESA+ Institute for Nanotechnology, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands
- S. Moroni
- CNR-IOM DEMOCRITOS, Istituto Officina dei Materiali, and SISSA Scuola Internazionale Superiore di Studi Avanzati, Via Bonomea 265, I-34136 Trieste, Italy
- A. Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- C. Filippi
- MESA+ Institute for Nanotechnology, University of Twente, P.O. Box 217, 7500 AE Enschede, The Netherlands

33
Yang ZX, Xie XT, Kang PL, Wang ZX, Shang C, Liu ZP. Many-Body Function Corrected Neural Network with Atomic Attention (MBNN-att) for Molecular Property Prediction. J Chem Theory Comput 2024. [PMID: 39034686 DOI: 10.1021/acs.jctc.4c00660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/23/2024]
Abstract
Recent years have seen a surge of machine learning (ML) in chemistry for predicting chemical properties, but a low-cost, general-purpose, and high-performance model, desirable to be accessible on central processing unit (CPU) devices, remains not available. For this purpose, here we introduce an atomic attention mechanism into many-body function corrected neural network (MBNN), namely, MBNN-att ML model, to predict both the extensive and intensive properties of molecules and materials. The MBNN-att uses explicit function descriptors as the inputs for the atom-based feed-forward neural network (NN). The output of the NN is designed to be a vector to implement the multihead self-attention mechanism. This vector is split into two parts: the atomic attention weight part and the many-body-function part. The final property is obtained by summing the products of each atomic attention weight and the corresponding many-body function. We show that MBNN-att performs well on all QM9 properties, i.e., errors on all properties, below chemical accuracy, and, in particular, achieves the top performance for the energy-related extensive properties. By systematically comparing with other explicit-function-type descriptor ML models and the graph representation ML models, we demonstrate that the many-body-function framework and atomic attention mechanism are key ingredients for the high performance and the good transferability of MBNN-att in molecular property prediction.
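The readout described in this abstract (split each atom's output vector into attention weights and many-body functions, then sum the products) can be sketched as follows. This is only an illustration under stated assumptions: the function name `attention_readout`, the layout of the output vector (first half logits, second half many-body values), and the softmax normalization over atoms are not taken from the paper.

```python
import math


def attention_readout(atom_outputs, n_heads):
    """Hypothetical attention-weighted readout over atoms.

    atom_outputs: one list per atom of length 2 * n_heads. The first
    n_heads entries are treated as attention logits and the last n_heads
    as many-body function values (an assumed layout).
    Returns the sum over heads and atoms of weight * many-body value.
    """
    total = 0.0
    for h in range(n_heads):
        logits = [row[h] for row in atom_outputs]
        values = [row[n_heads + h] for row in atom_outputs]
        # Assumption: weights are a softmax over atoms for each head.
        shift = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(x - shift) for x in logits]
        norm = sum(exps)
        total += sum(e / norm * v for e, v in zip(exps, values))
    return total


if __name__ == "__main__":
    # Three atoms, one head, equal logits: the result is the mean value.
    print(attention_readout([[0.0, 1.0], [0.0, 2.0], [0.0, 3.0]], 1))
```

With normalized weights the readout behaves like an average, which suits an intensive property; dropping the normalization would make it scale with the number of atoms, as an extensive property does.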
Affiliation(s)
- Zheng-Xin Yang
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
- Xin-Tian Xie
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
- Pei-Lin Kang
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
- Zhen-Xiong Wang
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
- Cheng Shang
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
- Shanghai Qi Zhi Institution, Shanghai 200030, China
- Zhi-Pan Liu
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
- Key Laboratory of Synthetic and Self-Assembly Chemistry for Organic Functional Molecules, Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences, Shanghai 200032, China
- Shanghai Qi Zhi Institution, Shanghai 200030, China

34
Zubatyuk R, Biczysko M, Ranasinghe K, Moriarty NW, Gokcan H, Kruse H, Poon BK, Adams PD, Waller MP, Roitberg AE, Isayev O, Afonine PV. AQuaRef: Machine learning accelerated quantum refinement of protein structures. bioRxiv 2024:2024.07.21.604493. [PMID: 39071315 PMCID: PMC11275739 DOI: 10.1101/2024.07.21.604493] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2024]
Abstract
Cryo-EM and X-ray crystallography provide crucial experimental data for obtaining atomic-detail models of biomacromolecules. Refining these models relies on library-based stereochemical restraints, which, in addition to being limited to known chemical entities, do not include meaningful noncovalent interactions relying solely on nonbonded repulsions. Quantum mechanical (QM) calculations could alleviate these issues but are too expensive for large molecules. We present a novel AI-enabled Quantum Refinement (AQuaRef) based on AIMNet2 neural network potential mimicking QM at substantially lower computational costs. By refining 41 cryo-EM and 30 X-ray structures, we show that this approach yields atomic models with superior geometric quality compared to standard techniques, while maintaining an equal or better fit to experimental data.

35
Zhang H, Juraskova V, Duarte F. Modelling chemical processes in explicit solvents with machine learning potentials. Nat Commun 2024; 15:6114. [PMID: 39030199 PMCID: PMC11271496 DOI: 10.1038/s41467-024-50418-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 07/08/2024] [Indexed: 07/21/2024] Open
Abstract
Solvent effects influence all stages of the chemical processes, modulating the stability of intermediates and transition states, as well as altering reaction rates and product ratios. However, accurately modelling these effects remains challenging. Here, we present a general strategy for generating reactive machine learning potentials to model chemical processes in solution. Our approach combines active learning with descriptor-based selectors and automation, enabling the construction of data-efficient training sets that span the relevant chemical and conformational space. We apply this strategy to investigate a Diels-Alder reaction in water and methanol. The generated machine learning potentials enable us to obtain reaction rates that are in agreement with experimental data and analyse the influence of these solvents on the reaction mechanism. Our strategy offers an efficient approach to the routine modelling of chemical reactions in solution, opening up avenues for studying complex chemical processes in an efficient manner.
Affiliation(s)
- Hanwen Zhang
- Chemistry Research Laboratory, Oxford, United Kingdom

36
Chen G, Jaffrelot Inizan T, Plé T, Lagardère L, Piquemal JP, Maday Y. Advancing Force Fields Parameterization: A Directed Graph Attention Networks Approach. J Chem Theory Comput 2024; 20:5558-5569. [PMID: 38875012 DOI: 10.1021/acs.jctc.3c01421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2024]
Abstract
Force fields (FFs) are an established tool for simulating large and complex molecular systems. However, parametrizing FFs is a challenging and time-consuming task that relies on empirical heuristics, experimental data, and computational data. Recent efforts aim to automate the assignment of FF parameters using pre-existing databases and on-the-fly ab initio data. In this study, we propose a graph-based force field (GB-FFs) model to directly derive parameters for the Generalized Amber Force Field (GAFF) from chemical environments and research into the influence of functional forms. Our end-to-end parametrization approach predicts parameters by aggregating the basic information in directed molecular graphs, eliminating the need for expert-defined procedures and enhances the accuracy and transferability of GAFF across a broader range of molecular complexes. Simulation results are compared to the original GAFF parametrization. In practice, our results demonstrate an improved transferability of the model, showcasing its improved accuracy in modeling intermolecular and torsional interactions, as well as improved solvation free energies. The optimization approach developed in this work is fully applicable to other nonpolarizable FFs as well as to polarizable ones.
Affiliation(s)
- Gong Chen
- Sorbonne Université, CNRS, Université Paris Cité, Laboratoire Jacques-Louis Lions (LJLL), UMR 7598 CNRS, 75005 Paris, France
- Théo Jaffrelot Inizan
- Sorbonne Université, Laboratoire de Chimie Théorique (LCT), UMR 7616 CNRS, 75005 Paris, France
- Thomas Plé
- Sorbonne Université, Laboratoire de Chimie Théorique (LCT), UMR 7616 CNRS, 75005 Paris, France
- Louis Lagardère
- Sorbonne Université, Laboratoire de Chimie Théorique (LCT), UMR 7616 CNRS, 75005 Paris, France
- Jean-Philip Piquemal
- Sorbonne Université, Laboratoire de Chimie Théorique (LCT), UMR 7616 CNRS, 75005 Paris, France
- Yvon Maday
- Sorbonne Université, CNRS, Université Paris Cité, Laboratoire Jacques-Louis Lions (LJLL), UMR 7598 CNRS, 75005 Paris, France
|
37
|
Cheng Z, Bi H, Liu S, Chen J, Misquitta AJ, Yu K. Developing a Differentiable Long-Range Force Field for Proteins with E(3) Neural Network-Predicted Asymptotic Parameters. J Chem Theory Comput 2024; 20:5598-5608. [PMID: 38888427 DOI: 10.1021/acs.jctc.4c00337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2024]
Abstract
Accurately describing long-range interactions is a significant challenge in molecular dynamics (MD) simulations of proteins. A high-quality long-range potential is also an important component of range-separated machine learning force fields. This study introduces a comprehensive asymptotic parameter database encompassing atomic multipole moments, polarizabilities, and dispersion coefficients. Leveraging active learning, our database comprehensively represents protein fragments with up to 8 heavy atoms, capturing their conformational diversity with merely 78,000 data points. Additionally, the E(3) neural network (E3NN) is employed to predict the asymptotic parameters directly from the local geometry. The E3NN models demonstrate exceptional accuracy and transferability across all asymptotic parameters, achieving an R² of 0.999 for both protein fragments and 20 amino acid dipeptide test sets. The long-range electrostatic and dispersion energies can be obtained using the E3NN-predicted parameters, with errors of 0.07 and 0.02 kcal/mol, respectively, when compared to symmetry-adapted perturbation theory (SAPT). Therefore, our force fields demonstrate the capability to accurately describe long-range interactions in proteins, paving the way for next-generation protein force fields.
Affiliation(s)
- Zheng Cheng
- School of Mathematical Sciences, Peking University, Beijing 100871, China
- AI for Science Institute, Beijing 100084, P. R. China
- Hangrui Bi
- School of Mathematical Sciences, Peking University, Beijing 100871, China
- DP Technology, Beijing 100080, P. R. China
- Siyuan Liu
- DP Technology, Beijing 100080, P. R. China
- Junmin Chen
- Tsinghua-Berkeley Shenzhen Institute, Shenzhen 518055, Guangdong, P. R. China
- Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China
- Alston J Misquitta
- School of Physics and Astronomy, Queen Mary, University of London, London E1 4NS, U.K.
- Kuang Yu
- Tsinghua-Berkeley Shenzhen Institute, Shenzhen 518055, Guangdong, P. R. China
- Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China
|
38
|
Medrano Sandonas L, Van Rompaey D, Fallani A, Hilfiker M, Hahn D, Perez-Benito L, Verhoeven J, Tresadern G, Kurt Wegner J, Ceulemans H, Tkatchenko A. Dataset for quantum-mechanical exploration of conformers and solvent effects in large drug-like molecules. Sci Data 2024; 11:742. [PMID: 38972891 PMCID: PMC11228031 DOI: 10.1038/s41597-024-03521-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Accepted: 06/13/2024] [Indexed: 07/09/2024] Open
Abstract
We here introduce the Aquamarine (AQM) dataset, an extensive quantum-mechanical (QM) dataset that contains the structural and electronic information of 59,783 low- and high-energy conformers of 1,653 molecules with a total number of atoms ranging from 2 to 92 (mean: 50.9), and containing up to 54 (mean: 28.2) non-hydrogen atoms. To gain insights into the solvent effects as well as collective dispersion interactions for drug-like molecules, we have performed QM calculations supplemented with a treatment of many-body dispersion (MBD) interactions of structures and properties in the gas phase and implicit water. Thus, AQM contains over 40 global and local physicochemical properties (including ground-state and response properties) per conformer computed at the tightly converged PBE0+MBD level of theory for gas-phase molecules, whereas PBE0+MBD with the modified Poisson-Boltzmann (MPB) model of water was used for solvated molecules. By addressing both molecule-solvent and dispersion interactions, the AQM dataset can serve as a challenging benchmark for state-of-the-art machine learning methods for property modeling and de novo generation of large (solvated) molecules with pharmaceutical and biological relevance.
Affiliation(s)
- Leonardo Medrano Sandonas
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
- Institute for Materials Science and Max Bergmann Center of Biomaterials, TU Dresden, 01062, Dresden, Germany
- Dries Van Rompaey
- Drug Discovery Data Sciences (D3S), Janssen Pharmaceutica NV, Turnhoutseweg 30, 2340, Beerse, Belgium
- Alessio Fallani
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
- Drug Discovery Data Sciences (D3S), Janssen Pharmaceutica NV, Turnhoutseweg 30, 2340, Beerse, Belgium
- Mathias Hilfiker
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
- David Hahn
- Computational Chemistry, Janssen Pharmaceutica NV, Turnhoutseweg 30, 2340, Beerse, Belgium
- Laura Perez-Benito
- Computational Chemistry, Janssen Pharmaceutica NV, Turnhoutseweg 30, 2340, Beerse, Belgium
- Jonas Verhoeven
- Drug Discovery Data Sciences (D3S), Janssen Pharmaceutica NV, Turnhoutseweg 30, 2340, Beerse, Belgium
- Gary Tresadern
- Computational Chemistry, Janssen Pharmaceutica NV, Turnhoutseweg 30, 2340, Beerse, Belgium
- Joerg Kurt Wegner
- Drug Discovery Data Sciences (D3S), Janssen Pharmaceutica NV, Turnhoutseweg 30, 2340, Beerse, Belgium
- Drug Discovery Data Sciences (D3S), Johnson & Johnson Innovative Medicine, 301 Binney Street, MA 02142, Cambridge, USA
- Hugo Ceulemans
- Drug Discovery Data Sciences (D3S), Janssen Pharmaceutica NV, Turnhoutseweg 30, 2340, Beerse, Belgium
- Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
|
39
|
Yang L, Guo Q, Zhang L. AI-assisted chemistry research: a comprehensive analysis of evolutionary paths and hotspots through knowledge graphs. Chem Commun (Camb) 2024; 60:6977-6987. [PMID: 38910536 DOI: 10.1039/d4cc01892c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/25/2024]
Abstract
Artificial intelligence (AI) offers transformative potential for chemical research through its ability to optimize reactions and processes, enhance energy efficiency, and reduce waste. AI-assisted chemical research (AI + chem) has become a global hotspot. To better understand the current research status of "AI + chem", this study conducted a scientific bibliometric investigation using CiteSpace. The Web of Science Core Collection was used to retrieve original articles related to "AI + chem" published from 2000 to 2024. The obtained data allowed for the visualization of the knowledge background, current research status, and latest knowledge structure of "AI + chem". "AI + chem" has entered a stage of explosive growth, and the number of papers is expected to maintain long-term high-speed growth. This article systematically analyzes the latest progress in "AI + chem" and objectively predicts future trends, including molecular design, reaction prediction, materials design, drug design, and quantum chemistry. The outcomes of this study will provide readers with a comprehensive understanding of the overall landscape of "AI + chem".
Affiliation(s)
- Lin Yang
- School of Intellectual Property, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China
- Qingle Guo
- School of Intellectual Property, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China
- Lijing Zhang
- School of Chemistry, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China
|
40
|
Lawson CL, Kryshtafovych A, Pintilie GD, Burley SK, Černý J, Chen VB, Emsley P, Gobbi A, Joachimiak A, Noreng S, Prisant MG, Read RJ, Richardson JS, Rohou AL, Schneider B, Sellers BD, Shao C, Sourial E, Williams CI, Williams CJ, Yang Y, Abbaraju V, Afonine PV, Baker ML, Bond PS, Blundell TL, Burnley T, Campbell A, Cao R, Cheng J, Chojnowski G, Cowtan KD, DiMaio F, Esmaeeli R, Giri N, Grubmüller H, Hoh SW, Hou J, Hryc CF, Hunte C, Igaev M, Joseph AP, Kao WC, Kihara D, Kumar D, Lang L, Lin S, Maddhuri Venkata Subramaniya SR, Mittal S, Mondal A, Moriarty NW, Muenks A, Murshudov GN, Nicholls RA, Olek M, Palmer CM, Perez A, Pohjolainen E, Pothula KR, Rowley CN, Sarkar D, Schäfer LU, Schlicksup CJ, Schröder GF, Shekhar M, Si D, Singharoy A, Sobolev OV, Terashi G, Vaiana AC, Vedithi SC, Verburgt J, Wang X, Warshamanage R, Winn MD, Weyand S, Yamashita K, Zhao M, Schmid MF, Berman HM, Chiu W. Outcomes of the EMDataResource cryo-EM Ligand Modeling Challenge. Nat Methods 2024; 21:1340-1348. [PMID: 38918604 DOI: 10.1038/s41592-024-02321-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2024] [Accepted: 05/24/2024] [Indexed: 06/27/2024]
Abstract
The EMDataResource Ligand Model Challenge aimed to assess the reliability and reproducibility of modeling ligands bound to protein and protein-nucleic acid complexes in cryogenic electron microscopy (cryo-EM) maps determined at near-atomic (1.9-2.5 Å) resolution. Three published maps were selected as targets: Escherichia coli beta-galactosidase with inhibitor, SARS-CoV-2 virus RNA-dependent RNA polymerase with covalently bound nucleotide analog, and SARS-CoV-2 virus ion channel ORF3a with bound lipid. Sixty-one models were submitted from 17 independent research groups, each with supporting workflow details. The quality of submitted ligand models and surrounding atoms was analyzed by visual inspection and quantification of local map quality, model-to-map fit, geometry, energetics and contact scores. A composite rather than a single score was needed to assess macromolecule+ligand model quality. These observations lead us to recommend best practices for assessing cryo-EM structures of liganded macromolecules reported at near-atomic resolution.
Affiliation(s)
- Catherine L Lawson
- RCSB Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Grigore D Pintilie
- Departments of Bioengineering and of Microbiology and Immunology, Stanford University, Stanford, CA, USA
- Stephen K Burley
- RCSB Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Rutgers Cancer Institute of New Jersey, Rutgers, The State University of New Jersey, New Brunswick, NJ, USA
- RCSB Protein Data Bank and San Diego Supercomputer Center, University of California San Diego, La Jolla, CA, USA
- Jiří Černý
- Institute of Biotechnology, Czech Academy of Sciences, Vestec, Czech Republic
- Vincent B Chen
- Department of Biochemistry, Duke University, Durham, NC, USA
- Paul Emsley
- MRC Laboratory of Molecular Biology, Cambridge, UK
- Alberto Gobbi
- Discovery Chemistry, Genentech Inc., San Francisco, CA, USA
- , Berlin, Germany
- Andrzej Joachimiak
- Structural Biology Center, X-ray Science Division, Argonne National Laboratory, Argonne, IL, USA
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL, USA
- Sigrid Noreng
- Structural Biology, Genentech Inc., South San Francisco, CA, USA
- Protein Science, Septerna, South San Francisco, CA, USA
- Randy J Read
- Department of Haematology, Cambridge Institute for Medical Research, University of Cambridge, Cambridge, UK
- Alexis L Rohou
- Structural Biology, Genentech Inc., South San Francisco, CA, USA
- Bohdan Schneider
- Institute of Biotechnology, Czech Academy of Sciences, Vestec, Czech Republic
- Benjamin D Sellers
- Discovery Chemistry, Genentech Inc., San Francisco, CA, USA
- Computational Chemistry, Vilya, South San Francisco, CA, USA
- Chenghua Shao
- RCSB Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Ying Yang
- Structural Biology, Genentech Inc., South San Francisco, CA, USA
- Venkat Abbaraju
- RCSB Protein Data Bank and Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Pavel V Afonine
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Matthew L Baker
- Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Paul S Bond
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
- Tom L Blundell
- Department of Biochemistry, University of Cambridge, Cambridge, UK
- Tom Burnley
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
- Arthur Campbell
- Center for Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, USA
- Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
- K D Cowtan
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
- Frank DiMaio
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA, USA
- Reza Esmaeeli
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
- Nabin Giri
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, USA
- Helmut Grubmüller
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Soon Wen Hoh
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
- Jie Hou
- Department of Computer Science, Saint Louis University, St. Louis, MO, USA
- Corey F Hryc
- Department of Biochemistry and Molecular Biology, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Carola Hunte
- Institute of Biochemistry and Molecular Biology, ZBMZ, Faculty of Medicine and CIBSS-Centre for Integrative Biological Signalling Studies, University of Freiburg, Freiburg, Germany
- Maxim Igaev
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Agnel P Joseph
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
- Wei-Chun Kao
- Institute of Biochemistry and Molecular Biology, ZBMZ, Faculty of Medicine and CIBSS-Centre for Integrative Biological Signalling Studies, University of Freiburg, Freiburg, Germany
- Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Dilip Kumar
- Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, Houston, TX, USA
- Trivedi School of Biosciences, Ashoka University, Sonipat, India
- Lijun Lang
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
- The Chinese University of Hong Kong, Hong Kong, China
- Sean Lin
- Division of Computing & Software Systems, University of Washington, Bothell, WA, USA
- Sumit Mittal
- Biodesign Institute, Arizona State University, Tempe, AZ, USA
- School of Advanced Sciences and Languages, VIT Bhopal University, Bhopal, India
- Arup Mondal
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
- National Renewable Energy Laboratory (NREL), Golden, CO, USA
- Nigel W Moriarty
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Andrew Muenks
- Department of Biochemistry and Institute for Protein Design, University of Washington, Seattle, WA, USA
- Robert A Nicholls
- MRC Laboratory of Molecular Biology, Cambridge, UK
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
- Mateusz Olek
- York Structural Biology Laboratory, Department of Chemistry, University of York, York, UK
- Electron Bio-Imaging Centre, Diamond Light Source, Harwell Science and Innovation Campus, Didcot, UK
- Colin M Palmer
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
- Alberto Perez
- Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, FL, USA
- Emmi Pohjolainen
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Karunakar R Pothula
- Institute of Biological Information Processing (IBI-7, Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
- Daipayan Sarkar
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Biodesign Institute, Arizona State University, Tempe, AZ, USA
- MSU-DOE Plant Research Laboratory, East Lansing, MI, USA
- School of Molecular Sciences, Arizona State University, Tempe, AZ, USA
- Luisa U Schäfer
- Institute of Biological Information Processing (IBI-7, Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
- Christopher J Schlicksup
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Gunnar F Schröder
- Institute of Biological Information Processing (IBI-7, Structural Biochemistry) and Jülich Centre for Structural Biology (JuStruct), Forschungszentrum Jülich, Jülich, Germany
- Physics Department, Heinrich Heine University Düsseldorf, Düsseldorf, Germany
- Mrinal Shekhar
- Center for Development of Therapeutics, Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Biodesign Institute, Arizona State University, Tempe, AZ, USA
- Dong Si
- Division of Computing & Software Systems, University of Washington, Bothell, WA, USA
- Oleg V Sobolev
- Molecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA, USA
- Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Andrea C Vaiana
- Theoretical and Computational Biophysics Department, Max Planck Institute for Multidisciplinary Sciences, Göttingen, Germany
- Nature's Toolbox (NTx), Rio Rancho, NM, USA
- Jacob Verburgt
- Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
- Xiao Wang
- Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Martyn D Winn
- Scientific Computing Department, UKRI Science and Technology Facilities Council, Research Complex at Harwell, Didcot, UK
- Simone Weyand
- Department of Biochemistry, University of Cambridge, Cambridge, UK
- Minglei Zhao
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL, USA
- Michael F Schmid
- Division of Cryo-EM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
- Helen M Berman
- Department of Chemistry and Chemical Biology, Rutgers, The State University of New Jersey, Piscataway, NJ, USA
- Department of Quantitative and Computational Biology, University of Southern California, Los Angeles, CA, USA
- Wah Chiu
- Departments of Bioengineering and of Microbiology and Immunology, Stanford University, Stanford, CA, USA
- Division of Cryo-EM and Bioimaging, SSRL, SLAC National Accelerator Laboratory, Menlo Park, CA, USA
|
41
|
Aldossary A, Campos-Gonzalez-Angulo JA, Pablo-García S, Leong SX, Rajaonson EM, Thiede L, Tom G, Wang A, Avagliano D, Aspuru-Guzik A. In Silico Chemical Experiments in the Age of AI: From Quantum Chemistry to Machine Learning and Back. Adv Mater 2024; 36:e2402369. [PMID: 38794859 DOI: 10.1002/adma.202402369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 04/28/2024] [Indexed: 05/26/2024]
Abstract
Computational chemistry is an indispensable tool for understanding molecules and predicting chemical properties. However, traditional computational methods face significant challenges due to the difficulty of solving the Schrödinger equation and the increasing computational cost with the size of the molecular system. In response, there has been a surge of interest in applying artificial intelligence (AI) and machine learning (ML) techniques to in silico experiments. Integrating AI and ML into computational chemistry increases the scalability and speed of the exploration of chemical space. However, challenges remain, particularly regarding the reproducibility and transferability of ML models. This review highlights the evolution of ML in learning from, complementing, or replacing traditional computational chemistry for energy and property predictions. Starting from models trained entirely on numerical data, it traces a journey toward the ideal model incorporating or learning the physical laws of quantum mechanics. This paper also reviews existing computational methods and ML models and their intertwining, outlines a roadmap for future research, and identifies areas for improvement and innovation. Ultimately, the goal is to develop AI architectures capable of predicting accurate and transferable solutions to the Schrödinger equation, thereby revolutionizing in silico experiments within chemistry and materials science.
Affiliation(s)
- Abdulrahman Aldossary
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Sergio Pablo-García
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Shi Xuan Leong
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Ella Miray Rajaonson
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Luca Thiede
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Gary Tom
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Andrew Wang
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Davide Avagliano
- Chimie ParisTech, PSL University, CNRS, Institute of Chemistry for Life and Health Sciences (iCLeHS UMR 8060), Paris, F-75005, France
- Alán Aspuru-Guzik
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Department of Materials Science & Engineering, University of Toronto, 184 College St., Toronto, ON, M5S 3E4, Canada
- Department of Chemical Engineering & Applied Chemistry, University of Toronto, 200 College St., Toronto, ON, M5S 3E5, Canada
- Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), 66118 University Ave., Toronto, M5G 1M1, Canada
- Acceleration Consortium, 80 St George St, Toronto, M5S 3H6, Canada
|
42
|
Li H, Guo W, Guo Y. Impact of Heterogeneous Charge Polarization and Distribution on Friction at Water-Graphene Interfaces: a Density-Functional-Theory based Machine Learning Study. J Phys Chem Lett 2024; 15:6585-6591. [PMID: 38885449 DOI: 10.1021/acs.jpclett.4c01274] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/20/2024]
Abstract
Accurately characterizing friction behaviors at water-solid interfaces remains a challenge because of the dynamic nature of water molecules and temporal variations in solid surface charges. By using a density-functional-theory (DFT) based machine learning (ML) technique and long-time ML-parametrized molecular dynamics simulations, we have systematically investigated water-induced charge polarization and redistribution on graphene, as well as its impact on friction at water-graphene interfaces. Heterogeneous charge polarization and distribution are observed for water-covered graphene accompanied by the formation of electric double layers (EDLs). The introduction of defects into graphene significantly enhances the heterogeneity in charge polarization and distribution. Compared to pristine graphene, defective graphene exhibits reduced friction at water-graphene interfaces: stronger charge heterogeneity results in lower surface charge density, and slip length is inversely related to surface charge density for EDLs. Our results highlight the pivotal roles of defects and charge heterogeneity in reducing friction at water-graphene interfaces.
Affiliation(s)
- Hao Li
- State Key Laboratory of Mechanics and Control for Aerospace Structures, MOE Key Laboratory for Intelligent Nano Materials and Devices, College of Aerospace Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 210016, China
- Wanlin Guo
- State Key Laboratory of Mechanics and Control for Aerospace Structures, MOE Key Laboratory for Intelligent Nano Materials and Devices, College of Aerospace Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 210016, China
- Yufeng Guo
- State Key Laboratory of Mechanics and Control for Aerospace Structures, MOE Key Laboratory for Intelligent Nano Materials and Devices, College of Aerospace Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing, 210016, China
|
43
|
Butin O, Pereyaslavets L, Kamath G, Illarionov A, Sakipov S, Kurnikov IV, Voronina E, Ivahnenko I, Leontyev I, Nawrocki G, Darkhovskiy M, Olevanov M, Cherniavskyi YK, Lock C, Greenslade S, Kornberg RD, Levitt M, Fain B. The Determination of Free Energy of Hydration of Water Ions from First Principles. J Chem Theory Comput 2024; 20:5215-5224. [PMID: 38842599 DOI: 10.1021/acs.jctc.3c01411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/07/2024]
Abstract
We model the autoionization of water by determining the free energy of hydration of the major intermediate species of water ions. We represent the smallest ions (the hydroxide ion OH⁻, the hydronium ion H₃O⁺, and the Zundel ion H₅O₂⁺) by bonded models and the more extended ionic structures by strong nonbonded interactions (e.g., the Eigen H₉O₄⁺ = H₃O⁺ + 3(H₂O) and the Stoyanov H₁₃O₆⁺ = H₅O₂⁺ + 4(H₂O)). Our models are faithful to the precise QM energies and their components to within 1% or less. Using the calculated free energies and atomization energies, we compute the pKa of pure water from first principles as a consistency check and arrive at a value within 1.3 log units of the experimental one. From these calculations, we conclude that the hydronium ion, and its hydrated state, the Eigen cation, are the dominant species in the water autoionization process.
Affiliation(s)
- Oleg Butin
- InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Leonid Pereyaslavets
- InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Ganesh Kamath
- InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Alexey Illarionov
- InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Serzhan Sakipov
- InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Igor V Kurnikov
- InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Ekaterina Voronina
- InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Skobeltsyn Institute of Nuclear Physics, Lomonosov Moscow State University, Moscow 119991, Russia
| | - Ilya Ivahnenko
- InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Igor Leontyev
- InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Grzegorz Nawrocki
- InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Mikhail Darkhovskiy
- InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Michael Olevanov
- InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Department of Physics, Lomonosov Moscow State University, Moscow 119991, Russia
| | - Yevhen K Cherniavskyi
- InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Christopher Lock
- InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California 94304, United States
| | - Sean Greenslade
- InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Roger D Kornberg
- Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94305, United States
| | - Michael Levitt
- Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94305, United States
| | - Boris Fain
- InterX, Inc. (a subsidiary of NeoTX Therapeutics, Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| |
|
44
|
Sivaraman G, Benmore CJ. Deciphering diffuse scattering with machine learning and the equivariant foundation model: the case of molten FeO. J Phys Condens Matter 2024; 36:381501. [PMID: 38866028 DOI: 10.1088/1361-648x/ad577b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Accepted: 06/12/2024] [Indexed: 06/14/2024]
Abstract
Bridging the gap between diffuse x-ray or neutron scattering measurements and predicted structures derived from atom-atom pair potentials in disordered materials has been a longstanding challenge in condensed matter physics. This perspective gives a brief overview of the traditional approaches employed over the past several decades, namely the use of approximate interatomic pair potentials that relate three-dimensional structural models to the measured structure factor and its associated pair distribution function. The use of machine learned interatomic potentials has grown in the past few years and has been particularly successful in the cases of ionic and oxide systems. Recent advances in large-scale sampling, along with a direct integration of scattering measurements into the model development, have provided improved agreement between experiments and large-scale models calculated with quantum mechanical accuracy. However, details of local polyhedral bonding and connectivity in metastable disordered systems still require improvement. Here we leverage MACE-MP-0, a newly introduced equivariant foundation model, and validate the results against high-quality experimental scattering data for the case of molten iron(II) oxide (FeO). These preliminary results suggest that the emerging foundation model has the potential to surpass the traditional limitations of classical interatomic potentials.
Affiliation(s)
- Ganesh Sivaraman
- Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States of America
- C-STEEL Center for Steel Electrification by Electrosynthesis, Argonne National Laboratory, Argonne, IL 60438, United States of America
| | - Chris J Benmore
- C-STEEL Center for Steel Electrification by Electrosynthesis, Argonne National Laboratory, Argonne, IL 60438, United States of America
- X-Ray Science Division, Advanced Photon Source, Argonne National Laboratory, Argonne, IL 60438, United States of America
| |
|
45
|
Weymuth T, Unsleber JP, Türtscher PL, Steiner M, Sobez JG, Müller CH, Mörchen M, Klasovita V, Grimmel SA, Eckhoff M, Csizi KS, Bosia F, Bensberg M, Reiher M. SCINE-Software for chemical interaction networks. J Chem Phys 2024; 160:222501. [PMID: 38857173 DOI: 10.1063/5.0206974] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Accepted: 05/09/2024] [Indexed: 06/12/2024] Open
Abstract
The software for chemical interaction networks (SCINE) project aims at pushing the frontier of quantum chemical calculations on molecular structures to a new level. While calculations on individual structures as well as on simple relations between them have become routine in chemistry, new developments have pushed the frontier in the field to high-throughput calculations. Chemical relations may be created by a search for specific molecular properties in a molecular design attempt, or they can be defined by a set of elementary reaction steps that form a chemical reaction network. The software modules of SCINE have been designed to facilitate such studies. The features of the modules are (i) general applicability of the applied methodologies ranging from electronic structure (no restriction to specific elements of the periodic table) to microkinetic modeling (with few restrictions on molecularity), (ii) full modularity so that SCINE modules can also be applied as stand-alone programs or be exchanged for external software packages that fulfill a similar purpose (to increase options for computational campaigns and to provide alternatives in case of tasks that are hard or impossible to accomplish with certain programs), (iii) high stability and autonomous operations so that control and steering by an operator are as easy as possible, and (iv) easy embedding into complex heterogeneous environments for molecular structures taken individually or in the context of a reaction network. A graphical user interface unites all modules and ensures interoperability. All components of the software have been made available as open source and free of charge.
Affiliation(s)
- Thomas Weymuth
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Jan P Unsleber
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Paul L Türtscher
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Miguel Steiner
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Jan-Grimo Sobez
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Charlotte H Müller
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Maximilian Mörchen
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Veronika Klasovita
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Stephanie A Grimmel
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Marco Eckhoff
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Katja-Sophia Csizi
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Francesco Bosia
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Moritz Bensberg
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| | - Markus Reiher
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
| |
|
46
|
Fu W, Mo Y, Xiao Y, Liu C, Zhou F, Wang Y, Zhou J, Zhang YJ. Enhancing Molecular Energy Predictions with Physically Constrained Modifications to the Neural Network Potential. J Chem Theory Comput 2024; 20:4533-4544. [PMID: 38828925 DOI: 10.1021/acs.jctc.3c01181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2024]
Abstract
Exclusively prioritizing the precision of energy prediction frequently proves inadequate in satisfying multifaceted requirements. A heightened focus is warranted on assessing the rationality of potential energy curves predicted by machine learning-based force fields (MLFFs), alongside evaluating the pragmatic utility of these MLFFs. This study introduces SWANI, an optimized neural network potential stemming from the ANI framework. Through the incorporation of supplementary physical constraints, SWANI aligns more cohesively with chemical expectations, yielding rational potential energy profiles. It also exhibits superior predictive precision compared with that of the ANI model. Additionally, a comprehensive comparison is conducted between SWANI and a prominent graph neural network-based model. The findings indicate that SWANI outperforms the latter, particularly for molecules exceeding the dimensions of the training set. This outcome underscores SWANI's exceptional capacity for generalization and its proficiency in handling larger molecular systems.
Affiliation(s)
- Weiqiang Fu
- Beijing StoneWise Technology Co., Ltd., Haidian Street 15, Haidian District, Beijing 100080, China
| | - Yujie Mo
- Beijing StoneWise Technology Co., Ltd., Haidian Street 15, Haidian District, Beijing 100080, China
| | - Yi Xiao
- Beijing StoneWise Technology Co., Ltd., Haidian Street 15, Haidian District, Beijing 100080, China
| | - Chang Liu
- Beijing StoneWise Technology Co., Ltd., Haidian Street 15, Haidian District, Beijing 100080, China
| | - Feng Zhou
- Beijing StoneWise Technology Co., Ltd., Haidian Street 15, Haidian District, Beijing 100080, China
| | - Yang Wang
- Beijing StoneWise Technology Co., Ltd., Haidian Street 15, Haidian District, Beijing 100080, China
| | - Jielong Zhou
- Beijing StoneWise Technology Co., Ltd., Haidian Street 15, Haidian District, Beijing 100080, China
| | - Yingsheng J Zhang
- Beijing StoneWise Technology Co., Ltd., Haidian Street 15, Haidian District, Beijing 100080, China
| |
|
47
|
Lei YK, Yagi K, Sugita Y. Learning QM/MM potential using equivariant multiscale model. J Chem Phys 2024; 160:214109. [PMID: 38828815 DOI: 10.1063/5.0205123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2024] [Accepted: 05/09/2024] [Indexed: 06/05/2024] Open
Abstract
Machine learning (ML) has emerged as an efficient and precise surrogate model for high-level electronic structure theory. Its application has been limited to closed chemical systems without considering external potentials from the surrounding environment. To address this limitation and incorporate the influence of external potentials, polarization effects, and long-range interactions between a chemical system and its environment, the first two terms of the Taylor expansion of an electrostatic operator have been used as extra input to the existing ML model to represent the electrostatic environments. However, high-order electrostatic interaction is often essential to account for external potentials from the environment. The existing models based only on invariant features cannot capture significant distribution patterns of the external potentials. Here, we propose a novel ML model that includes high-order terms of the Taylor expansion of an electrostatic operator and uses as its base model an equivariant model, which can generate high-order tensors that are covariant with rotations. Therefore, we can use the multipole-expansion equation to derive a useful representation by accounting for polarization and intermolecular interaction. Moreover, to deal with long-range interactions, we follow the same strategy adopted to derive long-range interactions between a target system and its environment media. Our model achieves higher prediction accuracy and transferability among various environment media with these modifications.
Affiliation(s)
- Yao-Kun Lei
- Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
- RIKEN Interdisciplinary Theoretical and Mathematical Sciences Program (iTHEMS), Wako, Saitama 351-0198, Japan
| | - Kiyoshi Yagi
- Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
| | - Yuji Sugita
- Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, Wako, Saitama 351-0198, Japan
- Computational Biophysics Research Team, RIKEN Center for Computational Science, Kobe, Hyogo 650-0047, Japan
- RIKEN Interdisciplinary Theoretical and Mathematical Sciences Program (iTHEMS), Wako, Saitama 351-0198, Japan
- Laboratory for Biomolecular Function Simulation, RIKEN Center for Biosystems Dynamics Research, Kobe, Hyogo 650-0047, Japan
| |
|
48
|
Chen T, Liu A, Ma D. Editorial: Novel design, synthesis, and environmental applications of covalent organic frameworks. Front Chem 2024; 12:1434454. [PMID: 38903203 PMCID: PMC11187299 DOI: 10.3389/fchem.2024.1434454] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2024] [Accepted: 05/28/2024] [Indexed: 06/22/2024] Open
Affiliation(s)
- Tanyue Chen
- Department of Chemistry, School of Light Industry Science and Engineering, Beijing Technology and Business University, Beijing, China
| | - Anan Liu
- Basic Experimental Centre for Natural Science, University of Science and Technology Beijing, Beijing, China
| | - Dongge Ma
- Department of Chemistry, School of Light Industry Science and Engineering, Beijing Technology and Business University, Beijing, China
| |
|
49
|
Tiwary P. Modeling prebiotic chemistries with quantum accuracy at classical costs. Proc Natl Acad Sci U S A 2024; 121:e2408742121. [PMID: 38809708 PMCID: PMC11161769 DOI: 10.1073/pnas.2408742121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2024] Open
Affiliation(s)
- Pratyush Tiwary
- Institute for Physical Science and Technology, University of Maryland, College Park, MD 20742
- Department of Chemistry and Biochemistry, University of Maryland, College Park, MD 20742
- University of Maryland Institute for Health Computing, Bethesda, MD 20852
| |
|
50
|
Yang Y, Zhang S, Ranasinghe KD, Isayev O, Roitberg AE. Machine Learning of Reactive Potentials. Annu Rev Phys Chem 2024; 75:371-395. [PMID: 38941524 DOI: 10.1146/annurev-physchem-062123-024417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/30/2024]
Abstract
In the past two decades, machine learning potentials (MLPs) have driven significant developments in chemical, biological, and material sciences. The construction and training of MLPs enable fast and accurate simulations and analysis of thermodynamic and kinetic properties. This review focuses on the application of MLPs to reaction systems with consideration of bond breaking and formation. We review the development of MLP models, primarily with neural network and kernel-based algorithms, and recent applications of reactive MLPs (RMLPs) to systems at different scales. We show how RMLPs are constructed, how they speed up the calculation of reactive dynamics, and how they facilitate the study of reaction trajectories, reaction rates, free energy calculations, and many other calculations. Different data sampling strategies applied in building RMLPs are also discussed with a focus on how to collect structures for rare events and how to further improve their performance with active learning.
Affiliation(s)
- Yinuo Yang
- Department of Chemistry, University of Florida, Gainesville, Florida
| | - Shuhao Zhang
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
| | | | - Olexandr Isayev
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
| | - Adrian E Roitberg
- Department of Chemistry, University of Florida, Gainesville, Florida
| |
|