1
Margraf JT. Neural graph distance embedding for molecular geometry generation. J Comput Chem 2024; 45:1784-1790. [PMID: 38655845 DOI: 10.1002/jcc.27349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 03/05/2024] [Accepted: 03/08/2024] [Indexed: 04/26/2024]
Abstract
This article introduces neural graph distance embedding (nGDE), a method for generating 3D molecular geometries. Leveraging a graph neural network trained on the OE62 dataset of molecular geometries, nGDE predicts interatomic distances based on molecular graphs. These distances are then used in multidimensional scaling to produce 3D geometries, subsequently refined with standard bioorganic force fields. The machine learning-based graph distance introduced herein is found to be an improvement over the conventional shortest path distances used in graph drawing. Comparative analysis with a state-of-the-art distance geometry method demonstrates nGDE's competitive performance, particularly showcasing robustness in handling polycyclic molecules, a challenge for existing methods.
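As a point of reference for the conventional baseline named in the abstract, the sketch below computes all-pairs shortest path distances (bond counts) on a molecular graph by breadth-first search. It is not the nGDE model itself; the function name `shortest_path_distances` and the six-membered ring example are illustrative assumptions.

```python
from collections import deque

def shortest_path_distances(n_atoms, bonds):
    """All-pairs topological distances (number of bonds) via BFS from each atom."""
    neighbors = [[] for _ in range(n_atoms)]
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)

    # None marks pairs that have not been reached (disconnected fragments stay None).
    dist = [[None] * n_atoms for _ in range(n_atoms)]
    for source in range(n_atoms):
        dist[source][source] = 0
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if dist[source][v] is None:
                    dist[source][v] = dist[source][u] + 1
                    queue.append(v)
    return dist

# Hypothetical example: heavy-atom graph of a six-membered carbon ring.
ring_bonds = [(i, (i + 1) % 6) for i in range(6)]
ring_dist = shortest_path_distances(6, ring_bonds)
```

In this ring, atoms 0 and 3 receive a graph distance of three bonds, although in a real ring geometry they sit closer together than three bond lengths. That kind of mismatch between graph distance and spatial distance is presumably what a learned graph distance is meant to reduce.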
Affiliation(s)
- Johannes T Margraf
- Bavarian Center for Battery Technology (BayBatt), University of Bayreuth, Bayreuth, Germany
2
Aldossary A, Campos-Gonzalez-Angulo JA, Pablo-García S, Leong SX, Rajaonson EM, Thiede L, Tom G, Wang A, Avagliano D, Aspuru-Guzik A. In Silico Chemical Experiments in the Age of AI: From Quantum Chemistry to Machine Learning and Back. Advanced Materials (Deerfield Beach, Fla.) 2024; 36:e2402369. [PMID: 38794859 DOI: 10.1002/adma.202402369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Revised: 04/28/2024] [Indexed: 05/26/2024]
Abstract
Computational chemistry is an indispensable tool for understanding molecules and predicting chemical properties. However, traditional computational methods face significant challenges due to the difficulty of solving the Schrödinger equation and the increasing computational cost with the size of the molecular system. In response, there has been a surge of interest in applying artificial intelligence (AI) and machine learning (ML) techniques to in silico experiments. Integrating AI and ML into computational chemistry increases the scalability and speed of the exploration of chemical space. However, challenges remain, particularly regarding the reproducibility and transferability of ML models. This review highlights the evolution of ML in learning from, complementing, or replacing traditional computational chemistry for energy and property predictions. Starting from models trained entirely on numerical data, it charts a path toward the ideal model that incorporates or learns the physical laws of quantum mechanics. This paper also reviews existing computational methods and ML models and their intertwining, outlines a roadmap for future research, and identifies areas for improvement and innovation. Ultimately, the goal is to develop AI architectures capable of predicting accurate and transferable solutions to the Schrödinger equation, thereby revolutionizing in silico experiments within chemistry and materials science.
Affiliation(s)
- Abdulrahman Aldossary
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Sergio Pablo-García
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Shi Xuan Leong
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Ella Miray Rajaonson
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Luca Thiede
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Gary Tom
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Andrew Wang
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Davide Avagliano
- Chimie ParisTech, PSL University, CNRS, Institute of Chemistry for Life and Health Sciences (iCLeHS UMR 8060), Paris, F-75005, France
- Alán Aspuru-Guzik
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Department of Materials Science & Engineering, University of Toronto, 184 College St., Toronto, ON, M5S 3E4, Canada
- Department of Chemical Engineering & Applied Chemistry, University of Toronto, 200 College St., Toronto, ON, M5S 3E5, Canada
- Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), 66118 University Ave., Toronto, M5G 1M1, Canada
- Acceleration Consortium, 80 St George St, Toronto, M5S 3H6, Canada
3
Wang F, Ma Z, Cheng J. Accelerating Computation of Acidity Constants and Redox Potentials for Aqueous Organic Redox Flow Batteries by Machine Learning Potential-Based Molecular Dynamics. J Am Chem Soc 2024; 146:14566-14575. [PMID: 38659097 DOI: 10.1021/jacs.4c01221] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
Due to the increased concern about energy and environmental issues, significant attention has been paid to the development of large-scale energy storage devices to facilitate the utilization of clean energy sources. The redox flow battery (RFB) is one of the most promising systems. Recently, the high cost of transition-metal complex-based RFBs has promoted the development of aqueous RFBs with redox-active organic molecules. To expand the working voltage, computational chemistry has been applied to search for organic molecules with lower or higher redox potentials. However, redox potential computation based on implicit solvation models is challenging because parametrization is difficult when the complex solvation of supporting electrolytes is considered. In addition, although ab initio molecular dynamics (AIMD) describes the supporting electrolytes with the same level of electronic structure theory as the redox couple, its application is impeded by the high computational cost. Recently, machine learning molecular dynamics (MLMD) has been shown to accelerate AIMD by several orders of magnitude without sacrificing accuracy. It has been established that redox potentials can be computed by MLMD with two separate machine learning potentials (MLPs) for the reactant and product states, which is redundant and inefficient. In this work, an automated workflow is developed to construct a universal MLP for both states, which can compute the redox potentials or acidity constants of redox-active organic molecules more efficiently. Furthermore, the predicted redox potentials can be evaluated at the hybrid functional level with much lower costs, which would facilitate the design of aqueous organic RFBs.
Affiliation(s)
- Feng Wang
- State Key Laboratory of Physical Chemistry of Solid Surfaces, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Zebing Ma
- State Key Laboratory of Physical Chemistry of Solid Surfaces, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Jun Cheng
- State Key Laboratory of Physical Chemistry of Solid Surfaces, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, China
- Laboratory of AI for Electrochemistry (AI4EC), IKKEM, Xiamen 361005, China
- Institute of Artificial Intelligence, Xiamen University, Xiamen 361005, China
4
Wan K, He J, Shi X. Construction of High Accuracy Machine Learning Interatomic Potential for Surface/Interface of Nanomaterials - A Review. Advanced Materials (Deerfield Beach, Fla.) 2024; 36:e2305758. [PMID: 37640376 DOI: 10.1002/adma.202305758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 08/24/2023] [Indexed: 08/31/2023]
Abstract
The inherent discontinuity and unique dimensional attributes of nanomaterial surfaces and interfaces bestow them with various exceptional properties. These properties, however, also introduce difficulties for both experimental and computational studies. The advent of machine learning interatomic potential (MLIP) addresses some of the limitations associated with empirical force fields, presenting a valuable avenue for accurate simulations of these surfaces/interfaces of nanomaterials. Central to this approach is the idea of capturing the relationship between system configuration and potential energy, leveraging the proficiency of machine learning (ML) to precisely approximate high-dimensional functions. This review offers an in-depth examination of MLIP principles and their execution and elaborates on their applications in the realm of nanomaterial surface and interface systems. The prevailing challenges faced by this potent methodology are also discussed.
Affiliation(s)
- Kaiwei Wan
- Laboratory of Theoretical and Computational Nanoscience, National Center for Nanoscience and Technology, Chinese Academy of Sciences, Beijing, 100190, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Jianxin He
- Laboratory of Theoretical and Computational Nanoscience, National Center for Nanoscience and Technology, Chinese Academy of Sciences, Beijing, 100190, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Xinghua Shi
- Laboratory of Theoretical and Computational Nanoscience, National Center for Nanoscience and Technology, Chinese Academy of Sciences, Beijing, 100190, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
5
Žugec I, Geilhufe RM, Lončarić I. Global machine learning potentials for molecular crystals. J Chem Phys 2024; 160:154106. [PMID: 38624120 DOI: 10.1063/5.0196232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2024] [Accepted: 03/29/2024] [Indexed: 04/17/2024] Open
Abstract
Molecular crystals are difficult to model with accurate first-principles methods due to large unit cells. On the other hand, accurate modeling is required as polymorphs often differ by only 1 kJ/mol. Machine learning interatomic potentials promise to provide accuracy of the baseline first-principles methods with a cost lower by orders of magnitude. Using the existing databases of the density functional theory calculations for molecular crystals and molecules, we train global machine learning interatomic potentials, usable for any molecular crystal. We test the performance of the potentials on experimental benchmarks and show that they perform better than classical force fields and, in some cases, are comparable to the density functional theory calculations.
Affiliation(s)
- Ivan Žugec
- Centro de Física de Materiales CFM/MPC (CSIC-UPV/EHU), Donostia-San Sebastián, Spain
- R Matthias Geilhufe
- Department of Physics, Chalmers University of Technology, Gothenburg, Sweden
- Ivor Lončarić
- Ruđer Bošković Institute, Bijenička 54, Zagreb, Croatia
6
Butler PV, Hafizi R, Day GM. Machine-Learned Potentials by Active Learning from Organic Crystal Structure Prediction Landscapes. J Phys Chem A 2024; 128:945-957. [PMID: 38277275 PMCID: PMC10860135 DOI: 10.1021/acs.jpca.3c07129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Revised: 01/04/2024] [Accepted: 01/11/2024] [Indexed: 01/28/2024]
Abstract
A primary challenge in organic molecular crystal structure prediction (CSP) is accurately ranking the energies of potential structures. While high-level solid-state density functional theory (DFT) methods allow for mostly reliable discrimination of the low-energy structures, their high computational cost is problematic because of the need to evaluate tens to hundreds of thousands of trial crystal structures to fully explore typical crystal energy landscapes. Consequently, lower-cost but less accurate empirical force fields are often used, sometimes as the first stage of a hierarchical scheme involving multiple stages of increasingly accurate energy calculations. Machine-learned interatomic potentials (MLIPs), trained to reproduce the results of ab initio methods with computational costs close to those of force fields, can improve the efficiency of the CSP by reducing or eliminating the need for costly DFT calculations. Here, we investigate active learning methods for training MLIPs with CSP datasets. The combination of active learning with the well-developed sampling methods from CSP yields potentials in a highly automated workflow that are relevant over a wide range of the crystal packing space. To demonstrate these potentials, we illustrate efficiently reranking large, diverse crystal structure landscapes to near-DFT accuracy from force field-based CSP, improving the reliability of the final energy ranking. Furthermore, we demonstrate how these potentials can be extended to more accurately model structures far from lattice energy minima through additional on-the-fly training within Monte Carlo simulations.
Affiliation(s)
- Roohollah Hafizi
- School of Chemistry, University of Southampton, Southampton SO17 1BJ, U.K.
- Graeme M. Day
- School of Chemistry, University of Southampton, Southampton SO17 1BJ, U.K.
7
Coste A, Slejko E, Zavadlav J, Praprotnik M. Developing an Implicit Solvation Machine Learning Model for Molecular Simulations of Ionic Media. J Chem Theory Comput 2024; 20:411-420. [PMID: 38118122 PMCID: PMC10782447 DOI: 10.1021/acs.jctc.3c00984] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 12/04/2023] [Accepted: 12/04/2023] [Indexed: 12/22/2023]
Abstract
Molecular dynamics (MD) simulations of biophysical systems require accurate modeling of their native environment, i.e., aqueous ionic solution, as it critically impacts the structure and function of biomolecules. On the other hand, the models should be computationally efficient to enable simulations of large spatiotemporal scales. Here, we present the deep implicit solvation model for sodium chloride solutions that satisfies both requirements. Owing to the use of the neural network potential, the model can capture the many-body potential of mean force, while the implicit water treatment renders the model inexpensive. We demonstrate our approach first for pure ionic solutions with concentrations ranging from physiological to 2 M. We then extend the model to capture the effective ion interactions in the vicinity and far away from a DNA molecule. In both cases, the structural properties are in good agreement with all-atom MD, showcasing a general methodology for the efficient and accurate modeling of ionic media.
Affiliation(s)
- Amaury Coste
- Laboratory for Molecular Modeling, National Institute of Chemistry, Ljubljana SI-1001, Slovenia
- Ema Slejko
- Laboratory for Molecular Modeling, National Institute of Chemistry, Ljubljana SI-1001, Slovenia
- Department of Physics, Faculty of Mathematics and Physics, University of Ljubljana, Ljubljana SI-1000, Slovenia
- Julija Zavadlav
- Professorship of Multiscale Modeling of Fluid Materials, TUM School of Engineering and Design, Technical University of Munich, Garching Near Munich DE-85748, Germany
- Matej Praprotnik
- Laboratory for Molecular Modeling, National Institute of Chemistry, Ljubljana SI-1001, Slovenia
- Department of Physics, Faculty of Mathematics and Physics, University of Ljubljana, Ljubljana SI-1000, Slovenia
8
Kadan A, Ryczko K, Wildman A, Wang R, Roitberg A, Yamazaki T. Accelerated Organic Crystal Structure Prediction with Genetic Algorithms and Machine Learning. J Chem Theory Comput 2023; 19:9388-9402. [PMID: 38059458 DOI: 10.1021/acs.jctc.3c00853] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/08/2023]
Abstract
We present a high-throughput, end-to-end pipeline for organic crystal structure prediction (CSP), the problem of identifying the stable crystal structures that will form from a given molecule based only on its molecular composition. Our tool uses neural network potentials to allow for efficient screening and structural relaxation of generated crystal candidates. Our pipeline consists of two distinct stages: random search, whereby crystal candidates are randomly generated and screened, and optimization, where a genetic algorithm (GA) optimizes this screened population. We assess the performance of each stage of our pipeline on 21 molecules taken from the Cambridge Crystallographic Data Centre's CSP blind tests. We show that random search alone yields matches for ≈50% of targets. We then validate the potential of our full pipeline, making use of the GA to optimize the root-mean-square deviation between crystal candidates and the experimentally derived structure. With this approach, we are able to find matches for ≈80% of candidates with 10-100 times smaller initial population sizes than when using random search. Lastly, we run our full pipeline with an ANI model that is trained on a small data set of molecules extracted from crystal structures in the Cambridge Structural Database, generating ≈60% of targets. By leveraging machine learning models trained to predict energies at the density functional theory level, our pipeline has the potential to approach the accuracy of ab initio methods and the efficiency of empirical force fields.
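The two-stage structure described in the abstract (random search with screening, then a genetic algorithm on the screened population) can be sketched on a toy problem. This is only an illustration under stated assumptions, not the authors' pipeline: candidates are plain parameter vectors, `rmsd_to_target` stands in for the structural RMSD to the experimental structure, and all names, population sizes, and mutation settings are hypothetical.

```python
import math
import random

TARGET = [1.0, -2.0, 0.5]  # hypothetical "experimental" parameters

def rmsd_to_target(candidate):
    """Toy stand-in for the RMSD between a candidate and the experimental structure."""
    return math.sqrt(sum((c - t) ** 2 for c, t in zip(candidate, TARGET)) / len(TARGET))

def random_search(score, n_candidates, rng):
    """Stage 1: generate random candidates and keep the better-scoring half."""
    candidates = [[rng.uniform(-5.0, 5.0) for _ in TARGET] for _ in range(n_candidates)]
    candidates.sort(key=score)
    return candidates[: max(2, n_candidates // 2)]

def genetic_algorithm(score, population, n_generations, rng):
    """Stage 2: evolve the screened population (crossover, mutation, elitism)."""
    population = [list(p) for p in population]  # work on a copy
    size = len(population)
    for _ in range(n_generations):
        population.sort(key=score)
        parents = population[: max(2, size // 2)]
        children = [list(parents[0])]  # elitism: the current best always survives
        while len(children) < size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            child[rng.randrange(len(child))] += rng.gauss(0.0, 0.3)  # point mutation
            children.append(child)
        population = children
    return min(population, key=score)

rng = random.Random(0)
screened = random_search(rmsd_to_target, 40, rng)
best = genetic_algorithm(rmsd_to_target, screened, 50, rng)
```

Because the best individual is copied unchanged into each new generation, the final score can never be worse than the best score coming out of the random search stage.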
Affiliation(s)
- Amit Kadan
- Good Chemistry Company, 1285 W Pender Street, Vancouver, British Columbia V6E 4B1, Canada
- Kevin Ryczko
- Good Chemistry Company, 1285 W Pender Street, Vancouver, British Columbia V6E 4B1, Canada
- Andrew Wildman
- Good Chemistry Company, 1285 W Pender Street, Vancouver, British Columbia V6E 4B1, Canada
- Rodrigo Wang
- Good Chemistry Company, 1285 W Pender Street, Vancouver, British Columbia V6E 4B1, Canada
- Adrian Roitberg
- Department of Chemistry, University of Florida, P.O. Box 117200, Gainesville, Florida 32611-7200, United States
- Takeshi Yamazaki
- Good Chemistry Company, 1285 W Pender Street, Vancouver, British Columbia V6E 4B1, Canada
9
Beran GJO. Frontiers of molecular crystal structure prediction for pharmaceuticals and functional organic materials. Chem Sci 2023; 14:13290-13312. [PMID: 38033897 PMCID: PMC10685338 DOI: 10.1039/d3sc03903j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/27/2023] [Accepted: 11/02/2023] [Indexed: 12/02/2023] Open
Abstract
The reliability of organic molecular crystal structure prediction has improved tremendously in recent years. Crystal structure predictions for small, mostly rigid molecules are quickly becoming routine. Structure predictions for larger, highly flexible molecules are more challenging, but their crystal structures can also now be predicted with increasing rates of success. These advances are ushering in a new era where crystal structure prediction drives the experimental discovery of new solid forms. After briefly discussing the computational methods that enable successful crystal structure prediction, this perspective presents case studies from the literature that demonstrate how state-of-the-art crystal structure prediction can transform how scientists approach problems involving the organic solid state. Applications to pharmaceuticals, porous organic materials, photomechanical crystals, organic semi-conductors, and nuclear magnetic resonance crystallography are included. Finally, efforts to improve our understanding of which predicted crystal structures can actually be produced experimentally and other outstanding challenges are discussed.
Affiliation(s)
- Gregory J O Beran
- Department of Chemistry, University of California Riverside, Riverside, CA 92521, USA
10
Brown M, Skelton JM, Popelier PLA. Application of the FFLUX Force Field to Molecular Crystals: A Study of Formamide. J Chem Theory Comput 2023; 19:7946-7959. [PMID: 37847867 PMCID: PMC10653110 DOI: 10.1021/acs.jctc.3c00578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Indexed: 10/19/2023]
Abstract
In this work, we present the first application of the quantum chemical topology force field FFLUX to the solid state. FFLUX utilizes Gaussian process regression machine learning models trained on data from the interacting quantum atom partitioning scheme to predict atomic energies and flexible multipole moments that change with geometry. Here, the ambient (α) and high-pressure (β) polymorphs of formamide are used as test systems and optimized using FFLUX. Optimizing the structures with increasing multipolar ranks indicates that the lattice parameters of the α phase differ by less than 5% from the experimental structure when multipole moments up to the quadrupole are used. These differences are found to be in line with dispersion-corrected density functional theory (DFT). Lattice dynamics calculations are also found to be possible using FFLUX, yielding harmonic phonon spectra comparable to dispersion-corrected DFT while enabling larger supercells to be considered than is typically possible with first-principles calculations. These promising results indicate that FFLUX can be used to accurately determine properties of molecular solids that are difficult to access using DFT, including the structural dynamics, free energies, and properties at finite temperature.
Affiliation(s)
- Matthew L. Brown
- Department of Chemistry, The University of Manchester, Oxford Road, Manchester M13 9PL, Britain
- Jonathan M. Skelton
- Department of Chemistry, The University of Manchester, Oxford Road, Manchester M13 9PL, Britain
- Paul L. A. Popelier
- Department of Chemistry, The University of Manchester, Oxford Road, Manchester M13 9PL, Britain
11
Taniguchi T, Hosokawa M, Asahi T. Graph Comparison of Molecular Crystals in Band Gap Prediction Using Neural Networks. ACS Omega 2023; 8:39481-39489. [PMID: 37901497 PMCID: PMC10601046 DOI: 10.1021/acsomega.3c05224] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2023] [Accepted: 10/03/2023] [Indexed: 10/31/2023]
Abstract
In materials informatics, the representation of the material structure is essential to obtaining better prediction results, and graph representation has attracted much attention in recent years. Molecular crystals can be represented as graphs in either molecular or crystal form, but which representation is more effective has not been examined. In this study, we compared the prediction accuracy of molecular and crystal graphs for band gap prediction. The results showed that the prediction accuracies using crystal graphs were better than those obtained using molecular graphs. While this result is not surprising, error analysis showed quantitatively that the error of the crystal graph was 0.4 times that of the molecular graph, with moderate correlation. The novelty of this study lies in the comparison of molecular crystal representations and in the quantitative evaluation of the contribution of crystal structures to the band gap.
Affiliation(s)
- Takuya Taniguchi
- Center for Data Science, Waseda University, 1-6-1 Nishiwaseda, Shinjuku-ku, Tokyo 169-8050, Japan
- Mayuko Hosokawa
- Department of Advanced Science and Engineering, Graduate School of Advanced Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-Ku, Tokyo 169-8555, Japan
- Toru Asahi
- Department of Advanced Science and Engineering, Graduate School of Advanced Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-Ku, Tokyo 169-8555, Japan
12
Holm S, Unzueta PA, Thompson K, Martínez TJ. Single-Point Extrapolation to the Complete Basis Set Limit through Deep Learning. J Chem Theory Comput 2023. [PMID: 37192428 DOI: 10.1021/acs.jctc.2c01298] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/18/2023]
Abstract
Machine learning (ML) offers an attractive method for making predictions about molecular systems while circumventing the need to run expensive electronic structure calculations. Once trained on ab initio data, the promise of ML is to deliver accurate predictions of molecular properties that were previously computationally infeasible. In this work, we develop and train a graph neural network model to correct the basis set incompleteness error (BSIE) between a small and large basis set at the RHF and B3LYP levels of theory. Our results show that, when compared to fitting to the total potential, an ML model fitted to correct the BSIE is better at generalizing to systems not seen during training. We test this ability by training on single molecules while evaluating on molecular complexes. We also show that ensemble models yield better behaved potentials in situations where the training data is insufficient. However, even when only fitting to the BSIE, acceptable performance is only achieved when the training data sufficiently resemble the systems one wants to make predictions on. The test error of the final model trained to predict the difference between the cc-pVDZ and cc-pV5Z potential is 0.184 kcal/mol for the B3LYP density functional, and the ensemble model accurately reproduces the large basis set interaction energy curves on the S66x8 dataset.
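The idea of fitting the BSIE rather than the total potential can be sketched as follows: the small-basis energy is kept as computed, only a learned correction is added, and an ensemble supplies both the mean correction and its spread. This is a minimal illustration, not the authors' graph neural network; `corrected_energy`, the toy correction models, and all numbers are hypothetical.

```python
from statistics import mean, stdev

def corrected_energy(e_small, descriptor, ensemble):
    """Return (estimate, spread): the small-basis energy plus the ensemble-mean
    correction, and the standard deviation across ensemble members."""
    corrections = [model(descriptor) for model in ensemble]
    return e_small + mean(corrections), stdev(corrections)

# Hypothetical stand-ins for trained correction models (not real fits).
toy_ensemble = [
    lambda x: -0.10 * x,
    lambda x: -0.12 * x,
    lambda x: -0.11 * x,
]

# Hypothetical small-basis energy and scalar descriptor, in arbitrary units.
estimate, spread = corrected_energy(-76.0, 2.0, toy_ensemble)
```

A large spread between ensemble members is one simple signal that a system lies outside the training data, which matches the abstract's caution that performance depends on how closely the training data resemble the target systems.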
Affiliation(s)
- Soren Holm
- Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, United States
- SLAC National Accelerator Laboratory, Menlo Park, California 94024, United States
- Pablo A Unzueta
- Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, United States
- SLAC National Accelerator Laboratory, Menlo Park, California 94024, United States
- Keiran Thompson
- Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, United States
- SLAC National Accelerator Laboratory, Menlo Park, California 94024, United States
- Todd J Martínez
- Department of Chemistry and The PULSE Institute, Stanford University, Stanford, California 94305, United States
- SLAC National Accelerator Laboratory, Menlo Park, California 94024, United States
13
Goldman N, Fried LE, Lindsey RK, Pham CH, Dettori R. Enhancing the accuracy of density functional tight binding models through ChIMES many-body interaction potentials. J Chem Phys 2023; 158:144112. [PMID: 37061479 DOI: 10.1063/5.0141616] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 04/17/2023] Open
Abstract
Semi-empirical quantum models such as Density Functional Tight Binding (DFTB) are attractive methods for obtaining quantum simulation data at longer time and length scales than possible with standard approaches. However, application of these models can require lengthy effort due to the lack of a systematic approach for their development. In this work, we discuss the use of the Chebyshev Interaction Model for Efficient Simulation (ChIMES) to create rapidly parameterized DFTB models, which exhibit strong transferability due to the inclusion of many-body interactions that might otherwise be inaccurate. We apply our modeling approach to silicon polymorphs and review previous work on titanium hydride. We also review the creation of a general purpose DFTB/ChIMES model for organic molecules and compounds that approaches hybrid functional and coupled cluster accuracy with two orders of magnitude fewer parameters than similar neural network approaches. In all cases, DFTB/ChIMES yields similar accuracy to the underlying quantum method with orders of magnitude improvement in computational cost. Our developments provide a way to create computationally efficient and highly accurate simulations over varying extreme thermodynamic conditions, where physical and chemical properties can be difficult to interrogate directly, and there is historically a significant reliance on theoretical approaches for interpretation and validation of experimental results.
Affiliation(s)
- Nir Goldman
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
- Laurence E Fried
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
- Rebecca K Lindsey
- Department of Chemical Engineering, University of Michigan, Ann Arbor, Michigan 48109, USA
- C Huy Pham
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
- R Dettori
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, USA
14
Cersonsky RK, Pakhnova M, Engel EA, Ceriotti M. A data-driven interpretation of the stability of organic molecular crystals. Chem Sci 2023; 14:1272-1285. [PMID: 36756329 PMCID: PMC9891366 DOI: 10.1039/d2sc06198h] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2022] [Accepted: 12/06/2022] [Indexed: 01/17/2023] Open
Abstract
Due to the subtle balance of intermolecular interactions that govern structure-property relations, predicting the stability of crystal structures formed from molecular building blocks is a highly non-trivial scientific problem. A particularly active and fruitful approach involves classifying the different combinations of interacting chemical moieties, as understanding the relative energetics of different interactions enables the design of molecular crystals and fine-tuning of their stabilities. While this is usually performed based on the empirical observation of the most commonly encountered motifs in known crystal structures, we propose to apply a combination of supervised and unsupervised machine-learning techniques to automate the construction of an extensive library of molecular building blocks. We introduce a structural descriptor tailored to the prediction of the binding (lattice) energy and apply it to a curated dataset of organic crystals, exploiting its atom-centered nature to obtain a data-driven assessment of the contribution of different chemical groups to the lattice energy of the crystal. We then interpret this library using a low-dimensional representation of the structure-energy landscape and discuss selected examples of the insights into crystal engineering that can be extracted from this analysis, providing a complete database to guide the design of molecular materials.
Affiliation(s)
- Rose K Cersonsky
- Laboratory of Computational Science and Modeling (COSMO), École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Maria Pakhnova
- Laboratory of Computational Science and Modeling (COSMO), École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Edgar A Engel
- TCM Group, Trinity College, Cambridge University, Cambridge, UK
| | - Michele Ceriotti
- Laboratory of Computational Science and Modeling (COSMO), École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| |
|
15
|
McSloy A, Fan G, Sun W, Hölzer C, Friede M, Ehlert S, Schütte NE, Grimme S, Frauenheim T, Aradi B. TBMaLT, a flexible toolkit for combining tight-binding and machine learning. J Chem Phys 2023; 158:034801. [PMID: 36681630 DOI: 10.1063/5.0132892] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 01/03/2023] Open
Abstract
Tight-binding approaches, especially the Density Functional Tight-Binding (DFTB) and the extended tight-binding schemes, allow for efficient quantum mechanical simulations of large systems and long time scales. They are derived from ab initio density functional theory using pragmatic approximations and some empirical terms, ensuring a fine balance between speed and accuracy. Their accuracy can be improved by tuning the empirical parameters using machine learning techniques, especially when information about the local environment of the atoms is incorporated. As the significant quantum mechanical contributions are still provided by the tight-binding models, and only short-ranged corrections are fitted, the learning procedure is typically shorter and more transferable than it would be when predicting the quantum mechanical properties directly with machine learning without an underlying physically motivated model. As a further advantage, derived quantum mechanical quantities can be calculated based on the tight-binding model without the need for additional learning. We have developed the open-source framework Tight-Binding Machine Learning Toolkit (TBMaLT), which allows the easy implementation of such combined approaches. The toolkit currently contains layers for the DFTB method and an interface to the GFN1-xTB Hamiltonian, but due to its modular structure and its well-defined interfaces, additional atom-based schemes can be implemented easily. We discuss the general structure of the framework, some essential implementation details, and several proof-of-concept applications demonstrating the perspectives of the combined methods and the functionality of the toolkit.
Affiliation(s)
- A McSloy
- Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
| | - G Fan
- Bremen Center of Computational Materials Science, University of Bremen, 28359 Bremen, Germany
| | - W Sun
- Bremen Center of Computational Materials Science, University of Bremen, 28359 Bremen, Germany
| | - C Hölzer
- Mulliken Center for Theoretical Chemistry, University of Bonn, 53115 Bonn, Germany
| | - M Friede
- Mulliken Center for Theoretical Chemistry, University of Bonn, 53115 Bonn, Germany
| | - S Ehlert
- Mulliken Center for Theoretical Chemistry, University of Bonn, 53115 Bonn, Germany
| | - N-E Schütte
- Bremen Center of Computational Materials Science, University of Bremen, 28359 Bremen, Germany
| | - S Grimme
- Mulliken Center for Theoretical Chemistry, University of Bonn, 53115 Bonn, Germany
| | - T Frauenheim
- Bremen Center of Computational Materials Science, University of Bremen, 28359 Bremen, Germany
| | - B Aradi
- Bremen Center of Computational Materials Science, University of Bremen, 28359 Bremen, Germany
| |
|
16
|
Petsev ND, Nikoubashman A, Latinwo F, Stillinger FH, Debenedetti PG. Crystal Prediction via Genetic Algorithms in a Model Chiral System. J Phys Chem B 2022; 126:7771-7780. [PMID: 36162405 DOI: 10.1021/acs.jpcb.2c04501] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Chiral crystals and their constituent molecules play a prominent role in theories about the origin of biological homochirality and in drug discovery, design, and stability. Although the prediction and identification of stable chiral crystal structures is crucial for numerous technologies, including separation processes and polymorph selection and control, predictive ability is often complicated by a combination of many-body interactions and molecular complexity and handedness. In this work, we address these challenges by applying genetic algorithms to predict the ground-state crystal lattices formed by a chiral tetramer molecular model, which we have previously shown to exhibit complex fluid-phase behavior. Using this approach, we explore the relative stability and structures of the model's conglomerate and racemic crystals, and present a structural phase diagram for the stable Bravais crystal types in the zero-temperature limit.
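For readers unfamiliar with the genetic-algorithm search mentioned in this abstract, the loop of selection, crossover, and mutation can be sketched as follows. This is only an illustrative toy under stated assumptions: `toy_energy` is a hypothetical two-parameter stand-in for a lattice energy, and the population size, mutation width, and selection scheme are arbitrary choices, not those of the cited work.

```python
import random

def toy_energy(params):
    """Hypothetical stand-in for a lattice energy; minimum 0 at (1.0, 0.5)."""
    a, b = params
    return (a - 1.0) ** 2 + (b - 0.5) ** 2

def genetic_minimize(energy, n_params=2, pop_size=30, generations=40, seed=0):
    """Minimize `energy` with truncation selection, crossover, and mutation."""
    rng = random.Random(seed)
    population = [[rng.uniform(-2.0, 2.0) for _ in range(n_params)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=energy)             # lowest energy first
        history.append(energy(population[0]))
        parents = population[: pop_size // 2]   # truncation selection
        children = [population[0][:]]           # elitism: keep the best unchanged
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            # Uniform crossover followed by Gaussian mutation of each gene.
            child = [rng.choice(pair) + rng.gauss(0.0, 0.1)
                     for pair in zip(p1, p2)]
            children.append(child)
        population = children
    best = min(population, key=energy)
    history.append(energy(best))
    return best, history

best, history = genetic_minimize(toy_energy)
```

Because the best candidate is copied unchanged into each new generation, the recorded best energy can never increase from one generation to the next.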
Affiliation(s)
- Nikolai D Petsev
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States
| | - Arash Nikoubashman
- Institute of Physics, Johannes Gutenberg University Mainz, Staudingerweg 7, 55128 Mainz, Germany
| | - Folarin Latinwo
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States; Synopsys Inc., Austin, Texas 78746, United States
| | - Frank H Stillinger
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, United States
| | - Pablo G Debenedetti
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States
| |
|
17
|
Mattei A, Hong RS, Dietrich H, Firaha D, Helfferich J, Liu YM, Sasikumar K, Abraham NS, Miglani Bhardwaj R, Neumann MA, Sheikh AY. Efficient Crystal Structure Prediction for Structurally Related Molecules with Accurate and Transferable Tailor-Made Force Fields. J Chem Theory Comput 2022; 18:5725-5738. [PMID: 35930763 PMCID: PMC9476662 DOI: 10.1021/acs.jctc.2c00451] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/06/2023]
Abstract
Crystal structure prediction (CSP) is generally used to complement experimental solid form screening and applied to individual molecules in drug development. The fast development of algorithms and computing resources offers the opportunity to use CSP earlier and for a broader range of applications in the drug design cycle. This study presents a novel paradigm of CSP specifically designed for structurally related molecules, referred to as Quick-CSP. The approach prioritizes more accurate physics through robust and transferable tailor-made force fields (TMFFs), such that significant efficiency gains are achieved through the reduction of expensive ab initio calculations. The accuracy of the TMFF is increased by the introduction of electrostatic multipoles, and the fragment-based force field parameterization scheme is demonstrated to be transferable for a family of chemically related molecules. The protocol is benchmarked with structurally related compounds from the Bromodomain and Extraterminal (BET) domain inhibitors series. A new convergence criterion is introduced that aims at performing only as many ab initio optimizations of crystal structures as required to locate the bottom of the crystal energy landscape within a user-defined accuracy. The overall approach provides significant cost savings ranging from three- to eight-fold less than the full-CSP workflow. The reported advancements expand the scope and utility of the underlying CSP building blocks as well as their novel reassembly to other applications earlier in the drug design cycle to guide molecule design and selection.
Affiliation(s)
- Alessandra Mattei
- Solid State Chemistry, Research & Development, AbbVie Inc., 1 N Waukegan Road, North Chicago, Illinois 60064, United States
| | - Richard S Hong
- Solid State Chemistry, Research & Development, AbbVie Inc., 1 N Waukegan Road, North Chicago, Illinois 60064, United States
| | - Hanno Dietrich
- Avant-garde Materials Simulation, GmbH, Alte Str. 2, 79249 Merzhausen, Germany
| | - Dzmitry Firaha
- Avant-garde Materials Simulation, GmbH, Alte Str. 2, 79249 Merzhausen, Germany
| | - Julian Helfferich
- Avant-garde Materials Simulation, GmbH, Alte Str. 2, 79249 Merzhausen, Germany
| | - Yifei Michelle Liu
- Avant-garde Materials Simulation, GmbH, Alte Str. 2, 79249 Merzhausen, Germany
| | - Kiran Sasikumar
- Avant-garde Materials Simulation, GmbH, Alte Str. 2, 79249 Merzhausen, Germany
| | - Nathan S Abraham
- Solid State Chemistry, Research & Development, AbbVie Inc., 1 N Waukegan Road, North Chicago, Illinois 60064, United States
| | - Rajni Miglani Bhardwaj
- Solid State Chemistry, Research & Development, AbbVie Inc., 1 N Waukegan Road, North Chicago, Illinois 60064, United States
| | - Marcus A Neumann
- Avant-garde Materials Simulation, GmbH, Alte Str. 2, 79249 Merzhausen, Germany
| | - Ahmad Y Sheikh
- Solid State Chemistry, Research & Development, AbbVie Inc., 1 N Waukegan Road, North Chicago, Illinois 60064, United States
| |
|
18
|
Farrar EHE, Grayson MN. Machine learning and semi-empirical calculations: a synergistic approach to rapid, accurate, and mechanism-based reaction barrier prediction. Chem Sci 2022; 13:7594-7603. [PMID: 35872815 PMCID: PMC9242013 DOI: 10.1039/d2sc02925a] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 05/25/2022] [Accepted: 06/08/2022] [Indexed: 11/21/2022] Open
Abstract
Modern QM modelling methods, such as DFT, have provided detailed mechanistic insights into countless reactions. However, their computational cost inhibits their ability to rapidly screen large numbers of substrates and catalysts in reaction discovery. For a C-C bond forming nitro-Michael addition, we introduce a synergistic semi-empirical quantum mechanical (SQM) and machine learning (ML) approach that allows the prediction of DFT-quality reaction barriers in minutes, even on a standard laptop using widely available modelling software. Mean absolute errors (MAEs) are obtained that are below the accepted chemical accuracy threshold of 1 kcal mol⁻¹ and substantially better than SQM methods without ML correction (5.71 kcal mol⁻¹). Predictive power is shown to hold when the ML models are applied to an unseen set of compounds from the toxicology literature. Mechanistic insight is also achieved via the generation of full SQM transition state (TS) structures which are found to be very good approximations for the DFT-level geometries, revealing important steric interactions in some TSs. This combination of speed, accuracy, and mechanistic insight is unprecedented; current ML barrier models compromise on at least one of these important criteria.
Affiliation(s)
- Elliot H E Farrar
- Department of Chemistry, University of Bath, Claverton Down, Bath BA2 7AY, UK
| | - Matthew N Grayson
- Department of Chemistry, University of Bath, Claverton Down, Bath BA2 7AY, UK
| |
|
19
|
Xiouras C, Cameli F, Quilló GL, Kavousanakis ME, Vlachos DG, Stefanidis GD. Applications of Artificial Intelligence and Machine Learning Algorithms to Crystallization. Chem Rev 2022; 122:13006-13042. [PMID: 35759465 DOI: 10.1021/acs.chemrev.2c00141] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/28/2022]
Abstract
Artificial intelligence and specifically machine learning applications are nowadays used in a variety of scientific applications and cutting-edge technologies, where they have a transformative impact. Such an assembly of statistical and linear algebra methods making use of large data sets is becoming more and more integrated into chemistry and crystallization research workflows. This review aims to present, for the first time, a holistic overview of machine learning and cheminformatics applications as a novel, powerful means to accelerate the discovery of new crystal structures, predict key properties of organic crystalline materials, simulate, understand, and control the dynamics of complex crystallization process systems, as well as contribute to high throughput automation of chemical process development involving crystalline materials. We critically review the advances in these new, rapidly emerging research areas, raising awareness of issues such as the bridging of machine learning models with first-principles mechanistic models, data set size, structure, and quality, as well as the selection of appropriate descriptors. At the same time, we propose future research at the interface of applied mathematics, chemistry, and crystallography. Overall, this review aims to increase the adoption of such methods and tools by chemists and scientists across industry and academia.
Affiliation(s)
- Christos Xiouras
- Chemical Process R&D, Crystallization Technology Unit, Janssen R&D, Turnhoutseweg 30, 2340 Beerse, Belgium
| | - Fabio Cameli
- Department of Chemical and Biomolecular Engineering, University of Delaware, 150 Academy Street, Newark, Delaware 19716, United States
| | - Gustavo Lunardon Quilló
- Chemical Process R&D, Crystallization Technology Unit, Janssen R&D, Turnhoutseweg 30, 2340 Beerse, Belgium; Chemical and BioProcess Technology and Control, Department of Chemical Engineering, Faculty of Engineering Technology, KU Leuven, Gebroeders de Smetstraat 1, 9000 Ghent, Belgium
| | - Mihail E Kavousanakis
- School of Chemical Engineering, National Technical University of Athens, Heroon Polytechniou 9, 15780 Zografou, Greece
| | - Dionisios G Vlachos
- Department of Chemical and Biomolecular Engineering, University of Delaware, 150 Academy Street, Newark, Delaware 19716, United States
| | - Georgios D Stefanidis
- School of Chemical Engineering, National Technical University of Athens, Heroon Polytechniou 9, 15780 Zografou, Greece; Laboratory for Chemical Technology, Ghent University, Tech Lane Ghent Science Park 125, B-9052 Ghent, Belgium
| |
|
20
|
Wengert S, Csányi G, Reuter K, Margraf JT. A Hybrid Machine Learning Approach for Structure Stability Prediction in Molecular Co-crystal Screenings. J Chem Theory Comput 2022; 18:4586-4593. [PMID: 35709378 PMCID: PMC9281391 DOI: 10.1021/acs.jctc.2c00343] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 11/29/2022]
Abstract
Co-crystals are a highly interesting material class as varying their components and stoichiometry in principle allows tuning supramolecular assemblies toward desired physical properties. The in silico prediction of co-crystal structures represents a daunting task, however, as they span a vast search space and usually feature large unit cells. This requires theoretical models that are accurate and fast to evaluate, a combination that can in principle be accomplished by modern machine-learned (ML) potentials trained on first-principles data. Crucially, these ML potentials need to account for the description of long-range interactions, which are essential for the stability and structure of molecular crystals. In this contribution, we present a strategy for developing Δ-ML potentials for co-crystals, which use a physical baseline model to describe long-range interactions. The applicability of this approach is demonstrated for co-crystals of variable composition consisting of an active pharmaceutical ingredient and various co-formers. We find that the Δ-ML approach offers a strong and consistent improvement over the density functional tight binding baseline. Importantly, this even holds true when extrapolating beyond the scope of the training set, for instance in molecular dynamics simulations under ambient conditions.
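The Δ-ML idea summarized in this abstract, a physical baseline plus a learned correction fitted to the difference between reference and baseline energies, can be illustrated with a deliberately small sketch. Everything here is an assumption made for illustration: the baseline and reference functions are hypothetical one-dimensional toys, and the "ML" part is a one-descriptor least-squares fit rather than the potentials used in the cited work.

```python
import math

def baseline_energy(r):
    """Hypothetical physical baseline: long-range, dispersion-like tail."""
    return -1.0 / r ** 6

def reference_energy(r):
    """Hypothetical first-principles target: baseline plus a short-range term."""
    return baseline_energy(r) + 2.0 * math.exp(-2.0 * r)

def fit_linear(xs, ys):
    """Closed-form least squares for y ~ slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

train_r = [1.0 + 0.1 * i for i in range(31)]       # distances 1.0 ... 4.0
descriptor = [math.exp(-r) for r in train_r]       # short-ranged feature
# The model is trained on the residual (reference minus baseline) only.
residual = [reference_energy(r) - baseline_energy(r) for r in train_r]
slope, intercept = fit_linear(descriptor, residual)

def delta_ml_energy(r):
    """Baseline plus learned short-range correction."""
    return baseline_energy(r) + slope * math.exp(-r) + intercept

def rmse(model):
    errors = [(model(r) - reference_energy(r)) ** 2 for r in train_r]
    return math.sqrt(sum(errors) / len(errors))

rmse_baseline = rmse(baseline_energy)
rmse_delta = rmse(delta_ml_energy)
```

The long-range tail never has to be learned, since it is carried entirely by the baseline; the fit only has to capture the short-range remainder.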
Affiliation(s)
- Simon Wengert
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195 Berlin, Germany; Chair of Theoretical Chemistry, Technische Universität München, 85747 Garching, Germany
| | - Gábor Csányi
- Engineering Laboratory, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
| | - Karsten Reuter
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195 Berlin, Germany
| | - Johannes T Margraf
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, Faradayweg 4-6, 14195 Berlin, Germany
| |
|
21
|
Pham CH, Lindsey RK, Fried LE, Goldman N. High-Accuracy Semiempirical Quantum Models Based on a Minimal Training Set. J Phys Chem Lett 2022; 13:2934-2942. [PMID: 35343698 DOI: 10.1021/acs.jpclett.2c00453] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 06/14/2023]
Abstract
A great need exists for computationally efficient quantum simulation approaches that can achieve an accuracy similar to high-level theories at a fraction of the computational cost. In this regard, we have leveraged a machine-learned interaction potential based on Chebyshev polynomials to improve density functional tight binding (DFTB) models for organic materials. The benefit of our approach is two-fold: (1) many-body interactions can be corrected for in a systematic and rapidly tunable process, and (2) high-level quantum accuracy for a broad range of compounds can be achieved with ∼0.3% of data required for one advanced deep learning potential. Our model exhibits both transferability and extensibility through comparison to quantum chemical results for organic clusters, solid carbon phases, and molecular crystal phase stability rankings. Our efforts thus allow for high-throughput physical and chemical predictions with up to coupled-cluster accuracy for systems that are computationally intractable with standard approaches.
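As a rough illustration of what a Chebyshev-polynomial representation of an interaction looks like, the sketch below expands a pair potential in Chebyshev polynomials over a distance window. It is an assumption-laden toy: the Morse-type `pair_potential`, its parameters, the distance window, and the use of interpolation at Chebyshev nodes are all chosen here for illustration and are not taken from the cited model, which is fitted to quantum-mechanical data and includes many-body terms.

```python
import math

def pair_potential(r, depth=1.0, width=1.5, r0=1.2):
    """Hypothetical Morse-type pair interaction used as the fitting target."""
    return depth * ((1.0 - math.exp(-width * (r - r0))) ** 2 - 1.0)

def cheb_coefficients(func, r_min, r_max, order):
    """Coefficients of the Chebyshev interpolant of func on [r_min, r_max]."""
    n = order + 1
    angles = [math.pi * (j + 0.5) / n for j in range(n)]
    # Chebyshev nodes x = cos(angle), mapped from [-1, 1] to [r_min, r_max].
    values = [func(0.5 * (r_max - r_min) * math.cos(t) + 0.5 * (r_max + r_min))
              for t in angles]
    # Discrete orthogonality: T_k(cos t) = cos(k t).
    coeffs = [2.0 / n * sum(v * math.cos(k * t) for v, t in zip(values, angles))
              for k in range(n)]
    coeffs[0] *= 0.5
    return coeffs

def cheb_eval(coeffs, r, r_min, r_max):
    """Evaluate the expansion at r using the three-term recurrence."""
    x = (2.0 * r - r_min - r_max) / (r_max - r_min)   # map r to [-1, 1]
    t_prev, t_curr = 1.0, x
    total = coeffs[0]
    for c in coeffs[1:]:
        total += c * t_curr
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return total

R_MIN, R_MAX = 0.8, 3.0
coeffs = cheb_coefficients(pair_potential, R_MIN, R_MAX, order=12)
test_r = [R_MIN + (R_MAX - R_MIN) * i / 199 for i in range(200)]
max_error = max(abs(cheb_eval(coeffs, r, R_MIN, R_MAX) - pair_potential(r))
                for r in test_r)
```

The energy is linear in the expansion coefficients, which is one reason such a form can be fitted and adjusted quickly.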
Affiliation(s)
- Cong Huy Pham
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
| | - Rebecca K Lindsey
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
| | - Laurence E Fried
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
| | - Nir Goldman
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California 94550, United States
- Department of Chemical Engineering, University of California, Davis, California 95616, United States
| |
|
22
|
Habershon S. Program Synthesis of Sparse Algorithms for Wave Function and Energy Prediction in Grid-Based Quantum Simulations. J Chem Theory Comput 2022; 18:2462-2478. [PMID: 35293216 PMCID: PMC9009083 DOI: 10.1021/acs.jctc.2c00035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
We have recently shown how program synthesis (PS), or the concept of "self-writing code", can generate novel algorithms that solve the vibrational Schrödinger equation, providing approximations to the allowed wave functions for bound, one-dimensional (1-D) potential energy surfaces (PESs). The resulting algorithms use a grid-based representation of the underlying wave function ψ(x) and PES V(x), providing codes which represent approximations to standard discrete variable representation (DVR) methods. In this Article, we show how this inductive PS strategy can be improved and modified to enable prediction of both vibrational wave functions and energy eigenvalues of representative model PESs (both 1-D and multidimensional). We show that PS can generate algorithms that offer some improvements in energy eigenvalue accuracy over standard DVR schemes; however, we also demonstrate that PS can identify accurate numerical methods that exhibit desirable computational features, such as employing very sparse (tridiagonal) matrices. The resulting PS-generated algorithms are initially developed and tested for 1-D vibrational eigenproblems, before solution of multidimensional problems is demonstrated; we find that our new PS-generated algorithms can reduce calculation times for grid-based eigenvector computation by an order of magnitude or more. More generally, with further development and optimization, we anticipate that PS-generated algorithms based on effective Hamiltonian approximations, such as those proposed here, could be useful in direct simulations of quantum dynamics via wave function propagation and evaluation of molecular electronic structure.
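To make concrete why a very sparse (tridiagonal) grid Hamiltonian is computationally attractive, the sketch below solves a 1-D vibrational problem with a tridiagonal matrix. It uses a standard second-order finite-difference Hamiltonian and Sturm-sequence bisection, not the PS-generated algorithms of the cited article; the harmonic potential, grid size, and reduced units (ħ = m = ω = 1) are assumptions chosen so the exact levels n + 1/2 are known.

```python
def count_below(diag, off, lam):
    """Count eigenvalues of a symmetric tridiagonal matrix that lie below lam."""
    count, q = 0, 1.0
    for i, d in enumerate(diag):
        coupling = off[i - 1] ** 2 if i > 0 else 0.0
        q = d - lam - coupling / q        # Sturm-sequence (LDL^T pivot) recurrence
        if q == 0.0:
            q = 1e-30                     # avoid division by zero on the next step
        if q < 0.0:
            count += 1
    return count

def eigenvalue(diag, off, k, iterations=100):
    """k-th lowest eigenvalue (k = 0 is the ground state) by bisection."""
    bound = 2.0 * max(abs(e) for e in off)
    lo, hi = min(diag) - bound, max(diag) + bound   # Gershgorin bounds
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if count_below(diag, off, mid) > k:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Grid representation of H = -1/2 d^2/dx^2 + x^2/2.
n, x_max = 321, 8.0
dx = 2.0 * x_max / (n - 1)
grid = [-x_max + i * dx for i in range(n)]
diag = [1.0 / dx ** 2 + 0.5 * x * x for x in grid]   # kinetic + potential
off = [-0.5 / dx ** 2] * (n - 1)                     # nearest-neighbour coupling

e0 = eigenvalue(diag, off, 0)
e1 = eigenvalue(diag, off, 1)
```

Each bisection step costs only one pass over the diagonal and off-diagonal entries, so the work scales linearly with the number of grid points, whereas a dense eigensolver scales cubically.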
Affiliation(s)
- Scott Habershon
- Department of Chemistry, University of Warwick, Coventry, CV4 7AL, United Kingdom
| |
|
23
|
Beran GJO, Wright SE, Greenwell C, Cruz-Cabeza AJ. The interplay of intra- and intermolecular errors in modeling conformational polymorphs. J Chem Phys 2022; 156:104112. [DOI: 10.1063/5.0088027] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 01/26/2023] Open
Abstract
Conformational polymorphs of organic molecular crystals represent a challenging test for quantum chemistry because they require careful balancing of the intra- and intermolecular interactions. This study examines 54 molecular conformations from 20 sets of conformational polymorphs, along with the relative lattice energies and 173 dimer interactions taken from six of the polymorph sets. These systems are studied with a variety of van der Waals-inclusive density functional theory models; dispersion-corrected spin-component-scaled second-order Møller–Plesset perturbation theory (SCS-MP2D); and domain local pair natural orbital coupled cluster singles, doubles, and perturbative triples [DLPNO-CCSD(T)]. We investigate how delocalization error in conventional density functionals impacts monomer conformational energies, systematic errors in the intermolecular interactions, and the nature of error cancellation that occurs in the overall crystal. The density functionals B86bPBE-XDM, PBE-D4, PBE-MBD, PBE0-D4, and PBE0-MBD are found to exhibit sizable one-body and two-body errors vs DLPNO-CCSD(T) benchmarks, and the level of success in predicting the relative polymorph energies relies heavily on error cancellation between different types of intermolecular interactions or between intra- and intermolecular interactions. The SCS-MP2D and, to a lesser extent, ωB97M-V models exhibit smaller errors and rely less on error cancellation. Implications for crystal structure prediction of flexible compounds are discussed. Finally, the one-body and two-body DLPNO-CCSD(T) energies taken from these conformational polymorphs establish the CP1b and CP2b benchmark datasets that could be useful for testing quantum chemistry models in challenging real-world systems with complex interplay between intra- and intermolecular interactions, a number of which are significantly impacted by delocalization error.
Affiliation(s)
- Gregory J. O. Beran
- Department of Chemistry, University of California, Riverside, California 92521, USA
| | - Sarah E. Wright
- Department of Chemical Engineering and Analytical Science, University of Manchester, Manchester, United Kingdom
| | - Chandler Greenwell
- Department of Chemistry, University of California, Riverside, California 92521, USA
| | - Aurora J. Cruz-Cabeza
- Department of Chemical Engineering and Analytical Science, University of Manchester, Manchester, United Kingdom
| |
|
24
|
Calcinelli F, Jeindl A, Hörmann L, Ghan S, Oberhofer H, Hofmann OT. Interfacial Charge Transfer Influences Thin-Film Polymorphism. THE JOURNAL OF PHYSICAL CHEMISTRY. C, NANOMATERIALS AND INTERFACES 2022; 126:2868-2876. [PMID: 35178141 PMCID: PMC8842301 DOI: 10.1021/acs.jpcc.1c09986] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2021] [Revised: 01/12/2022] [Indexed: 05/05/2023]
Abstract
The structure and chemical composition are the key parameters influencing the properties of organic thin films deposited on inorganic substrates. Such films often display structures that substantially differ from the bulk, and the substrate has a relevant influence on their polymorphism. In this work, we illuminate the role of the substrate by studying its influence on para-benzoquinone on two different substrates, Ag(111) and graphene. We employ a combination of first-principles calculations and machine learning to identify the energetically most favorable structures on both substrates and study their electronic properties. Our results indicate that for the first layer, similar structures are favorable for both substrates. For the second layer, we find two significantly different structures. Interestingly, graphene favors the one with less, while Ag favors the one with more electronic coupling. We explain this switch in stability as an effect of the different charge transfer on the two substrates.
Affiliation(s)
- Fabio Calcinelli
- Institute of Solid State Physics, Graz University of Technology, 8010 Graz, Austria
| | - Andreas Jeindl
- Institute of Solid State Physics, Graz University of Technology, 8010 Graz, Austria
| | - Lukas Hörmann
- Institute of Solid State Physics, Graz University of Technology, 8010 Graz, Austria
| | - Simiam Ghan
- Chair for Theoretical Chemistry and Catalysis Research Center, Technical University Munich, 85748 Garching, Germany
| | - Harald Oberhofer
- Chair for Theoretical Chemistry and Catalysis Research Center, Technical University Munich, 85748 Garching, Germany
- Chair for Theoretical Physics VII and Bavarian Center for Battery Technology (BayBatt), University of Bayreuth, Universitätsstraße 30, 95447 Bayreuth, Germany
| | - Oliver T. Hofmann
- Institute of Solid State Physics, Graz University of Technology, 8010 Graz, Austria
| |
|
25
|
A complete description of thermodynamic stabilities of molecular crystals. Proc Natl Acad Sci U S A 2022; 119:2111769119. [PMID: 35131847 PMCID: PMC8832981 DOI: 10.1073/pnas.2111769119] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Accepted: 12/23/2021] [Indexed: 12/27/2022] Open
Abstract
Predicting stable polymorphs of molecular crystals remains one of the grand challenges of computational science. Current methods invoke approximations to electronic structure and statistical mechanics and thus fail to consistently reproduce the delicate balance of physical effects determining thermodynamic stability. We compute the rigorous ab initio Gibbs free energies for competing polymorphs of paradigmatic compounds, using machine learning to mitigate costs. The accurate description of electronic structure and full treatment of quantum statistical mechanics allow us to predict the experimentally observed phase behavior. This constitutes a key step toward the first-principles design of functional materials for applications from photovoltaics to pharmaceuticals. Predictions of relative stabilities of (competing) molecular crystals are of great technological relevance, most notably for the pharmaceutical industry. However, they present a long-standing challenge for modeling, as often minuscule free energy differences are sensitively affected by the description of electronic structure, the statistical mechanics of the nuclei and the cell, and thermal expansion. The importance of these effects has been individually established, but rigorous free energy calculations for general molecular compounds, which simultaneously account for all effects, have hitherto not been computationally viable. Here we present an efficient “end to end” framework that seamlessly combines state-of-the-art electronic structure calculations, machine-learning potentials, and advanced free energy methods to calculate ab initio Gibbs free energies for general organic molecular materials.
The facile generation of machine-learning potentials for a diverse set of polymorphic compounds—benzene, glycine, and succinic acid—and predictions of thermodynamic stabilities in qualitative and quantitative agreement with experiments highlight that predictive thermodynamic studies of industrially relevant molecular materials are no longer a daunting task.
|
26
|
Bissuel D, Albaret T, Niehaus TA. Critical assessment of machine-learned repulsive potentials for the Density Functional based Tight-Binding method: a case study for pure silicon. J Chem Phys 2022; 156:064101. [DOI: 10.1063/5.0081159] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
|
27
|
Dudek MK, Druzbicki K. Along the road to Crystal Structure Prediction (CSP) of pharmaceutical-like molecules. CrystEngComm 2022. [DOI: 10.1039/d1ce01564h] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/21/2022]
Abstract
Computational methods used for predicting crystal structures of organic compounds are mature enough to be routinely used with many rigid and semi-rigid organic molecules. The usefulness of Crystal Structure Prediction...
|
28
|
Deringer VL, Bartók AP, Bernstein N, Wilkins DM, Ceriotti M, Csányi G. Gaussian Process Regression for Materials and Molecules. Chem Rev 2021; 121:10073-10141. [PMID: 34398616 PMCID: PMC8391963 DOI: 10.1021/acs.chemrev.1c00022] [Citation(s) in RCA: 232] [Impact Index Per Article: 77.3] [Received: 01/08/2021] [Indexed: 12/18/2022]
Abstract
We provide an introduction to Gaussian process regression (GPR) machine-learning methods in computational materials science and chemistry. The focus of the present review is on the regression of atomistic properties: in particular, on the construction of interatomic potentials, or force fields, in the Gaussian Approximation Potential (GAP) framework; beyond this, we also discuss the fitting of arbitrary scalar, vectorial, and tensorial quantities. Methodological aspects of reference data generation, representation, and regression, as well as the question of how a data-driven model may be validated, are reviewed and critically discussed. A survey of applications to a variety of research questions in chemistry and materials science illustrates the rapid growth in the field. A vision is outlined for the development of the methodology in the years to come.
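The core regression step of GPR can be shown in a few lines: build a kernel matrix over the training inputs, solve a regularized linear system for the weights, and predict as a kernel-weighted sum. The sketch below is a generic textbook posterior mean under assumptions made here (a squared-exponential kernel with unit length scale, a small noise term, and a 1-D sine function as toy data); it is not the GAP framework or its descriptors.

```python
import math

def kernel(a, b, length=1.0):
    """Squared-exponential covariance between two scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low

def solve_cholesky(low, b):
    """Solve (low low^T) x = b by forward and back substitution."""
    n = len(b)
    z = [0.0] * n
    for i in range(n):
        z[i] = (b[i] - sum(low[i][k] * z[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (z[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x

train_x = [0.9 * i for i in range(8)]        # toy training inputs
train_y = [math.sin(x) for x in train_x]     # toy training targets
noise = 1e-6                                 # assumed regularization / noise variance

gram = [[kernel(a, b) + (noise if i == j else 0.0)
         for j, b in enumerate(train_x)]
        for i, a in enumerate(train_x)]
weights = solve_cholesky(cholesky(gram), train_y)

def gp_mean(x_new):
    """Posterior mean: kernel-weighted sum over the training points."""
    return sum(w * kernel(x_new, x) for w, x in zip(weights, train_x))
```

With such a small noise term the posterior mean passes almost exactly through the training data; a larger value would trade that fidelity for smoothness.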
Affiliation(s)
- Volker L. Deringer
- Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, United Kingdom
| | - Albert P. Bartók
- Department of Physics and Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
| | - Noam Bernstein
- Center for Computational Materials Science, U.S. Naval Research Laboratory, Washington D.C. 20375, United States
| | - David M. Wilkins
- Atomistic Simulation Centre, School of Mathematics and Physics, Queen’s University Belfast, Belfast BT7 1NN, Northern Ireland, United Kingdom
| | - Michele Ceriotti
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, Lausanne 1015, Switzerland
- National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland
| | - Gábor Csányi
- Engineering Laboratory, University of Cambridge, Cambridge CB2 1PZ, United Kingdom
| |
|
29
|
Young TA, Johnston-Wood T, Deringer VL, Duarte F. A transferable active-learning strategy for reactive molecular force fields. Chem Sci 2021; 12:10944-10955. [PMID: 34476072 PMCID: PMC8372546 DOI: 10.1039/d1sc01825f] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 03/31/2021] [Accepted: 07/04/2021] [Indexed: 11/25/2022]
Abstract
Predictive molecular simulations require fast, accurate and reactive interatomic potentials. Machine learning offers a promising approach to construct such potentials by fitting energies and forces to high-level quantum-mechanical data, but doing so typically requires considerable human intervention and data volume. Here we show that, by leveraging hierarchical and active learning, accurate Gaussian Approximation Potential (GAP) models can be developed for diverse chemical systems in an autonomous manner, requiring only hundreds to a few thousand energy and gradient evaluations on a reference potential-energy surface. The approach uses separate intra- and inter-molecular fits and employs a prospective error metric to assess the accuracy of the potentials. We demonstrate applications to a range of molecular systems with relevance to computational organic chemistry: ranging from bulk solvents, a solvated metal ion and a metallocage onwards to chemical reactivity, including a bifurcating Diels-Alder reaction in the gas phase and non-equilibrium dynamics (a model SN2 reaction) in explicit solvent. The method provides a route to routinely generating machine-learned force fields for reactive molecular systems.
Affiliation(s)
- Tom A Young
- Chemistry Research Laboratory, University of Oxford, Mansfield Road, Oxford OX1 3TA, UK
- Tristan Johnston-Wood
- Chemistry Research Laboratory, University of Oxford, Mansfield Road, Oxford OX1 3TA, UK
- Volker L Deringer
- Department of Chemistry, Inorganic Chemistry Laboratory, University of Oxford, Oxford OX1 3QR, UK
- Fernanda Duarte
- Chemistry Research Laboratory, University of Oxford, Mansfield Road, Oxford OX1 3TA, UK
|