1
Butler PV, Hafizi R, Day GM. Machine-Learned Potentials by Active Learning from Organic Crystal Structure Prediction Landscapes. J Phys Chem A 2024; 128:945-957. [PMID: 38277275 PMCID: PMC10860135 DOI: 10.1021/acs.jpca.3c07129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Revised: 01/04/2024] [Accepted: 01/11/2024] [Indexed: 01/28/2024]
Abstract
A primary challenge in organic molecular crystal structure prediction (CSP) is accurately ranking the energies of potential structures. While high-level solid-state density functional theory (DFT) methods allow for mostly reliable discrimination of the low-energy structures, their high computational cost is problematic because of the need to evaluate tens to hundreds of thousands of trial crystal structures to fully explore typical crystal energy landscapes. Consequently, lower-cost but less accurate empirical force fields are often used, sometimes as the first stage of a hierarchical scheme involving multiple stages of increasingly accurate energy calculations. Machine-learned interatomic potentials (MLIPs), trained to reproduce the results of ab initio methods with computational costs close to those of force fields, can improve the efficiency of the CSP by reducing or eliminating the need for costly DFT calculations. Here, we investigate active learning methods for training MLIPs with CSP datasets. The combination of active learning with the well-developed sampling methods from CSP yields potentials in a highly automated workflow that are relevant over a wide range of the crystal packing space. To demonstrate these potentials, we illustrate efficiently reranking large, diverse crystal structure landscapes to near-DFT accuracy from force field-based CSP, improving the reliability of the final energy ranking. Furthermore, we demonstrate how these potentials can be extended to more accurately model structures far from lattice energy minima through additional on-the-fly training within Monte Carlo simulations.
Affiliation(s)
- Roohollah Hafizi
- School of Chemistry, University of Southampton, Southampton SO17 1BJ, U.K.
- Graeme M. Day
- School of Chemistry, University of Southampton, Southampton SO17 1BJ, U.K.
2
Egorova O, Hafizi R, Woods DC, Day GM. Multifidelity Statistical Machine Learning for Molecular Crystal Structure Prediction. J Phys Chem A 2020; 124:8065-8078. [PMID: 32881496 DOI: 10.1021/acs.jpca.0c05006] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Indexed: 11/29/2022]
Abstract
The prediction of crystal structures from first-principles requires highly accurate energies for large numbers of putative crystal structures. High accuracy of solid state density functional theory (DFT) calculations is often required, but hundreds or more structures can be present in the low energy region of interest, so that the associated computational costs are prohibitive. Here, we apply statistical machine learning to predict expensive hybrid functional DFT (PBE0) calculations using a multifidelity approach to re-evaluate the energies of crystal structures predicted with an inexpensive force field. The method uses an autoregressive Gaussian process, making use of less expensive GGA DFT (PBE) calculations to bridge the gap between the force field and PBE0 energies. The method is benchmarked on the crystal structure landscapes of three small, hydrogen-bonded organic molecules and shown to produce accurate predictions of energies and crystal structure ranking using small numbers of the most expensive calculations; the PBE0 energies can be predicted with errors of less than 1 kJ mol⁻¹ with between 4.2 and 6.8% of the cost of the full calculations. As the model that we have developed is probabilistic, we discuss how the uncertainties in predicted energies impact the assessment of the energetic ranking of crystal structures.
Affiliation(s)
- Olga Egorova
- Statistical Sciences Research Institute, University of Southampton, Southampton, SO17 1BJ, U.K
- Roohollah Hafizi
- Computational Systems Chemistry, School of Chemistry, University of Southampton, Southampton, SO17 1BJ, U.K
- David C Woods
- Statistical Sciences Research Institute, University of Southampton, Southampton, SO17 1BJ, U.K
- Graeme M Day
- Computational Systems Chemistry, School of Chemistry, University of Southampton, Southampton, SO17 1BJ, U.K
3
McDonagh D, Skylaris CK, Day GM. Machine-Learned Fragment-Based Energies for Crystal Structure Prediction. J Chem Theory Comput 2019; 15:2743-2758. [PMID: 30817152 DOI: 10.1021/acs.jctc.9b00038] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 01/08/2023]
Abstract
Crystal structure prediction involves a search of a complex configurational space for local minima corresponding to stable crystal structures, which can be performed efficiently using atom-atom force fields for the assessment of intermolecular interactions. However, for challenging systems, the limitations in the accuracy of force fields prevent a reliable assessment of the relative thermodynamic stability of potential structures, while the cost of fully quantum mechanical approaches can limit applications of the methods. We present a method to rapidly improve force field lattice energies by correcting two-body interactions with a higher level of theory in a fragment-based approach and predicting these corrections with machine learning. Corrected lattice energies with commonly used density functionals and second order perturbation theory (MP2) all significantly improve the ranking of experimentally known polymorphs where the rigid molecule model is applicable. The relative lattice energies of known polymorphs are also found to systematically improve with the fragment corrections. Predicting two-body interactions with atom-centered symmetry functions in a Gaussian process is found to give highly accurate results using as little as 10-20% of the data for training, reducing the cost of the energy correction by up to an order of magnitude. The machine learning approach opens up the possibility of more widespread use of fragment-based methods in crystal structure prediction, whose increased accuracy at a low computational cost will benefit applications in areas such as polymorph screening and computer-guided materials discovery.
Affiliation(s)
- David McDonagh
- School of Chemistry, University of Southampton, Highfield, Southampton, SO17 1BJ, United Kingdom
- Chris-Kriton Skylaris
- School of Chemistry, University of Southampton, Highfield, Southampton, SO17 1BJ, United Kingdom
- Graeme M Day
- School of Chemistry, University of Southampton, Highfield, Southampton, SO17 1BJ, United Kingdom
4
Aguiar DLMD, San Gil RADS, Alencastro RBD, Souza EFD, Borré LB, Vaiss VDS, Leitão AA. 6-Aminopenicillanic acid revisited: A combined solid state NMR and in silico refinement. Chem Phys Lett 2016. [DOI: 10.1016/j.cplett.2016.08.030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
5
Takahashi H, Umino S, Morita A. Construction of exchange repulsion in terms of the wave functions at QM/MM boundary region. J Chem Phys 2015; 143:084104. [DOI: 10.1063/1.4928762] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/14/2022]
6
Okoth MO, Vrcelj RM, Sheen DB, Sherwood JN. Hydration studies of a simple molecular solid. CrystEngComm 2012. [DOI: 10.1039/c2ce06301h] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/21/2022]
7
8
Blair SA, Thakkar AJ. How many intramolecular hydrogen bonds does the oxalic acid dimer have? Chem Phys Lett 2010. [DOI: 10.1016/j.cplett.2010.07.019] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 10/19/2022]
9
King MD, Korter TM. Effect of Waters of Crystallization on Terahertz Spectra: Anhydrous Oxalic Acid and Its Dihydrate. J Phys Chem A 2010; 114:7127-38. [DOI: 10.1021/jp101935n] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.0] [Indexed: 11/29/2022]
Affiliation(s)
- Matthew D. King
- Department of Chemistry, Syracuse University, 1-014 Center for Science and Technology Syracuse, New York, 13244-4100
- Timothy M. Korter
- Department of Chemistry, Syracuse University, 1-014 Center for Science and Technology Syracuse, New York, 13244-4100
10
Totton TS, Misquitta AJ, Kraft M. A First Principles Development of a General Anisotropic Potential for Polycyclic Aromatic Hydrocarbons. J Chem Theory Comput 2010; 6:683-95. [PMID: 26613299 DOI: 10.1021/ct9004883] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.1] [Indexed: 11/29/2022]
Abstract
Standard empirical atom-atom potentials are shown to be unable to describe the binding of polycyclic aromatic hydrocarbon (PAH) molecules in the variety of configurations seen in clusters. The main reason for this inadequacy is the lack of anisotropy in these potentials. We have constructed an anisotropic atom-atom intermolecular potential for the benzene molecule from first principles using a symmetry-adapted perturbation theory based on density functional theory (SAPT(DFT)), interaction energy calculations and the Williams-Stone-Misquitta method for obtaining molecular properties in distributed form. Using this potential as a starting point, we have constructed a transferable anisotropic potential to model intermolecular interactions between PAHs. This new potential has been shown to accurately model interaction energies for a variety of dimer configurations for four different PAH molecules, including certain configurations which are poorly modeled with current isotropic potentials. It is intended that this potential will form the basis for further work on the aggregation of PAHs.
Affiliation(s)
- Tim S Totton
- Department of Chemical Engineering and Biotechnology, University of Cambridge, New Museums Site, Pembroke Street, Cambridge CB2 3RA, United Kingdom, and Department of Physics, Cavendish Laboratory, University of Cambridge, J J Thomson Avenue, Cambridge, CB3 0HE, United Kingdom
- Alston J Misquitta
- Department of Chemical Engineering and Biotechnology, University of Cambridge, New Museums Site, Pembroke Street, Cambridge CB2 3RA, United Kingdom, and Department of Physics, Cavendish Laboratory, University of Cambridge, J J Thomson Avenue, Cambridge, CB3 0HE, United Kingdom
- Markus Kraft
- Department of Chemical Engineering and Biotechnology, University of Cambridge, New Museums Site, Pembroke Street, Cambridge CB2 3RA, United Kingdom, and Department of Physics, Cavendish Laboratory, University of Cambridge, J J Thomson Avenue, Cambridge, CB3 0HE, United Kingdom
11
12
Misquitta AJ, Welch GW, Stone AJ, Price SL. A first principles prediction of the crystal structure of C6Br2ClFH2. Chem Phys Lett 2008. [DOI: 10.1016/j.cplett.2008.02.113] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.6] [Indexed: 11/29/2022]
13
14
Das D, Banerjee R, Mondal R, Howard JAK, Boese R, Desiraju GR. Synthon evolution and unit cell evolution during crystallisation. A study of symmetry-independent molecules (Z' > 1) in crystals of some hydroxy compounds. Chem Commun (Camb) 2005:555-7. [PMID: 16432581 DOI: 10.1039/b514076e] [Citation(s) in RCA: 97] [Impact Index Per Article: 5.1] [Indexed: 11/21/2022]
Abstract
A kinetically favoured crystal, with many molecules in the asymmetric unit, may be a fossil relic of the crystal nucleus of a more stable polymorph.
Affiliation(s)
- Dinabandhu Das
- School of Chemistry, University of Hyderabad, Hyderabad, 500 046, India
15
Modelling Intermolecular Forces for Organic Crystal Structure Prediction. Structure and Bonding 2005. [DOI: 10.1007/b135616] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 03/07/2023]
16
Electronic Density Approaches to the Energetics of Noncovalent Interactions. Int J Mol Sci 2004. [DOI: 10.3390/i5040130] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/17/2022]
17
Ma Y, Politzer P. Calculation of electrostatic and polarization energies from electron densities. J Chem Phys 2004; 120:3152-7. [PMID: 15268467 DOI: 10.1063/1.1640991] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022]
Abstract
We investigate procedures for calculating the electrostatic and polarization energies, Ees and Epol, associated with noncovalent interactions. The starting points are the electron densities of the isolated components and the complex; these could be obtained either computationally or experimentally. A slightly modified version of a scheme proposed by Gavezzotti is used to carry out numerical integrations over these electron densities. Our approach to estimating Epol is based upon partitioning the charge distributions of the components into overlapping and nonoverlapping regions. The effects of varying the integration parameters, computational techniques and basis sets are examined in detail for several noncovalently bound molecular dimers. Our results are in good agreement with the values of Ees and Epol produced by other methods, which require analytical integrations over interaction Hamiltonian matrix elements.
Affiliation(s)
- Yuguang Ma
- Department of Chemistry, University of New Orleans, New Orleans, Louisiana 70148, USA
18
Day GM, Price SL. A Nonempirical Anisotropic Atom−Atom Model Potential for Chlorobenzene Crystals. J Am Chem Soc 2003; 125:16434-43. [PMID: 14692787 DOI: 10.1021/ja0383625] [Citation(s) in RCA: 82] [Impact Index Per Article: 3.9] [Indexed: 11/30/2022]
Abstract
A nearly nonempirical, transferable model potential is developed for the chlorobenzene molecules (C₆ClₙH₆₋ₙ, n = 1 to 6) with anisotropy in the atom-atom form of both electrostatic and repulsion interactions. The potential is largely derived from the charge densities of the molecules, using a distributed multipole electrostatic model and a transferable dispersion model derived from the molecular polarizabilities. A nonempirical transferable repulsion model is obtained by analyzing the overlap of the charge densities in dimers as a function of orientation and separation and then calibrating this anisotropic atom-atom model against a limited number of intermolecular perturbation theory calculations of the short-range energies. The resulting model potential is a significant improvement over empirical model potentials in reproducing the twelve chlorobenzene crystal structures. Further validation calculations of the lattice energies and rigid-body k = 0 phonon frequencies provide satisfactory agreement with experiment, with the discrepancies being primarily due to approximations in the theoretical methods rather than the model intermolecular potential. The potential is able to give a good account of the three polymorphs of p-dichlorobenzene in a detailed crystal structure prediction study. Thus, by introducing repulsion anisotropy into a transferable potential scheme, it is possible to produce a set of potentials for the chlorobenzenes that can account for their crystal properties in an unprecedentedly realistic fashion.
Affiliation(s)
- Graeme M Day
- Department of Chemistry, University College London, 20 Gordon Street, London, WC1H 0AJ, United Kingdom
19
Ikeda T, Nagayoshi K, Kitaura K. Ab initio MO based lattice energy for molecular crystals: packing structure of electron donor–acceptor (EDA) complex H3N–BF3. Chem Phys Lett 2003. [DOI: 10.1016/s0009-2614(03)00081-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 11/25/2022]
20
Gavezzotti A. Calculation of Intermolecular Interaction Energies by Direct Numerical Integration over Electron Densities. I. Electrostatic and Polarization Energies in Molecular Crystals. J Phys Chem B 2002. [DOI: 10.1021/jp0144202] [Citation(s) in RCA: 244] [Impact Index Per Article: 11.1] [Indexed: 11/30/2022]
Affiliation(s)
- A. Gavezzotti
- Dipartimento di Chimica Strutturale e Stereochimica inorganica, Università di Milano, via Venezian 21, 20133 Milano, Italy
21
Van Eijck BP. Crystal structure predictions using five space groups with two independent molecules. The case of small organic acids. J Comput Chem 2002; 23:456-62. [PMID: 11908081 DOI: 10.1002/jcc.10042] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.6] [Indexed: 11/05/2022]
Abstract
Crystal structure generations with two independent molecules have been performed for a series of carboxylic acids, using a slightly modified version of the OPLS force field. It was found that in this way the experimental structures with one independent molecule were produced as special cases, except for the molecules with four or more internal degrees of freedom. This work shows that a search with two independent molecules in only five space groups, although costly in computer power, can automatically also find structures with one independent molecule in many supergroups. Considering the observed abundances of structural classes, such a search should cover more than 95% of the possible homomolecular crystal structures.
Affiliation(s)
- Bouke P Van Eijck
- Department of Crystal and Structural Chemistry, Bijvoet Center for Biomolecular Research, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands.
22
23
Mitchell JBO, Price SL, Leslie M, Buttar D, Roberts RJ. Anisotropic Repulsion Potentials for Cyanuric Chloride (C3N3Cl3) and Their Application to Modeling the Crystal Structures of Azaaromatic Chlorides. J Phys Chem A 2001. [DOI: 10.1021/jp0125350] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Affiliation(s)
- John B. O. Mitchell
- Centre for Theoretical and Computational Chemistry, Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, U.K
- Sarah L. Price
- Centre for Theoretical and Computational Chemistry, Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, U.K
- David Buttar
- AstraZeneca, Mereside, Alderley Park, Macclesfield, Cheshire SK10 4TG, U.K
- Ron J. Roberts
- AstraZeneca, Silk Road Business Park, Charter Way, Macclesfield, Cheshire SK10 2NA, U.K
24
Popelier PLA, Joubert L, Kosov DS. Convergence of the Electrostatic Interaction Based on Topological Atoms. J Phys Chem A 2001. [DOI: 10.1021/jp011511q] [Citation(s) in RCA: 113] [Impact Index Per Article: 4.9] [Indexed: 11/29/2022]
Affiliation(s)
- L. Joubert
- Department of Chemistry, UMIST, Manchester, M60 1QD, England
- D. S. Kosov
- Department of Chemistry, UMIST, Manchester, M60 1QD, England
25
Mitchell JBO, Price SL. A Systematic Nonempirical Method of Deriving Model Intermolecular Potentials for Organic Molecules: Application To Amides. J Phys Chem A 2000. [DOI: 10.1021/jp002400e] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.4] [Indexed: 11/28/2022]
Affiliation(s)
- John B. O. Mitchell
- Centre for Theoretical and Computational Chemistry, Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, U.K
- Sarah L. Price
- Centre for Theoretical and Computational Chemistry, Department of Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, U.K
26
Mooij WTM, van Eijck BP, Kroon J. Ab Initio Crystal Structure Predictions for Flexible Hydrogen-Bonded Molecules. J Am Chem Soc 2000. [DOI: 10.1021/ja993945t] [Citation(s) in RCA: 46] [Impact Index Per Article: 1.9] [Indexed: 11/27/2022]
Affiliation(s)
- Wijnand T. M. Mooij
- Department of Crystal and Structural Chemistry, Bijvoet Center for Biomolecular Research, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Bouke P. van Eijck
- Department of Crystal and Structural Chemistry, Bijvoet Center for Biomolecular Research, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Jan Kroon
- Department of Crystal and Structural Chemistry, Bijvoet Center for Biomolecular Research, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
27
Beyer T, Price SL. Dimer or Catemer? Low-Energy Crystal Packings for Small Carboxylic Acids. J Phys Chem B 2000. [DOI: 10.1021/jp9941413] [Citation(s) in RCA: 129] [Impact Index Per Article: 5.4] [Indexed: 11/28/2022]
Affiliation(s)
- Theresa Beyer
- Centre of Theoretical and Computational Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, U.K
- Sarah L. Price
- Centre of Theoretical and Computational Chemistry, University College London, 20 Gordon Street, London WC1H 0AJ, U.K