1. Houston PL, Qu C, Yu Q, Conte R, Nandi A, Li JK, Bowman JM. PESPIP: Software to fit complex molecular and many-body potential energy surfaces with permutationally invariant polynomials. J Chem Phys 2023; 158:044109. [PMID: 36725524 DOI: 10.1063/5.0134442] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Indexed: 01/05/2023]
Abstract
We wish to describe a potential energy surface by using a basis of permutationally invariant polynomials whose coefficients will be determined by numerical regression so as to smoothly fit a dataset of electronic energies as well as, perhaps, gradients. The polynomials will be powers of transformed internuclear distances, usually either Morse variables, exp(-r_{i,j}/λ), where λ is a constant range hyperparameter, or reciprocals of the distances, 1/r_{i,j}. The question we address is how to create the most efficient basis, including (a) which polynomials to keep or discard, (b) how many polynomials will be needed, (c) how to make sure the polynomials correctly reproduce the zero interaction at a large distance, (d) how to ensure special symmetries, and (e) how to calculate gradients efficiently. This article discusses how these questions can be answered by using a set of programs to choose and manipulate the polynomials as well as to write efficient Fortran programs for the calculation of energies and gradients. A user-friendly interface for access to monomial symmetrization approach results is also described. The software for these programs is now publicly available.
Affiliation(s)
- Paul L Houston
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, USA and Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
- Chen Qu
  - Independent Researcher, Toronto, Ontario M9B0E3, Canada
- Qi Yu
  - Department of Chemistry, Yale University, New Haven, Connecticut 06520, USA
- Riccardo Conte
  - Dipartimento di Chimica, Università Degli Studi di Milano, Via Golgi 19, 20133 Milano, Italy
- Apurba Nandi
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
- Jeffrey K Li
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
- Joel M Bowman
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA

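As an illustration of the basis described in the abstract of entry 1, the following is a minimal Python sketch (not the PESPIP software, which generates Fortran) of Morse variables and of one monomial made permutationally invariant by averaging over permutations of like atoms. The function names, the default `lam=2.0`, and the water-like test geometry are assumptions made for this example only.

```python
import itertools
import numpy as np

def morse_variables(xyz, lam=2.0):
    """Return y_ij = exp(-r_ij / lam) for all atom pairs i < j.

    lam is the range hyperparameter; 2.0 is an assumed illustrative value.
    """
    xyz = np.asarray(xyz, dtype=float)
    pairs = itertools.combinations(range(len(xyz)), 2)
    r = np.array([np.linalg.norm(xyz[i] - xyz[j]) for i, j in pairs])
    return np.exp(-r / lam)

def symmetrized_monomial(xyz, exponents, like_atoms, lam=2.0):
    """Average the monomial prod_ij y_ij**e_ij over permutations of like atoms."""
    xyz = np.asarray(xyz, dtype=float)
    exponents = np.asarray(exponents)
    like = list(like_atoms)
    perms = list(itertools.permutations(like))
    total = 0.0
    for perm in perms:
        permuted = xyz.copy()
        permuted[like] = xyz[list(perm)]      # relabel the identical atoms
        total += np.prod(morse_variables(permuted, lam) ** exponents)
    return total / len(perms)

# Hypothetical A2B geometry (atom 0 = O, atoms 1 and 2 = H), arbitrary units.
geometry = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.3, 1.1, 0.0]])
swapped = geometry[[0, 2, 1]]                 # exchange the two H atoms
exps = (2, 0, 1)                              # exponents for pairs (0,1), (0,2), (1,2)
```

Evaluating `symmetrized_monomial` on `geometry` and on `swapped` gives the same value, whereas the raw monomial does not; a fitted surface would be a linear combination of many such invariant terms.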
2. Bowman JM, Qu C, Conte R, Nandi A, Houston PL, Yu Q. Δ-Machine Learned Potential Energy Surfaces and Force Fields. J Chem Theory Comput 2023; 19:1-17. [PMID: 36527383 DOI: 10.1021/acs.jctc.2c01034] [Citation(s) in RCA: 25] [Impact Index Per Article: 25.0] [Indexed: 12/23/2022]
Abstract
There has been great progress in developing machine-learned potential energy surfaces (PESs) for molecules and clusters with more than 10 atoms. Unfortunately, this number of atoms generally limits the level of electronic structure theory to less than the "gold standard" CCSD(T) level. Indeed, for the well-known MD17 dataset for molecules with 9-20 atoms, all of the energies and forces were obtained with DFT calculations (PBE). This Perspective is focused on a Δ-machine learning method that we recently proposed and applied to bring DFT-based PESs close to CCSD(T) accuracy. This is demonstrated for hydronium, N-methylacetamide, acetyl acetone, and ethanol. For 15-atom tropolone, it appears that special approaches (e.g., molecular tailoring, local CCSD(T)) are needed to obtain the CCSD(T) energies. A new aspect of this approach is the extension of Δ-machine learning to force fields. The approach is based on many-body corrections to polarizable force field potentials. This is examined in detail using the TTM2.1 water potential. The corrections make use of our recent CCSD(T) datasets for 2-b, 3-b, and 4-b interactions for water. These datasets were used to develop a new fully ab initio potential for water, termed q-AQUA.
Affiliation(s)
- Joel M Bowman
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
- Chen Qu
  - Independent Researcher, Toronto, Canada 66777
- Riccardo Conte
  - Dipartimento di Chimica, Università Degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
- Apurba Nandi
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
- Paul L Houston
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
  - Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Qi Yu
  - Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States

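The Δ-machine learning idea in the abstract of entry 2 amounts to adding a fitted correction to a low-level surface, V_corrected = V_low + ΔV, where ΔV is trained on the difference between high-level and low-level energies. The sketch below is a one-dimensional toy, not the authors' implementation: the two Morse curves standing in for DFT and CCSD(T), their parameters, and the degree-4 polynomial used for ΔV are all assumptions chosen for illustration.

```python
import numpy as np

def morse(r, depth, a, r_eq):
    """Morse potential, used here only as a cheap stand-in for ab initio data."""
    return depth * (1.0 - np.exp(-a * (r - r_eq))) ** 2

def v_low(r):
    """Hypothetical low-level (DFT-like) curve."""
    return morse(r, 0.17, 1.00, 1.00)

def v_high(r):
    """Hypothetical high-level (CCSD(T)-like) curve."""
    return morse(r, 0.18, 1.05, 0.98)

# Only a few "expensive" high-level points are used to train the correction.
r_train = np.linspace(0.7, 2.0, 8)
delta_coeffs = np.polyfit(r_train, v_high(r_train) - v_low(r_train), deg=4)

def v_corrected(r):
    """Low-level curve plus the fitted difference correction."""
    return v_low(r) + np.polyval(delta_coeffs, r)

# Compare errors against the high-level curve on a denser grid.
r_test = np.linspace(0.7, 2.0, 50)
err_low = np.max(np.abs(v_low(r_test) - v_high(r_test)))
err_corrected = np.max(np.abs(v_corrected(r_test) - v_high(r_test)))
```

In this toy, `err_corrected` comes out well below `err_low` even though only eight high-level points were used, because the difference between the two curves is smoother than either curve itself.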
3. Bowman JM, Qu C, Conte R, Nandi A, Houston PL, Yu Q. The MD17 datasets from the perspective of datasets for gas-phase “small” molecule potentials. J Chem Phys 2022; 156:240901. [DOI: 10.1063/5.0089200] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Abstract
There has been great progress in developing methods for machine-learned potential energy surfaces. There have also been important assessments of these methods by comparing so-called learning curves on datasets of electronic energies and forces, notably the MD17 database. The dataset for each molecule in this database generally consists of tens of thousands of energies and forces obtained from DFT direct dynamics at 500 K. We contrast the datasets from this database for three “small” molecules, ethanol, malonaldehyde, and glycine, with datasets we have generated with specific targets for the potential energy surfaces (PESs) in mind: a rigorous calculation of the zero-point energy and wavefunction, the tunneling splitting in malonaldehyde, and, in the case of glycine, a description of all eight low-lying conformers. We found that the MD17 datasets are too limited for these targets. We also examine recent datasets for several PESs that describe small-molecule but complex chemical reactions. Finally, we introduce a new database, “QM-22,” which contains datasets of molecules ranging from 4 to 15 atoms that extend to high energies and a large span of configurations.
Affiliation(s)
- Joel M. Bowman
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
- Chen Qu
  - Independent Researcher, Toronto, Canada
- Riccardo Conte
  - Dipartimento di Chimica, Università Degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
- Apurba Nandi
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
- Paul L. Houston
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, USA
  - Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
- Qi Yu
  - Department of Chemistry, Yale University, New Haven, Connecticut 06520, USA

4. Houston PL, Qu C, Nandi A, Conte R, Yu Q, Bowman JM. Permutationally invariant polynomial regression for energies and gradients, using reverse differentiation, achieves orders of magnitude speed-up with high precision compared to other machine learning methods. J Chem Phys 2022; 156:044120. [DOI: 10.1063/5.0080506] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 12/28/2022]
Affiliation(s)
- Paul L. Houston
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, USA and Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, USA
- Chen Qu
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland 20742, USA
- Apurba Nandi
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA
- Riccardo Conte
  - Dipartimento di Chimica, Università Degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
- Qi Yu
  - Department of Chemistry, Yale University, New Haven, Connecticut 06511, USA
- Joel M. Bowman
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, USA

5. Nandi A, Qu C, Houston PL, Conte R, Yu Q, Bowman JM. A CCSD(T)-Based 4-Body Potential for Water. J Phys Chem Lett 2021; 12:10318-10324. [PMID: 34662138 DOI: 10.1021/acs.jpclett.1c03152] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Indexed: 06/13/2023]
Abstract
High-level, ab initio calculations find that the 4-body (4-b) interaction is needed to account for near-100% of the total interaction energy for water clusters as large as the 21-mer. Motivated by this, we report a permutationally invariant polynomial potential energy surface (PES) for the 4-body interaction. This machine-learned PES is a fit to 2119 symmetry-unique, CCSD(T)-F12a/haTZ 4-b interaction energies. Configurations for these come from tetramer direct-dynamics calculations, fragments from an MD water simulation at 300 K, and tetramer fragments in a variety of water clusters. The PIP basis is purified to ensure that the PES goes rigorously to zero in monomer+trimer and dimer+dimer dissociations. The 4-b energies of isomers of the hexamer calculated with the new PES are shown to be in better agreement with benchmark CCSD(T) results than those from the MB-pol potential. Tests on larger clusters further validate the high fidelity of the PES. The PES is shown to be fast to evaluate, taking 2.4 s for 10^5 evaluations on a single core of a 2.4 GHz Intel Xeon processor, and significantly faster using a parallel version of the PES.
Affiliation(s)
- Apurba Nandi
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States
- Chen Qu
  - Department of Chemistry & Biochemistry, University of Maryland, College Park, Maryland 20742, United States
- Paul L Houston
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, United States
  - Department of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Riccardo Conte
  - Dipartimento di Chimica, Università Degli Studi di Milano, via Golgi 19, 20133 Milano, Italy
- Qi Yu
  - Department of Chemistry, Yale University, New Haven, Connecticut 06520, United States
- Joel M Bowman
  - Department of Chemistry and Cherry L. Emerson Center for Scientific Computation, Emory University, Atlanta, Georgia 30322, United States

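The 4-body interaction energy fitted in entry 5 is, in the standard many-body expansion, what remains of a tetramer's energy after all monomer, dimer, and trimer contributions are removed. A minimal sketch of that inclusion-exclusion sum follows; the function names and the toy energy model are hypothetical and stand in for real electronic-structure calculations on each fragment.

```python
from itertools import combinations

def n_body_interaction(energy, monomers):
    """n-body interaction of n monomers by inclusion-exclusion.

    energy(subset) must return the total energy of the sub-cluster made of
    the monomers in `subset`; each subset of size k enters with sign
    (-1)**(n - k).
    """
    n = len(monomers)
    total = 0.0
    for k in range(1, n + 1):
        sign = (-1) ** (n - k)
        for subset in combinations(monomers, k):
            total += sign * energy(subset)
    return total

def toy_energy(subset):
    """Hypothetical cluster energy: 1-body + 2-body terms + a pure 4-body term."""
    one_body = sum(0.1 * m for m in subset)
    two_body = sum(0.01 * a * b for a, b in combinations(subset, 2))
    four_body = 0.005 if len(subset) == 4 else 0.0
    return one_body + two_body + four_body

four_b = n_body_interaction(toy_energy, (1, 2, 3, 4))
```

For this toy model the 1-body and 2-body pieces cancel and `four_b` recovers the 0.005 term (up to floating-point rounding). For a real tetramer each call to `energy` would be a separate high-level calculation on a monomer, dimer, trimer, or the full tetramer.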