1
Agbaglo DA, Summers TJ, Cheng Q, DeYonker NJ. The influence of model building schemes and molecular dynamics sampling on QM-cluster models: the chorismate mutase case study. Phys Chem Chem Phys 2024; 26:12467-12482. [PMID: 38618904 PMCID: PMC11090134 DOI: 10.1039/d3cp06100k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2024]
Abstract
Most QM-cluster models of enzymes are constructed based on X-ray crystal structures, which limits comparison to in vivo structure and mechanism. The active site of chorismate mutase from Bacillus subtilis and the enzymatic transformation of chorismate to prephenate are used as a case study to guide construction of QM-cluster models built first from the X-ray crystal structure, then from molecular dynamics (MD) simulation snapshots. The Residue Interaction Network ResidUe Selector (RINRUS) software toolkit, developed by our group to simplify and automate the construction of QM-cluster models, is expanded to handle MD to QM-cluster model workflows. Several options, some employing novel topological clustering from residue interaction network (RIN) information, are evaluated for generating conformational clustering from MD simulation. RINRUS then generates a statistical thermodynamic framework for QM-cluster modeling of the chorismate mutase mechanism via refining 250 MD frames with density functional theory (DFT). The 250 QM-cluster models sampled provide a mean ΔG‡ of 10.3 ± 2.6 kcal mol⁻¹ compared to the experimental value of 15.4 kcal mol⁻¹ at 25 °C. While the difference between theory and experiment is consequential, the level of theory used is modest and therefore "chemical" accuracy is unexpected. More important are the comparisons made between QM-cluster models designed from the X-ray crystal structure versus those from MD frames. The large variations in kinetic and thermodynamic properties arise from geometric changes in the ensemble of QM-cluster models, rather than from the composition of the QM-cluster models or from the active site-solvent interface. The findings open the way for further quantitative and reproducible calibration in the field of computational enzymology using the model construction framework afforded with the RINRUS software toolkit.
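The size of the gap between the computed and experimental barriers quoted above can be made concrete with the Eyring equation, k = (k_B T / h) exp(-ΔG‡ / RT). The following is a minimal sketch, assuming a transmission coefficient of 1 and T = 298.15 K; the function name `eyring_rate` is hypothetical and is not part of RINRUS.

```python
import math

KB = 1.380649e-23         # Boltzmann constant, J/K
H = 6.62607015e-34        # Planck constant, J*s
R_KCAL = 1.987204259e-3   # gas constant, kcal/(mol*K)


def eyring_rate(dg_kcal_mol, temperature_k=298.15):
    """First-order rate constant (1/s) from the Eyring equation.

    Assumes a transmission coefficient of 1 (an assumption of this sketch).
    """
    prefactor = KB * temperature_k / H
    return prefactor * math.exp(-dg_kcal_mol / (R_KCAL * temperature_k))


# Barriers quoted in the abstract (kcal/mol).
k_model = eyring_rate(10.3)   # mean barrier of the 250 QM-cluster models
k_expt = eyring_rate(15.4)    # experimental barrier at 25 degrees C
ratio = k_model / k_expt

print(f"k(model) = {k_model:.3e} 1/s")
print(f"k(expt)  = {k_expt:.3e} 1/s")
print(f"ratio    = {ratio:.3e}")
```

Because the rate depends exponentially on the barrier, a difference of about 5 kcal/mol at room temperature corresponds to several orders of magnitude in the predicted rate constant.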
Affiliation(s)
- Donatus A Agbaglo
  - Department of Chemistry, University of Memphis, Memphis, TN 38152, USA.
- Thomas J Summers
  - Department of Chemistry, University of Memphis, Memphis, TN 38152, USA.
- Qianyi Cheng
  - Department of Chemistry, University of Memphis, Memphis, TN 38152, USA.
- Nathan J DeYonker
  - Department of Chemistry, University of Memphis, Memphis, TN 38152, USA.
2
Manikandan M, Nicolini P, Hapala P. Computational Design of Photosensitive Polymer Templates To Drive Molecular Nanofabrication. ACS Nano 2024; 18:9969-9979. [PMID: 38545921 PMCID: PMC11008366 DOI: 10.1021/acsnano.3c10575] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/27/2023] [Revised: 03/15/2024] [Accepted: 03/21/2024] [Indexed: 04/10/2024]
Abstract
Molecular electronics promises the ultimate level of miniaturization of computers and other machines, as organic molecules are the smallest known physical objects with nontrivial structure and function. But despite the plethora of molecular switches, memories, and motors developed during the almost 50-year-long history of molecular electronics, mass production of molecular computers is still an elusive goal. This is mostly due to the lack of scalable nanofabrication methods capable of rapidly producing complex structures (similar to silicon chips or living cells) with atomic precision and a small number of defects. Living nature solves this problem by using linear polymer templates that encode large volumes of structural information into sequences of hydrogen-bonded end groups, which can be efficiently replicated and which can drive assembly of other molecular components into complex supramolecular structures. In this paper, we propose a nanofabrication method based on a class of photosensitive polymers inspired by these natural principles, which can operate in concert with the UV photolithography used for fabrication of current microelectronic processors. We believe that such a method will enable a smooth transition from silicon toward molecular nanoelectronics and photonics. To demonstrate its feasibility, we performed a computational screening of candidate molecules that can selectively bind and therefore allow the deterministic assembly of molecular components. In the process, we unearthed trends and design principles applicable beyond the immediate scope of our proposed nanofabrication method, e.g., to biologically relevant DNA analogues and molecular recognition within hydrogen-bonded systems.
Affiliation(s)
- Mithun Manikandan
  - Institute of Physics (FZU), Czech Academy of Sciences, Na Slovance 2, 182 00 Prague, Czech Republic
- Paolo Nicolini
  - Institute of Physics (FZU), Czech Academy of Sciences, Na Slovance 2, 182 00 Prague, Czech Republic
- Prokop Hapala
  - Institute of Physics (FZU), Czech Academy of Sciences, Na Slovance 2, 182 00 Prague, Czech Republic
3
Xu P, Leonard SL, O'Brien W, Gordon MS. R⁻⁸ Dispersion Interaction: Derivation and Application to the Effective Fragment Potential Method. J Phys Chem A 2024; 128:292-327. [PMID: 38150458 DOI: 10.1021/acs.jpca.3c05115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/29/2023]
Abstract
The anisotropic and isotropic R⁻⁸ dispersion contributions (disp8) are derived and implemented within the framework of the effective fragment potential (EFP) method formulated with imaginary frequency-dependent Cartesian polarizability tensors distributed at the centroids of the localized molecular orbitals (LMOs). Two forms of damping functions, intermolecular overlap-based and Tang-Toennies, are extended for disp8. To obtain LMO polarizability tensors centered at LMO centroids, an origin-shifting transformation is derived and implemented for the dipole-octopole polarizability tensor and the quadrupole-quadrupole polarizability tensor. The analytic gradient is derived and implemented for the isotropic disp8 contribution. Relative to the previously implemented empirical EFP disp8 energy, the isotropic disp8 component of the interaction energy improves the overall agreement of the EFP dispersion energies with the symmetry-adapted perturbation theory (SAPT) benchmarks, reducing the mean absolute errors (MAEs) and mean absolute percentage errors for most of the databases examined in this work. While the anisotropic disp8 can further enhance the accuracy of the EFP dispersion energy and yield smaller MAEs, significantly overbound dispersion energies are predicted by the anisotropic disp8 when the maximum element in the intermolecular overlap matrix is greater than 0.1, possibly due to the breakdown of the approximations made in the EFP dispersion derivation at short range. For potential energy scan databases, the newly developed EFP dispersion model with isotropic disp8 yields the overall correct curvature and good agreement with SAPT benchmarks around equilibrium and at longer distances, but overestimates the dispersion interactions at short range. While the overlap-based dispersion-damping functions produce better MAEs than Tang-Toennies damping functions, further improvement is needed to better screen the large attractive dispersion energies at short range (overlap >0.1).
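For readers unfamiliar with the Tang-Toennies damping mentioned above, its generic textbook form is f_n(x) = 1 - exp(-x) Σ_{k=0}^{n} x^k / k!, with x = bR. The sketch below applies it to a single isotropic C8/R⁸ term. It is only an illustration under stated assumptions: the function names, the range parameter `b`, and the coefficient `c8` are hypothetical, and the EFP implementation described in the abstract (LMO-distributed polarizabilities, anisotropic terms, overlap-based damping) is not reproduced here.

```python
import math


def tang_toennies(n, b, r):
    """Generic Tang-Toennies damping factor f_n(b*r), between 0 and 1."""
    x = b * r
    partial_sum = sum(x**k / math.factorial(k) for k in range(n + 1))
    return 1.0 - math.exp(-x) * partial_sum


def damped_disp8(c8, b, r):
    """Isotropic R^-8 dispersion energy with n = 8 Tang-Toennies damping.

    c8 and b are illustrative inputs, not fitted EFP parameters.
    """
    return -tang_toennies(8, b, r) * c8 / r**8


# Illustrative values in atomic units (hypothetical, for demonstration only).
for distance in (2.0, 4.0, 8.0):
    undamped = -100.0 / distance**8
    damped = damped_disp8(100.0, 1.5, distance)
    print(f"R = {distance:4.1f}  undamped = {undamped:.3e}  damped = {damped:.3e}")
```

The damping factor goes to zero at short range and to one at long range, which is the screening behaviour the abstract refers to when it discusses overestimated attraction at small separations.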
Affiliation(s)
- Peng Xu
  - Department of Chemistry, Iowa State University and Ames National Laboratory, Ames, Iowa 50014, United States
- Samuel L Leonard
  - Department of Chemistry, Iowa State University and Ames National Laboratory, Ames, Iowa 50014, United States
- William O'Brien
  - Science Undergraduate Research Internship (SULI): Department of Energy, Ames National Laboratory, Iowa State University, Ames, Iowa 50011-3020, United States
- Mark S Gordon
  - Department of Chemistry, Iowa State University and Ames National Laboratory, Ames, Iowa 50014, United States
4
Fan ZX, Chao SD. A Machine Learning Force Field for Bio-Macromolecular Modeling Based on Quantum Chemistry-Calculated Interaction Energy Datasets. Bioengineering (Basel) 2024; 11:51. [PMID: 38247928 PMCID: PMC11154266 DOI: 10.3390/bioengineering11010051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 12/23/2023] [Accepted: 12/25/2023] [Indexed: 01/23/2024] Open
Abstract
Accurate energy data from noncovalent interactions are essential for constructing force fields for molecular dynamics simulations of bio-macromolecular systems. There are two important practical issues in the construction of a reliable force field with the hope of balancing the desired chemical accuracy and working efficiency. One is to determine a suitable quantum chemistry level of theory for calculating interaction energies. The other is to use a suitable continuous energy function to model the quantum chemical energy data. For the first issue, we have recently calculated the intermolecular interaction energies using the SAPT0 level of theory, and we have systematically organized these energies into the ab initio SOFG-31 (homodimer) and SOFG-31-heterodimer datasets. In this work, we re-calculate these interaction energies by using the more advanced SAPT2 level of theory with a wider series of basis sets. Our purpose is to determine the SAPT level of theory proper for interaction energies with respect to the CCSD(T)/CBS benchmark chemical accuracy. Next, to utilize these energy datasets, we employ one of the well-developed machine learning techniques, called the CLIFF scheme, to construct a general-purpose force field for biomolecular dynamics simulations. Here we use the SOFG-31 dataset and the SOFG-31-heterodimer dataset as the training and test sets, respectively. Our results demonstrate that using the CLIFF scheme can reproduce a diverse range of dimeric interaction energy patterns with only a small training set. The overall errors for each SAPT energy component, as well as the SAPT total energy, are all well below the desired chemical accuracy of ~1 kcal/mol.
Affiliation(s)
- Zhen-Xuan Fan
  - Institute of Applied Mechanics, National Taiwan University, Taipei 106, Taiwan
- Sheng D. Chao
  - Institute of Applied Mechanics, National Taiwan University, Taipei 106, Taiwan
  - Center for Quantum Science and Engineering, National Taiwan University, Taipei 106, Taiwan
5
Demir Gİ, Tekin A. NICE-FF: A non-empirical, intermolecular, consistent, and extensible force field for nucleic acids and beyond. J Chem Phys 2023; 159:244117. [PMID: 38153156 DOI: 10.1063/5.0176641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Accepted: 12/04/2023] [Indexed: 12/29/2023] Open
Abstract
A new non-empirical ab initio intermolecular force field (NICE-FF, in buffered 14-7 potential form) has been developed for nucleic acids and beyond, based on dimer interaction energies (IEs) calculated at the spin-component-scaled-MI second-order Møller-Plesset perturbation theory level. A fully automatic framework has been implemented for this purpose, capable of generating well-polished computational grids, performing the necessary ab initio calculations, conducting machine learning (ML) assisted force field (FF) parametrization, and extending existing FF parameters by incorporating new atom types. For the ML-assisted parametrization of NICE-FF, interaction energies of ∼18 000 dimer geometries (with IE < 0) were used, and the best fit gave a mean square deviation of about 0.46 kcal/mol. During this parametrization, the atom types present in the four deoxyribonucleic acid (DNA) bases were trained first using the generated DNA base datasets. Both uracil and hypoxanthine, which contain the same atom types found in DNA bases, have been considered as test molecules. Three new atom types have been added to the DNA atom types by using IE datasets of both pyrazinamide and 9-methylhypoxanthine. Finally, the last test molecule, theophylline, has been selected, which contains already-fitted atom-type parameters. The performance of NICE-FF has been investigated on the S22 dataset, and it has been found that NICE-FF outperforms the well-known FFs by generating the IEs most consistent with the high-level ab initio ones. Moreover, NICE-FF has been integrated into our in-house developed crystal structure prediction (CSP) tool [called FFCASP (Fast and Flexible CrystAl Structure Predictor)], aiming to find the experimental crystal structures of all considered molecules. CSPs, which were performed up to 4 formula units (Z), resulted in NICE-FF being able to locate almost all the known experimental crystal structures with sufficiently low RMSD20 values to provide good starting points for density functional theory optimizations.
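The "buffered 14-7" functional form named in the abstract is Halgren's van der Waals potential. A minimal sketch of the pair energy follows, assuming Halgren's commonly used buffering constants (δ = 0.07, γ = 0.12); whether NICE-FF uses exactly these constants is not stated in the abstract, and the function name and numerical inputs are hypothetical.

```python
def buffered_14_7(r, r_star, eps, delta=0.07, gamma=0.12):
    """Halgren buffered 14-7 pair energy.

    r      : interatomic distance
    r_star : distance at the energy minimum
    eps    : well depth (energy at r = r_star is -eps)
    delta, gamma : buffering constants (Halgren's values, assumed here)
    """
    rho = r / r_star
    repulsive = ((1.0 + delta) / (rho + delta)) ** 7
    attractive = (1.0 + gamma) / (rho**7 + gamma) - 2.0
    return eps * repulsive * attractive


# Illustrative parameters (hypothetical atom pair, Angstrom and kcal/mol).
r_star, eps = 3.5, 0.1
for r in (2.5, 3.5, 5.0):
    print(f"r = {r:.1f}  E = {buffered_14_7(r, r_star, eps):+.4f}")
```

The buffering constants keep the energy finite as r approaches zero, which distinguishes this form from a Lennard-Jones 12-6 potential.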
Affiliation(s)
- Gözde İniş Demir
  - Informatics Institute, Istanbul Technical University, 34469 Maslak, Istanbul, Türkiye
- Adem Tekin
  - Informatics Institute, Istanbul Technical University, 34469 Maslak, Istanbul, Türkiye
  - Research Institute for Fundamental Sciences (TÜBİTAK-TBAE), Kocaeli, Türkiye
6
Beran GJO, Greenwell C, Cook C, Řezáč J. Improved Description of Intra- and Intermolecular Interactions through Dispersion-Corrected Second-Order Møller-Plesset Perturbation Theory. Acc Chem Res 2023; 56:3525-3534. [PMID: 37963266 DOI: 10.1021/acs.accounts.3c00578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2023]
Abstract
Conspectus
The quantum chemical modeling of organic crystals and other molecular condensed-phase problems requires computationally affordable electronic structure methods which can simultaneously describe intramolecular conformational energies and intermolecular interactions accurately. To achieve this, we have developed a spin-component-scaled, dispersion-corrected second-order Møller-Plesset perturbation theory (SCS-MP2D) model. SCS-MP2D augments canonical MP2 with a dispersion correction which removes the uncoupled Hartree-Fock dispersion energy present in canonical MP2 and replaces it with a more reliable coupled Kohn-Sham treatment, all evaluated within the framework of Grimme's D3 dispersion model. The spin-component scaling is then used to improve the description of the residual (nondispersion) portion of the correlation energy.
The SCS-MP2D model improves upon earlier corrected MP2 models in a few ways. Compared to the highly successful dispersion-corrected MP2C model, which is based solely on intermolecular perturbation theory, the SCS-MP2D dispersion correction improves the description of both inter- and intramolecular interactions. The dispersion correction can also be evaluated with trivial computational cost, and nuclear analytic gradients are computed readily to enable geometry optimizations. In contrast to earlier spin-component scaling MP2 models, the optimal spin-component scaling coefficients are only mildly sensitive to the choice of training data, and a single global parametrization of the model can describe both thermochemistry and noncovalent interactions.
The resulting dispersion-corrected, spin-component-scaled MP2 (SCS-MP2D) model predicts conformational energies and intermolecular interactions with accuracy comparable to or better than those of many range-separated and double-hybrid density functionals, as is demonstrated on a variety of benchmark tests. Among the functionals considered here, only the revDSD-PBEP86-D3(BJ) functional gives consistently smaller errors in benchmark tests. The results presented also hint that further improvements of SCS-MP2D may be possible through a more robust fitting procedure for the seven empirical parameters.
To demonstrate the performance of SCS-MP2D further, several applications to molecular crystal problems are presented. The three chosen examples all represent cases where density-driven delocalization error causes GGA or hybrid density functionals to artificially stabilize crystals exhibiting more extended π-conjugation. Our pragmatic strategy addresses the delocalization error by combining a periodic density functional theory (DFT) treatment of the infinite lattice with intramolecular/conformational energy corrections computed with SCS-MP2D. For the anticancer drug axitinib, applying the SCS-MP2D conformational energy correction produces crystal polymorph stabilities that are consistent with experiment, in contrast to earlier studies. For the crystal structure prediction of the ROY molecule, so named for its colorful red, orange, and yellow crystals, this approach leads to the first plausible crystal energy landscape, and it reveals that the lowest-energy polymorphs have already been found experimentally. Finally, in the context of photomechanical crystals, which transform light into mechanical work, these techniques are used to predict the structural transformations and extract design principles for maximizing the work performed.
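The two ingredients described in the conspectus, spin-component scaling and swapping one dispersion estimate for another, can be sketched as simple energy bookkeeping. The code below is only an illustration under explicit assumptions: the default scaling coefficients are Grimme's original SCS-MP2 values (6/5 and 1/3), not the fitted SCS-MP2D parameters; the way the dispersion terms are combined follows the verbal description in the abstract rather than the published working equations; and all function names and numerical inputs are hypothetical.

```python
def scs_mp2_correlation(e_os, e_ss, c_os=6.0 / 5.0, c_ss=1.0 / 3.0):
    """Spin-component-scaled MP2 correlation energy.

    e_os, e_ss : opposite-spin and same-spin MP2 correlation energies
    c_os, c_ss : scaling coefficients (Grimme's SCS-MP2 defaults, assumed)
    """
    return c_os * e_os + c_ss * e_ss


def dispersion_corrected_total(e_hf, e_os, e_ss, e_disp_uchf, e_disp_cks,
                               c_os=6.0 / 5.0, c_ss=1.0 / 3.0):
    """Total energy with the uncoupled HF dispersion replaced by a coupled
    Kohn-Sham estimate (schematic reading of the abstract, not the
    published SCS-MP2D equations)."""
    e_corr = scs_mp2_correlation(e_os, e_ss, c_os, c_ss)
    return e_hf + e_corr - e_disp_uchf + e_disp_cks


# Hypothetical component energies in hartree, for demonstration only.
plain_mp2 = scs_mp2_correlation(-0.30, -0.10, c_os=1.0, c_ss=1.0)
scaled = scs_mp2_correlation(-0.30, -0.10)
total = dispersion_corrected_total(-100.0, -0.30, -0.10, -0.012, -0.010)
print(plain_mp2, scaled, total)
```

Setting both coefficients to 1 recovers the canonical MP2 correlation energy, which is a convenient sanity check on the bookkeeping.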
Affiliation(s)
- Gregory J O Beran
  - Department of Chemistry, University of California, Riverside, California 92521, United States
- Chandler Greenwell
  - Department of Chemistry, University of California, Riverside, California 92521, United States
- Cameron Cook
  - Department of Chemistry, University of California, Riverside, California 92521, United States
- Jan Řezáč
  - Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, 160 00 Prague, Czech Republic
7
Chen JA, Chao SD. Intermolecular Non-Bonded Interactions from Machine Learning Datasets. Molecules 2023; 28:7900. [PMID: 38067629 PMCID: PMC10707888 DOI: 10.3390/molecules28237900] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/31/2023] [Revised: 11/22/2023] [Accepted: 11/29/2023] [Indexed: 04/04/2024] Open
Abstract
Accurate determination of intermolecular non-covalent-bonded or non-bonded interactions is the key to potentially useful molecular dynamics simulations of polymer systems. However, it is challenging to balance both the accuracy and computational cost in force field modelling. One of the main difficulties is properly representing the calculated energy data as a continuous force function. In this paper, we employ well-developed machine learning techniques to construct a general purpose intermolecular non-bonded interaction force field for organic polymers. The original ab initio dataset SOFG-31 was calculated by us and has been well documented, and here we use it as our training set. The CLIFF kernel type machine learning scheme is used for predicting the interaction energies of heterodimers selected from the SOFG-31 dataset. Our test results show that the overall errors are well below the chemical accuracy of about 1 kcal/mol, thus demonstrating the promising feasibility of machine learning techniques in force field modelling.
Affiliation(s)
- Jia-An Chen
  - Institute of Applied Mechanics, National Taiwan University, Taipei 106, Taiwan
- Sheng D. Chao
  - Institute of Applied Mechanics, National Taiwan University, Taipei 106, Taiwan
  - Center for Quantum Science and Engineering, National Taiwan University, Taipei 106, Taiwan
8
Thürlemann M, Riniker S. Hybrid classical/machine-learning force fields for the accurate description of molecular condensed-phase systems. Chem Sci 2023; 14:12661-12675. [PMID: 38020395 PMCID: PMC10646964 DOI: 10.1039/d3sc04317g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/17/2023] [Accepted: 10/24/2023] [Indexed: 12/01/2023] Open
Abstract
Electronic structure methods offer, in principle, accurate predictions of molecular properties; however, their applicability is limited by computational costs. Empirical methods are cheaper, but come with inherent approximations and are dependent on the quality and quantity of training data. The rise of machine learning (ML) force fields (FFs) exacerbates limitations related to training data even further, especially for condensed-phase systems for which the generation of large and high-quality training datasets is difficult. Here, we propose a hybrid ML/classical FF model that is parametrized exclusively on high-quality ab initio data of dimers and monomers in vacuum but is transferable to condensed-phase systems. The proposed hybrid model combines our previous ML-parametrized classical model with ML corrections for situations where classical approximations break down, thus combining the robustness and efficiency of classical FFs with the flexibility of ML. Extensive validation on benchmarking datasets and experimental condensed-phase data, including organic liquids and small-molecule crystal structures, showcases how the proposed approach may promote FF development and unlock the full potential of classical FFs.
Affiliation(s)
- Moritz Thürlemann
  - Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, Zürich 8093, Switzerland
- Sereina Riniker
  - Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, Zürich 8093, Switzerland
9
Carter-Fenk K, Liu M, Pujal L, Loipersberger M, Tsanai M, Vernon RM, Forman-Kay JD, Head-Gordon M, Heidar-Zadeh F, Head-Gordon T. The Energetic Origins of Pi-Pi Contacts in Proteins. J Am Chem Soc 2023; 145. [PMID: 37917924 PMCID: PMC10655088 DOI: 10.1021/jacs.3c09198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/23/2023] [Revised: 10/04/2023] [Accepted: 10/05/2023] [Indexed: 11/04/2023]
Abstract
Accurate potential energy models of proteins must describe the many different types of noncovalent interactions that contribute to a protein's stability and structure. Pi-pi contacts are ubiquitous structural motifs in all proteins, occurring between aromatic and nonaromatic residues, and play a nontrivial role in protein folding and in the formation of biomolecular condensates. Guided by a geometric criterion for isolating pi-pi contacts from classical molecular dynamics simulations of proteins, we use quantum mechanical energy decomposition analysis to determine the molecular interactions that stabilize different pi-pi contact motifs. We find that neutral pi-pi interactions in proteins are dominated by Pauli repulsion and London dispersion rather than repulsive quadrupole electrostatics, which is central to the textbook Hunter-Sanders model. This results in a notable lack of variability in the interaction profiles of neutral pi-pi contacts even with extreme changes in the dielectric medium, explaining the prevalence of pi-stacked arrangements in and between proteins. We also find interactions involving pi-containing anions and cations to be extremely malleable, interacting like neutral pi-pi contacts in polar media and like typical ion-pi interactions in nonpolar environments. Like-charged pairs such as arginine-arginine contacts are particularly sensitive to the polarity of their immediate surroundings and exhibit canonical pi-pi stacking behavior only if the interaction is mediated by environmental effects, such as aqueous solvation.
Affiliation(s)
- Kevin Carter-Fenk
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, United States
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
- Meili Liu
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, United States
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
  - Department of Chemistry, Beijing Normal University, Beijing 100875, China
- Leila Pujal
  - Department of Chemistry, Queen’s University, Kingston, Ontario K7L 3N6, Canada
- Matthias Loipersberger
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, United States
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
- Maria Tsanai
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, United States
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
- Robert M. Vernon
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Julie D. Forman-Kay
  - Molecular Medicine Program, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada
  - Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada
- Martin Head-Gordon
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, United States
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
- Farnaz Heidar-Zadeh
  - Department of Chemistry, Queen’s University, Kingston, Ontario K7L 3N6, Canada
  - Center for Molecular Modeling (CMM), Ghent University, 9052 Zwijnaarde, Belgium
- Teresa Head-Gordon
  - Kenneth S. Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, United States
  - Department of Chemistry, University of California, Berkeley, California 94720, United States
  - Department of Chemical and Biomolecular Engineering, University of California, Berkeley, California 94720, United States
  - Department of Bioengineering, University of California, Berkeley, California 94720, United States
10
Huguenin-Dumittan K, Loche P, Haoran N, Ceriotti M. Physics-Inspired Equivariant Descriptors of Nonbonded Interactions. J Phys Chem Lett 2023; 14:9612-9618. [PMID: 37862712 PMCID: PMC10626632 DOI: 10.1021/acs.jpclett.3c02375] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/24/2023] [Accepted: 10/13/2023] [Indexed: 10/22/2023]
Abstract
One essential ingredient in many machine learning (ML) based methods for atomistic modeling of materials and molecules is the use of locality. While allowing better system-size scaling, this systematically neglects long-range (LR) effects such as electrostatic or dispersion interactions. We present an extension of the long distance equivariant (LODE) framework that can handle diverse LR interactions in a consistent way and seamlessly integrates with preexisting methods by building new sets of atom centered features. We provide a direct physical interpretation of these using the multipole expansion, which allows for simpler and more efficient implementations. The framework is applied to simple toy systems as proof of concept and a heterogeneous set of molecular dimers to push the method to its limits. By generalizing LODE to arbitrary asymptotic behaviors, we provide a coherent approach to treat arbitrary two- and many-body nonbonded interactions in the data-driven modeling of matter.
Affiliation(s)
- Kevin K. Huguenin-Dumittan
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Philip Loche
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Ni Haoran
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Michele Ceriotti
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
11
Ochieng SA, Patkowski K. Accurate three-body noncovalent interactions: the insights from energy decomposition. Phys Chem Chem Phys 2023; 25:28621-28637. [PMID: 37874287 DOI: 10.1039/d3cp03938b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
An impressive collection of accurate two-body interaction energies for small complexes has been assembled into benchmark databases and used to improve the performance of multiple density functional, semiempirical, and machine learning methods. Similar benchmark data on nonadditive three-body energies in molecular trimers are comparatively scarce, and the existing ones are practically limited to homotrimers. In this work, we present a benchmark dataset of 20 equilibrium noncovalent interaction energies for a small but diverse selection of 10 heteromolecular trimers. The new 3BHET dataset presents complexes that combine different interactions including π-π, anion-π, cation-π, and various motifs of hydrogen and halogen bonding in each trimer. A detailed symmetry-adapted perturbation theory (SAPT)-based energy decomposition of the two- and three-body interaction energies shows that 3BHET consists of electrostatics- and dispersion-dominated complexes. The nonadditive three-body contribution is dominated by induction, but its influence on the overall bonding type in the complex (as exemplified by its position on the ternary diagram) is quite small. We also tested the extended SAPT (XSAPT) approach which is capable of including some nonadditive interactions in clusters of any size. The resulting three-body dispersion term (obtained from the many-body dispersion formalism) is mostly in good agreement with the supermolecular CCSD(T)-MP2 values and the nonadditive induction term is similar to the three-body SAPT(DFT) data, but the overall three-body XSAPT energies are not very accurate as they are missing the first-order exchange terms.
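The "nonadditive three-body energy" discussed above is, in the standard supermolecular many-body expansion, what remains of the trimer interaction energy after all pairwise interaction energies are subtracted. The sketch below shows that bookkeeping. The function names and the numerical energies are hypothetical placeholders, not values from the 3BHET dataset, and practical details such as counterpoise correction in the trimer basis are omitted.

```python
def pair_interaction(e_dimer, e_mono_1, e_mono_2):
    """Two-body interaction energy of one dimer."""
    return e_dimer - e_mono_1 - e_mono_2


def three_body_nonadditive(e_abc, e_ab, e_ac, e_bc, e_a, e_b, e_c):
    """Nonadditive three-body energy from total energies of the trimer,
    its three dimers, and its three monomers (many-body expansion)."""
    return e_abc - (e_ab + e_ac + e_bc) + (e_a + e_b + e_c)


# Hypothetical total energies in arbitrary units, for demonstration only.
e_a, e_b, e_c = -1.0, -2.0, -3.0
e_ab, e_ac, e_bc = -3.10, -4.20, -5.05
e_abc = -6.40

two_body_sum = (pair_interaction(e_ab, e_a, e_b)
                + pair_interaction(e_ac, e_a, e_c)
                + pair_interaction(e_bc, e_b, e_c))
total_interaction = e_abc - (e_a + e_b + e_c)
e3 = three_body_nonadditive(e_abc, e_ab, e_ac, e_bc, e_a, e_b, e_c)

print(f"sum of two-body terms : {two_body_sum:+.3f}")
print(f"three-body nonadditive: {e3:+.3f}")
print(f"total interaction     : {total_interaction:+.3f}")
```

By construction the two-body sum plus the three-body term equals the total trimer interaction energy, which is the identity the test below checks.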
Affiliation(s)
- Sharon A Ochieng
  - Department of Chemistry and Biochemistry, Auburn University, Auburn, Alabama 36849, USA.
- Konrad Patkowski
  - Department of Chemistry and Biochemistry, Auburn University, Auburn, Alabama 36849, USA.
12
Nickerson CJ, Bryenton KR, Price AJA, Johnson ER. Comparison of Density-Functional Theory Dispersion Corrections for the DES15K Database. J Phys Chem A 2023; 127:8712-8722. [PMID: 37793049 DOI: 10.1021/acs.jpca.3c04332] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 10/06/2023]
Abstract
While density-functional theory (DFT) remains one of the most widely used tools in computational chemistry, most functionals fail to properly account for the effects of London dispersion. Hence, there are many popular post-self-consistent methods to add a dispersion correction to the DFT energy. Until now, the most popular methods have never been compared on equal footing due to not being implemented in the same electronic structure packages. In this work, we performed a large-scale benchmarking study, directly comparing the accuracy of the exchange-hole dipole moment (XDM), D3BJ, D4, TS, MBD, and MBD-NL dispersion models when applied to the recent DES15K database of nearly 15,000 molecular complexes at both expanded and compressed geometries. Our study showed similarly good performance for all dispersion methods (except TS) when applied to neutral complexes. However, they all performed worse for ionic complexes, particularly those involving dications of alkaline earth metals, due to systematic overbinding by the base PBE0 density functional. Investigation of the largest outliers also revealed that only the MBD and MBD-NL methods demonstrate surprising errors for complexes involving alkali metal cations at compressed geometries where they tended to significantly overbind. As we would expect minimal dispersion binding for such complexes, we further investigated the origins of these errors for the potential energy curve of a model cation-π complex. Overall, there is little choice between the XDM, D3BJ, D4, MBD, and MBD-NL dispersion methods for most systems. However, the MBD-based methods are not recommended for complexes involving organic species and alkali or alkaline earth metal cations, for example when modeling Li+ intercalation into graphite.
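As an illustration of what a post-self-consistent pairwise dispersion correction looks like, the sketch below evaluates a single atom-pair term with Becke-Johnson (rational) damping, the form used in D3BJ: E = -Σ_{n=6,8} s_n C_n / (R^n + (a1 R0 + a2)^n), with R0 = sqrt(C8/C6). The default values of `s8`, `a1`, and `a2`, as well as the C6 and C8 coefficients, are hypothetical placeholders and not the fitted parameters for PBE0 or any other functional; the geometry-dependent coefficient machinery of D3 and the other models compared in the abstract is not reproduced.

```python
import math


def d3bj_pair_energy(r, c6, c8, s6=1.0, s8=1.0, a1=0.4, a2=5.0):
    """Pairwise dispersion energy with Becke-Johnson damping.

    r      : interatomic distance (bohr)
    c6, c8 : dispersion coefficients for this atom pair (atomic units)
    s6, s8, a1, a2 : scaling and damping parameters (placeholder values)
    """
    r0 = math.sqrt(c8 / c6)          # pair-specific cutoff radius
    damp = a1 * r0 + a2              # Becke-Johnson damping length
    e6 = s6 * c6 / (r**6 + damp**6)
    e8 = s8 * c8 / (r**8 + damp**8)
    return -(e6 + e8)


# Hypothetical coefficients for one atom pair, for demonstration only.
c6, c8 = 10.0, 200.0
for r in (0.0, 3.0, 6.0, 12.0):
    print(f"R = {r:5.1f} bohr  E_disp = {d3bj_pair_energy(r, c6, c8):.3e}")
```

With this damping the correction tends to a finite constant at short range instead of diverging, and it recovers the undamped -C6/R⁶ behaviour at long range.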
Affiliation(s)
- Cameron J Nickerson
  - Department of Physics and Atmospheric Science, Dalhousie University, 6310 Coburg Rd, Halifax, Nova Scotia B3H 4R2, Canada
- Kyle R Bryenton
  - Department of Physics and Atmospheric Science, Dalhousie University, 6310 Coburg Rd, Halifax, Nova Scotia B3H 4R2, Canada
- Alastair J A Price
  - Department of Chemistry, Dalhousie University, 6274 Coburg Rd, Halifax, Nova Scotia B3H 4R2, Canada
- Erin R Johnson
  - Department of Chemistry, Dalhousie University, 6274 Coburg Rd, Halifax, Nova Scotia B3H 4R2, Canada
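The abstract of Nickerson et al. above compares post-self-consistent dispersion corrections at expanded and compressed geometries. As a rough illustration of how one such correction behaves at short range, the following is a minimal sketch of the standard two-body D3 energy with Becke-Johnson damping for a single atom pair. It is not the authors' code. The function name `d3bj_pair_energy` and the C6/C8 coefficients in the demo are hypothetical, and the default damping parameters are the values commonly quoted for PBE0-D3(BJ), which should be checked against the official parameter tables before any real use.

```python
import math


def d3bj_pair_energy(r, c6, c8, s6=1.0, s8=1.2177, a1=0.4145, a2=4.8593):
    """Two-body D3(BJ) dispersion energy for one atom pair, in atomic units.

    E = -[ s6*C6 / (r^6 + f^6) + s8*C8 / (r^8 + f^8) ],  f = a1*R0 + a2,
    with R0 = sqrt(C8 / C6). The defaults are assumed PBE0 parameters.
    """
    r0 = math.sqrt(c8 / c6)          # pair-specific cutoff radius
    damp = a1 * r0 + a2              # Becke-Johnson damping length
    e6 = s6 * c6 / (r**6 + damp**6)  # damped dipole-dipole term
    e8 = s8 * c8 / (r**8 + damp**8)  # damped dipole-quadrupole term
    return -(e6 + e8)


if __name__ == "__main__":
    # Hypothetical coefficients, chosen only to show the shape of the curve.
    c6, c8 = 40.0, 1200.0
    for r in (2.0, 4.0, 6.0, 10.0):
        print(f"r = {r:5.1f} bohr  E_disp = {d3bj_pair_energy(r, c6, c8):.6e} Eh")
```

Because of the damping length in the denominators, the energy stays finite as the separation goes to zero and approaches the undamped `-C6/r^6` limit at long range, which is the regime distinction the benchmark probes with its compressed and expanded geometries.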
13
Spronk SA, Glick ZL, Metcalf DP, Sherrill CD, Cheney DL. A quantum chemical interaction energy dataset for accurately modeling protein-ligand interactions. Sci Data 2023; 10:619. [PMID: 37699937 PMCID: PMC10497680 DOI: 10.1038/s41597-023-02443-1] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/02/2023] [Accepted: 08/03/2023] [Indexed: 09/14/2023] Open
Abstract
Fast and accurate calculation of intermolecular interaction energies is desirable for understanding many chemical and biological processes, including the binding of small molecules to proteins. The Splinter ["Symmetry-adapted perturbation theory (SAPT0) protein-ligand interaction"] dataset has been created to facilitate the development and improvement of methods for performing such calculations. Molecular fragments representing commonly found substructures in proteins and small-molecule ligands were paired into >9000 unique dimers, assembled into numerous configurations using an approach designed to adequately cover the breadth of the dimers' potential energy surfaces while enhancing sampling in favorable regions. ~1.5 million configurations of these dimers were randomly generated, and a structurally diverse subset of these were minimized to obtain an additional ~80 thousand local and global minima. For all >1.6 million configurations, SAPT0 calculations were performed with two basis sets to complete the dataset. It is expected that Splinter will be a useful benchmark dataset for training and testing various methods for the calculation of intermolecular interaction energies.
Affiliation(s)
- Steven A Spronk
  - Molecular Structure and Design, Bristol Myers Squibb Company, P. O. Box 5400, Princeton, NJ, 08543, USA
- Zachary L Glick
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, 30332-0400, USA
- Derek P Metcalf
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, 30332-0400, USA
- C David Sherrill
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, 30332-0400, USA
- Daniel L Cheney
  - Molecular Structure and Design, Bristol Myers Squibb Company, P. O. Box 5400, Princeton, NJ, 08543, USA
14
Lu T, Chen Q. Simple, Efficient, and Universal Energy Decomposition Analysis Method Based on Dispersion-Corrected Density Functional Theory. J Phys Chem A 2023; 127:7023-7035. [PMID: 37582201 DOI: 10.1021/acs.jpca.3c04374] [Citation(s) in RCA: 22] [Impact Index Per Article: 22.0] [Indexed: 08/17/2023]
Abstract
Energy decomposition analysis (EDA) is an important class of methods to explore the nature of interaction between fragments in a chemical system. It can decompose the interaction energy into different physical components to understand the factors that play key roles in the interaction. This work proposes an EDA strategy based on dispersion-corrected density functional theory (DFT), called sobEDA. This method is fairly easy to implement and very universal. It can be used to study weak interactions, chemical bond interactions, open-shell systems, and interactions between multiple fragments. The total time consumption of sobEDA is only about twice that of conventional DFT single-point calculation for the entire system. This work also proposes a variant of the sobEDA method named sobEDAw, which is designed specifically for decomposing weak interaction energies. Through a proper combination of DFT correlation energy and dispersion correction term, sobEDAw gives a ratio between dispersion energy and electrostatic energy that is highly consistent with the symmetry-adapted perturbation theory, which is quite popular and robust in studying weak interactions but expensive. We present a shell script sobEDA.sh to implement the methods proposed in this work based on the very popular Gaussian quantum chemistry program and Multiwfn wavefunction analysis code. Via the script, theoretical chemists can use the sobEDA and sobEDAw methods very conveniently in their study. Through a series of examples, the rationality of the new methods and their implementation are verified, and their great practical values in the study of various chemical systems are demonstrated.
Affiliation(s)
- Tian Lu
  - Beijing Kein Research Center for Natural Sciences, Beijing 100024, P.R. China
- Qinxue Chen
  - Beijing Kein Research Center for Natural Sciences, Beijing 100024, P.R. China
15
Ng WP, Liang Q, Yang J. Low-Data Deep Quantum Chemical Learning for Accurate MP2 and Coupled-Cluster Correlations. J Chem Theory Comput 2023; 19:5439-5449. [PMID: 37506400 DOI: 10.1021/acs.jctc.3c00518] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/30/2023]
Abstract
Accurate ab initio prediction of electronic energies is very expensive for macromolecules by explicitly solving post-Hartree-Fock equations. We here exploit the physically justified local correlation feature in a compact basis of small molecules and construct an expressive low-data deep neural network (dNN) model to obtain machine-learned electron correlation energies on par with MP2 and CCSD levels of theory for more complex molecules and different datasets that are not represented in the training set. We show that our dNN-powered model is data efficient and makes highly transferable predictions across alkanes of various lengths, organic molecules with non-covalent and biomolecular interactions, as well as water clusters of different sizes and morphologies. In particular, by training 800 (H2O)8 clusters with the local correlation descriptors, accurate MP2/cc-pVTZ correlation energies up to (H2O)128 can be predicted with a small random error within chemical accuracy from exact values, while a majority of prediction deviations are attributed to an intrinsically systematic error. Our results reveal that an extremely compact local correlation feature set, which is poor for any direct post-Hartree-Fock calculations, has however a prominent advantage in reserving important electron correlation patterns for making accurate transferable predictions across distinct molecular compositions, bond types, and geometries.
Affiliation(s)
- Wai-Pan Ng
  - Department of Chemistry, The University of Hong Kong, Hong Kong 999077, P. R. China
  - Hong Kong Quantum AI Lab Limited, Hong Kong 999077, P. R. China
- Qiujiang Liang
  - Department of Chemistry, The University of Hong Kong, Hong Kong 999077, P. R. China
- Jun Yang
  - Department of Chemistry, The University of Hong Kong, Hong Kong 999077, P. R. China
  - Hong Kong Quantum AI Lab Limited, Hong Kong 999077, P. R. China
16
Villot C, Lao KU. Electronic structure theory on modeling short-range noncovalent interactions between amino acids. J Chem Phys 2023; 158:094301. [PMID: 36889981 DOI: 10.1063/5.0138032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/11/2023] Open
Abstract
While short-range noncovalent interactions (NCIs) are proving to be of importance in many chemical and biological systems, these atypical bindings happen within the so-called van der Waals envelope and pose an enormous challenge for current computational methods. We introduce SNCIAA, a database of 723 benchmark interaction energies of short-range noncovalent interactions between neutral/charged amino acids originated from protein x-ray crystal structures at the "gold standard" coupled-cluster with singles, doubles, and perturbative triples/complete basis set [CCSD(T)/CBS] level of theory with a mean absolute binding uncertainty less than 0.1 kcal/mol. Subsequently, a systematic assessment of commonly used computational methods, such as the second-order Møller-Plesset theory (MP2), density functional theory (DFT), symmetry-adapted perturbation theory (SAPT), composite electronic-structure methods, semiempirical approaches, and the physical-based potentials with machine learning (IPML) on SNCIAA is carried out. It is shown that the inclusion of dispersion corrections is essential even though these dimers are dominated by electrostatics, such as hydrogen bondings and salt bridges. Overall, MP2, ωB97M-V, and B3LYP+D4 turned out to be the most reliable methods for the description of short-range NCIs even in strongly attractive/repulsive complexes. SAPT is also recommended in describing short-range NCIs only if the δMP2 correction has been included. The good performance of IPML for dimers at close-equilibrium and long-range conditions is not transferable to the short-range. We expect that SNCIAA will assist the development/improvement/validation of computational methods, such as DFT, force-fields, and ML models, in describing NCIs across entire potential energy surfaces (short-, intermediate-, and long-range NCIs) on the same footing.
Affiliation(s)
- Corentin Villot
  - Department of Chemistry, Virginia Commonwealth University, Richmond, Virginia 23284, USA
- Ka Un Lao
  - Department of Chemistry, Virginia Commonwealth University, Richmond, Virginia 23284, USA
17
Summers TJ, Hemmati R, Miller JE, Agbaglo DA, Cheng Q, DeYonker NJ. Evaluating the active site-substrate interplay between x-ray crystal structure and molecular dynamics in chorismate mutase. J Chem Phys 2023; 158:065101. [PMID: 36792523 DOI: 10.1063/5.0127106] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/25/2023] Open
Abstract
Designing realistic quantum mechanical (QM) models of enzymes is dependent on reliably discerning and modeling residues, solvents, and cofactors important in crafting the active site microenvironment. Interatomic van der Waals contacts have previously demonstrated usefulness toward designing QM-models, but their measured values (and subsequent residue importance rankings) are expected to be influenceable by subtle changes in protein structure. Using chorismate mutase as a case study, this work examines the differences in ligand-residue interatomic contacts between an x-ray crystal structure and structures from a molecular dynamics simulation. Select structures are further analyzed using symmetry adapted perturbation theory to compute ab initio ligand-residue interaction energies. The findings of this study show that ligand-residue interatomic contacts measured for an x-ray crystal structure are not predictive of active site contacts from a sampling of molecular dynamics frames. In addition, the variability in interatomic contacts among structures is not correlated with variability in interaction energies. However, the results spotlight using interaction energies to characterize and rank residue importance in future computational enzymology workflows.
Affiliation(s)
- Thomas J Summers
  - Department of Chemistry, The University of Memphis, 213 Smith Chemistry Building, Memphis, Tennessee 38152-3550, USA
- Reza Hemmati
  - Department of Chemistry, The University of Memphis, 213 Smith Chemistry Building, Memphis, Tennessee 38152-3550, USA
- Justin E Miller
  - Department of Chemistry, The University of Memphis, 213 Smith Chemistry Building, Memphis, Tennessee 38152-3550, USA
- Donatus A Agbaglo
  - Department of Chemistry, The University of Memphis, 213 Smith Chemistry Building, Memphis, Tennessee 38152-3550, USA
- Qianyi Cheng
  - Department of Chemistry, The University of Memphis, 213 Smith Chemistry Building, Memphis, Tennessee 38152-3550, USA
- Nathan J DeYonker
  - Department of Chemistry, The University of Memphis, 213 Smith Chemistry Building, Memphis, Tennessee 38152-3550, USA
18
Kříž K, Schmidt L, Andersson AT, Walz MM, van der Spoel D. An Imbalance in the Force: The Need for Standardized Benchmarks for Molecular Simulation. J Chem Inf Model 2023; 63:412-431. [PMID: 36630710 PMCID: PMC9875315 DOI: 10.1021/acs.jcim.2c01127] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/06/2022] [Indexed: 01/12/2023]
Abstract
Force fields (FFs) for molecular simulation have been under development for more than half a century. As with any predictive model, rigorous testing and comparisons of models critically depends on the availability of standardized data sets and benchmarks. While such benchmarks are rather common in the fields of quantum chemistry, this is not the case for empirical FFs. That is, few benchmarks are reused to evaluate FFs, and development teams rather use their own training and test sets. Here we present an overview of currently available tests and benchmarks for computational chemistry, focusing on organic compounds, including halogens and common ions, as FFs for these are the most common ones. We argue that many of the benchmark data sets from quantum chemistry can in fact be reused for evaluating FFs, but new gas phase data is still needed for compounds containing phosphorus and sulfur in different valence states. In addition, more nonequilibrium interaction energies and forces, as well as molecular properties such as electrostatic potentials around compounds, would be beneficial. For the condensed phases there is a large body of experimental data available, and tools to utilize these data in an automated fashion are under development. If FF developers, as well as researchers in artificial intelligence, would adopt a number of these data sets, it would become easier to compare the relative strengths and weaknesses of different models and to, eventually, restore the balance in the force.
Affiliation(s)
- Kristian Kříž
  - Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
- Lisa Schmidt
  - Faculty of Biosciences, University of Heidelberg, Heidelberg 69117, Germany
- Alfred T. Andersson
  - Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
- Marie-Madeleine Walz
  - Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
- David van der Spoel
  - Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
19
Thürlemann M, Böselt L, Riniker S. Regularized by Physics: Graph Neural Network Parametrized Potentials for the Description of Intermolecular Interactions. J Chem Theory Comput 2023; 19:562-579. [PMID: 36633918 PMCID: PMC9878731 DOI: 10.1021/acs.jctc.2c00661] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/24/2022] [Indexed: 01/13/2023]
Abstract
Simulations of molecular systems using electronic structure methods are still not feasible for many systems of biological importance. As a result, empirical methods such as force fields (FF) have become an established tool for the simulation of large and complex molecular systems. The parametrization of FF is, however, time-consuming and has traditionally been based on experimental data. Recent years have therefore seen increasing efforts to automatize FF parametrization or to replace FF with machine-learning (ML) based potentials. Here, we propose an alternative strategy to parametrize FF, which makes use of ML and gradient-descent based optimization while retaining a functional form founded in physics. Using a predefined functional form is shown to enable interpretability, robustness, and efficient simulations of large systems over long time scales. To demonstrate the strength of the proposed method, a fixed-charge and a polarizable model are trained on ab initio potential-energy surfaces. Given only information about the constituting elements, the molecular topology, and reference potential energies, the models successfully learn to assign atom types and corresponding FF parameters from scratch. The resulting models and parameters are validated on a wide range of experimentally and computationally derived properties of systems including dimers, pure liquids, and molecular crystals.
Affiliation(s)
- Moritz Thürlemann
  - Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Lennard Böselt
  - Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Sereina Riniker
  - Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
20
A Cost Effective Scheme for the Highly Accurate Description of Intermolecular Binding in Large Complexes. Int J Mol Sci 2022; 23:ijms232415773. [PMID: 36555413 PMCID: PMC9780852 DOI: 10.3390/ijms232415773] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/07/2022] [Revised: 11/23/2022] [Accepted: 12/07/2022] [Indexed: 12/15/2022] Open
Abstract
There has been a growing interest in quantitative predictions of the intermolecular binding energy of large complexes. One of the most important quantum chemical techniques capable of such predictions is the domain-based local pair natural orbital (DLPNO) scheme for the coupled cluster theory with singles, doubles, and iterative triples [CCSD(T)], whose results are extrapolated to the complete basis set (CBS) limit. Here, the DLPNO-based focal-point method is devised with the aim of obtaining CBS-extrapolated values that are very close to their canonical CCSD(T)/CBS counterparts, and thus may serve for routinely checking a performance of less expensive computational methods, for example, those based on the density-functional theory (DFT). The efficacy of this method is demonstrated for several sets of noncovalent complexes with varying amounts of the electrostatics, induction, and dispersion contributions to binding (as revealed by accurate DFT-based symmetry-adapted perturbation theory (SAPT) calculations). It is shown that when applied to dimeric models of poly(3-hydroxybutyrate) chains in its two polymorphic forms, the DLPNO-CCSD(T) and DFT-SAPT computational schemes agree to within about 2 kJ/mol of an absolute value of the interaction energy. These computational schemes thus should be useful for a reliable description of factors leading to the enthalpic stabilization of extended systems.
21
Tuca E, DiLabio G, Otero-de-la-Roza A. Minimal Basis Set Hartree-Fock Corrected with Atom-Centered Potentials for Molecular Crystal Modeling and Crystal Structure Prediction. J Chem Inf Model 2022; 62:4107-4121. [PMID: 35980964 DOI: 10.1021/acs.jcim.2c00656] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Abstract
Crystal structure prediction (CSP), determining the experimentally observable structure of a molecular crystal from the molecular diagram, is an important challenge with technologically relevant applications in materials manufacturing and drug design. For the purpose of screening the randomly generated candidate crystal structures, CSP protocols require energy ranking methods that are fast and can accurately capture the small energy differences between molecular crystals. In addition, a good ranking method should also produce accurate equilibrium geometries, both intramolecular and intermolecular. In this article, we explore the combination of minimal-basis-set Hartree-Fock (HF) with atom-centered potentials (ACPs) as a method for modeling the structure and energetics of molecular crystals. The ACPs are developed for the H, C, N, and O atoms and fitted to a set of reference data at the B86bPBE-XDM level in order to mitigate basis-set incompleteness and missing correlation. In particular, ACPs are developed in combination with two methods: HF-D3/MINIs and HF-3c. The application of ACPs greatly improves the performance of HF-D3/MINIs for lattice energies, crystal energy differences, energy-volume and energy-strain relations, and crystal geometries. In the case of HF-3c, the improvement in the crystal energy differences is much smaller than in HF-D3/MINIs, but lattice energies and particularly crystal geometries are considerably better when ACPs are used. The resulting methods may be useful for CSP but also for quick calculation of molecular crystal lattice energies and geometries.
Affiliation(s)
- Emilian Tuca
  - Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia V1V 1V7, Canada
- Gino DiLabio
  - Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia V1V 1V7, Canada
- Alberto Otero-de-la-Roza
  - Departamento de Química Física y Analítica and MALTA-Consolider Team, Facultad de Química, Universidad de Oviedo, 33006 Oviedo, Spain
22
Informing geometric deep learning with electronic interactions to accelerate quantum chemistry. Proc Natl Acad Sci U S A 2022; 119:e2205221119. [PMID: 35901215 PMCID: PMC9351474 DOI: 10.1073/pnas.2205221119] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Indexed: 01/30/2023] Open
Abstract
Predicting electronic energies, densities, and related chemical properties can facilitate the discovery of novel catalysts, medicines, and battery materials. However, existing machine learning techniques are challenged by the scarcity of training data when exploring unknown chemical spaces. We overcome this barrier by systematically incorporating knowledge of molecular electronic structure into deep learning. By developing a physics-inspired equivariant neural network, we introduce a method to learn molecular representations based on the electronic interactions among atomic orbitals. Our method, OrbNet-Equi, leverages efficient tight-binding simulations and learned mappings to recover high-fidelity physical quantities. OrbNet-Equi accurately models a wide spectrum of target properties while being several orders of magnitude faster than density functional theory. Despite only using training samples collected from readily available small-molecule libraries, OrbNet-Equi outperforms traditional semiempirical and machine learning-based methods on comprehensive downstream benchmarks that encompass diverse main-group chemical processes. Our method also describes interactions in challenging charge-transfer complexes and open-shell systems. We anticipate that the strategy presented here will help to expand opportunities for studies in chemistry and materials science, where the acquisition of experimental or reference training data is costly.
23
Huang HH, Wang YS, Chao SD. A Minimum Quantum Chemistry CCSD(T)/CBS Data Set of Dimeric Interaction Energies for Small Organic Functional Groups: Heterodimers. ACS Omega 2022; 7:20059-20080. [PMID: 35722020 PMCID: PMC9201891 DOI: 10.1021/acsomega.2c01888] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/28/2022] [Accepted: 05/04/2022] [Indexed: 06/15/2023]
Abstract
We extend our previous quantum chemistry calculations of interaction energies for 31 homodimers of small organic functional groups (the SOFG-31 data set) by including 239 heterodimers with monomers selected within the SOFG-31 data set, thus resulting in the SOFG-31+239 data set. The minimum-level theoretical scheme contains (1) the basis set superposition error corrected supermolecule (BSSE-SM) approach for intermolecular interactions; (2) the second-order Møller-Plesset perturbation theory (MP2) with the Dunning's aug-cc-pVXZ (X = D, T, Q) basis sets for the geometry optimization and correlation energy calculations; and (3) the single-point energy calculations with the coupled cluster with single, double, and perturbative triple excitations method at the complete basis set limit [CCSD(T)/CBS] using the well-tested extrapolation methods for the MP2 energy calibrations. In addition, we have performed a parallel series of energy decomposition calculations based on the symmetry adapted perturbation theory (SAPT) in order to gain chemical insights. That the above procedure cannot be further reduced has been proven to be very crucial for constructing reliable data sets of interaction energies. The calculated CCSD(T)/CBS interaction energy data can serve as a benchmark for testing or training less accurate but more efficient calculation methods, such as the electronic density functional theory. As an application, we employ a segmental SAPT model previously developed for the SOFG-31 data set to predict binding energies of large heterodimer complexes. These model energy "quanta" can be used in coarse-grained molecular dynamics simulations by avoiding large-scale calculations.
Affiliation(s)
- Hsing-Hsiang Huang
  - Institute of Applied Mechanics, National Taiwan University, Taipei 10617, Taiwan R.O.C.
- Yi-Siang Wang
  - School of Chemistry & Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
- Sheng D. Chao
  - Institute of Applied Mechanics, National Taiwan University, Taipei 10617, Taiwan R.O.C.
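The three-step scheme in the abstract of Huang et al. above (counterpoise-corrected supermolecule energies, MP2 with aug-cc-pVXZ basis sets, and a CCSD(T)/CBS estimate built on extrapolated MP2 energies) can be sketched with textbook formulas. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the Boys-Bernardi counterpoise correction, the two-point inverse-cubic extrapolation of correlation energies, and an additive CCSD(T) minus MP2 correction from a smaller basis. All function names and the numbers in the demo are hypothetical.

```python
def counterpoise_interaction(e_dimer, e_mono_a_ghost, e_mono_b_ghost):
    """Counterpoise-corrected interaction energy.

    Each monomer energy is assumed to be computed in the full dimer basis
    (with ghost functions on the partner) at the dimer geometry.
    """
    return e_dimer - e_mono_a_ghost - e_mono_b_ghost


def cbs_two_point(e_corr_x, e_corr_y, x, y):
    """Two-point X^-3 extrapolation of correlation energies.

    x and y are basis-set cardinal numbers (for example 3 for aug-cc-pVTZ
    and 4 for aug-cc-pVQZ), with y > x.
    """
    return (y**3 * e_corr_y - x**3 * e_corr_x) / (y**3 - x**3)


def ccsd_t_cbs_estimate(e_hf_large, e_mp2_corr_cbs, delta_ccsd_t):
    """Composite estimate: HF (large basis) + MP2 correlation at the CBS
    limit + [CCSD(T) - MP2] correction evaluated in a smaller basis."""
    return e_hf_large + e_mp2_corr_cbs + delta_ccsd_t


if __name__ == "__main__":
    # Hypothetical interaction-energy components in kcal/mol.
    e_mp2_cbs = cbs_two_point(e_corr_x=-3.10, e_corr_y=-3.25, x=3, y=4)
    e_total = ccsd_t_cbs_estimate(e_hf_large=-1.80,
                                  e_mp2_corr_cbs=e_mp2_cbs,
                                  delta_ccsd_t=0.20)
    print(f"MP2 correlation at CBS: {e_mp2_cbs:.3f} kcal/mol")
    print(f"CCSD(T)/CBS estimate:   {e_total:.3f} kcal/mol")
```

The extrapolation is exact whenever the correlation energy follows `E_X = E_CBS + A / X**3`, which is the assumption behind the two-point formula.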
24
Prasad VK, Otero-de-la-Roza A, DiLabio GA. Small-Basis Set Density-Functional Theory Methods Corrected with Atom-Centered Potentials. J Chem Theory Comput 2022; 18:2913-2930. [PMID: 35412817 DOI: 10.1021/acs.jctc.2c00036] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/12/2022]
Abstract
Density functional theory (DFT) is currently the most popular method for modeling noncovalent interactions and thermochemistry. The accurate calculation of noncovalent interaction energies, reaction energies, and barrier heights requires choosing an appropriate functional and, typically, a relatively large basis set. Deficiencies of the density-functional approximation and the use of a limited basis set are the leading sources of error in the calculation of noncovalent and thermochemical properties in molecular systems. In this article, we present three new DFT methods based on the BLYP, M06-2X, and CAM-B3LYP functionals in combination with the 6-31G* basis set and corrected with atom-centered potentials (ACPs). ACPs are one-electron potentials that have the same form as effective-core potentials, except they do not replace any electrons. The ACPs developed in this work are used to generate energy corrections to the underlying DFT/basis-set method such that the errors in predicted chemical properties are minimized while maintaining the low computational cost of the parent methods. ACPs were developed for the elements H, B, C, N, O, F, Si, P, S, and Cl. The ACP parameters were determined using an extensive training set of 118655 data points, mostly of complete basis set coupled-cluster level quality. The target molecular properties for the ACP-corrected methods include noncovalent interaction energies, molecular conformational energies, reaction energies, barrier heights, and bond separation energies. The ACPs were tested first on the training set and then on a validation set of 42567 additional data points. We show that the ACP-corrected methods can predict the target molecular properties with accuracy close to complete basis set wavefunction theory methods, but at a computational cost of double-ζ DFT methods. 
This makes the new BLYP/6-31G*-ACP, M06-2X/6-31G*-ACP, and CAM-B3LYP/6-31G*-ACP methods uniquely suited to the calculation of noncovalent, thermochemical, and kinetic properties in large molecular systems.
Affiliation(s)
- Viki Kumar Prasad
  - Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia V1V 1V7, Canada
- Alberto Otero-de-la-Roza
  - Departamento de Química Física y Analítica, Facultad de Química, Universidad de Oviedo, MALTA Consolider Team, Oviedo E-33006, Spain
- Gino A DiLabio
  - Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia V1V 1V7, Canada
25
Prasad VK, Otero-de-la-Roza A, DiLabio GA. Fast and Accurate Quantum Mechanical Modeling of Large Molecular Systems Using Small Basis Set Hartree-Fock Methods Corrected with Atom-Centered Potentials. J Chem Theory Comput 2022; 18:2208-2232. [PMID: 35313106 DOI: 10.1021/acs.jctc.1c01128] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 12/19/2022]
Abstract
There has been significant interest in developing fast and accurate quantum mechanical methods for modeling large molecular systems. In this work, by utilizing a machine learning regression technique, we have developed new low-cost quantum mechanical approaches to model large molecular systems. The developed approaches rely on using one-electron Gaussian-type functions called atom-centered potentials (ACPs) to correct for the basis set incompleteness and the lack of correlation effects in the underlying minimal or small basis set Hartree-Fock (HF) methods. In particular, ACPs are proposed for ten elements common in organic and bioorganic chemistry (H, B, C, N, O, F, Si, P, S, and Cl) and four different base methods: two minimal basis sets (MINIs and MINIX) plus a double-ζ basis set (6-31G*) in combination with dispersion-corrected HF (HF-D3/MINIs, HF-D3/MINIX, HF-D3/6-31G*) and the HF-3c method. The new ACPs are trained on a very large set (73 832 data points) of noncovalent properties (interaction and conformational energies) and validated additionally on a set of 32 048 data points. All reference data are of complete basis set coupled-cluster quality, mostly CCSD(T)/CBS. The proposed ACP-corrected methods are shown to give errors in the tenths of a kcal/mol range for noncovalent interaction energies and up to 2 kcal/mol for molecular conformational energies. More importantly, the average errors are similar in the training and validation sets, confirming the robustness and applicability of these methods outside the boundaries of the training set. In addition, the performance of the new ACP-corrected methods is similar to complete basis set density functional theory (DFT) but at a cost that is orders of magnitude lower, and the proposed ACPs can be used in any computational chemistry program that supports effective-core potentials without modification. 
It is also shown that ACPs improve the description of covalent and noncovalent bond geometries of the underlying methods and that the improvement brought about by the application of the ACPs is directly related to the number of atoms to which they are applied, allowing the treatment of systems containing some atoms for which ACPs are not available. Overall, the ACP-corrected methods proposed in this work constitute an alternative accurate, economical, and reliable quantum mechanical approach to describe the geometries, interaction energies, and conformational energies of systems with hundreds to thousands of atoms.
Affiliation(s)
- Viki Kumar Prasad
  - Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia, Canada V1V 1V7
- Alberto Otero-de-la-Roza
  - MALTA Consolider Team, Departamento de Química Física y Analítica, Facultad de Química, Universidad de Oviedo, E-33006 Oviedo, Spain
- Gino A DiLabio
  - Department of Chemistry, University of British Columbia, Okanagan, 3247 University Way, Kelowna, British Columbia, Canada V1V 1V7
|
26
|
Beran GJO, Wright SE, Greenwell C, Cruz-Cabeza AJ. The interplay of intra- and intermolecular errors in modeling conformational polymorphs. J Chem Phys 2022; 156:104112. [DOI: 10.1063/5.0088027] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 01/26/2023]
Abstract
Conformational polymorphs of organic molecular crystals represent a challenging test for quantum chemistry because they require careful balancing of the intra- and intermolecular interactions. This study examines 54 molecular conformations from 20 sets of conformational polymorphs, along with the relative lattice energies and 173 dimer interactions taken from six of the polymorph sets. These systems are studied with a variety of van der Waals-inclusive density functional theory models; dispersion-corrected spin-component-scaled second-order Møller–Plesset perturbation theory (SCS-MP2D); and domain local pair natural orbital coupled cluster singles, doubles, and perturbative triples [DLPNO-CCSD(T)]. We investigate how delocalization error in conventional density functionals impacts monomer conformational energies, systematic errors in the intermolecular interactions, and the nature of error cancellation that occurs in the overall crystal. The density functionals B86bPBE-XDM, PBE-D4, PBE-MBD, PBE0-D4, and PBE0-MBD are found to exhibit sizable one-body and two-body errors vs DLPNO-CCSD(T) benchmarks, and the level of success in predicting the relative polymorph energies relies heavily on error cancellation between different types of intermolecular interactions or between intra- and intermolecular interactions. The SCS-MP2D and, to a lesser extent, ωB97M-V models exhibit smaller errors and rely less on error cancellation. Implications for crystal structure prediction of flexible compounds are discussed. Finally, the one-body and two-body DLPNO-CCSD(T) energies taken from these conformational polymorphs establish the CP1b and CP2b benchmark datasets that could be useful for testing quantum chemistry models in challenging real-world systems with complex interplay between intra- and intermolecular interactions, a number of which are significantly impacted by delocalization error.
Affiliation(s)
- Gregory J. O. Beran
  - Department of Chemistry, University of California, Riverside, California 92521, USA
- Sarah E. Wright
  - Department of Chemical Engineering and Analytical Science, University of Manchester, Manchester, United Kingdom
- Chandler Greenwell
  - Department of Chemistry, University of California, Riverside, California 92521, USA
- Aurora J. Cruz-Cabeza
  - Department of Chemical Engineering and Analytical Science, University of Manchester, Manchester, United Kingdom
|
27
|
Gokcan H, Isayev O. Learning molecular potentials with neural networks. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1564] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/09/2022]
Affiliation(s)
- Hatice Gokcan
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
- Olexandr Isayev
  - Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, USA
|
28
|
Fabregat R, Fabrizio A, Engel EA, Meyer B, Juraskova V, Ceriotti M, Corminboeuf C. Local Kernel Regression and Neural Network Approaches to the Conformational Landscapes of Oligopeptides. J Chem Theory Comput 2022; 18:1467-1479. [PMID: 35179897 PMCID: PMC8908737 DOI: 10.1021/acs.jctc.1c00813] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/30/2022]
Abstract
The application of machine learning to theoretical chemistry has made it possible to combine the accuracy of quantum chemical energetics with the thorough sampling of finite-temperature fluctuations. To reach this goal, a diverse set of methods has been proposed, ranging from simple linear models to kernel regression and highly nonlinear neural networks. Here we apply two widely different approaches to the same, challenging problem: the sampling of the conformational landscape of polypeptides at finite temperature. We develop a local kernel regression (LKR) coupled with a supervised sparsity method and compare it with a more established approach based on Behler-Parrinello type neural networks. In the context of the LKR, we discuss how the supervised selection of the reference pool of environments is crucial to achieve accurate potential energy surfaces at a competitive computational cost and leverage the locality of the model to infer which chemical environments are poorly described by the DFTB baseline. We then discuss the relative merits of the two frameworks and perform Hamiltonian-reservoir replica-exchange Monte Carlo sampling and metadynamics simulations, respectively, to demonstrate that both frameworks can achieve converged and transferable sampling of the conformational landscape of complex and flexible biomolecules with comparable accuracy and computational cost.
Affiliation(s)
- Edgar A Engel
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Michele Ceriotti
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
|
29
|
Low K, Coote ML, Izgorodina EI. Inclusion of More Physics Leads to Less Data: Learning the Interaction Energy as a Function of Electron Deformation Density with Limited Training Data. J Chem Theory Comput 2022; 18:1607-1618. [PMID: 35175045 DOI: 10.1021/acs.jctc.1c01264] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/06/2023]
Abstract
Machine learning (ML) approaches to predicting quantum mechanical (QM) properties have made great strides toward achieving the computational chemist's holy grail of structure-based property prediction. In contrast to direct ML methods, which encode a molecule with only structural information, in this work, we show that QM descriptors improve ML predictions of dimer interaction energy, both in terms of accuracy and data efficiency, by incorporating electronic information into the descriptor. We present the electron deformation density interaction energy machine learning (EDDIE-ML) model, which predicts the interaction energy as a function of Hartree-Fock electron deformation density. We compare its performance with leading direct ML schemes and modern DFT methods for the prediction of interaction energies for dimers of varying charge type, size, and intermolecular separation. Under a low-data regime, EDDIE-ML outperforms other direct ML schemes and is the only model readily transferrable to larger, more complex systems including base pair trimers and porous cages. The underlying physical connection between the density and interaction energy enables EDDIE-ML to reach an accuracy comparable to modern DFT functionals in fewer training data points compared to other ML methods.
Affiliation(s)
- Kaycee Low
  - Monash Computational Chemistry Group, School of Chemistry, Monash University, Clayton, Victoria 3800, Australia
- Michelle L Coote
  - Research School of Chemistry, Australian National University, Canberra, Australian Capital Territory 0200, Australia
- Ekaterina I Izgorodina
  - Monash Computational Chemistry Group, School of Chemistry, Monash University, Clayton, Victoria 3800, Australia
|
30
|
Beran GJO, Greenwell C, Rezac J. Spin-component-scaled and dispersion-corrected second-order Møller-Plesset perturbation theory: A path toward chemical accuracy. Phys Chem Chem Phys 2022; 24:3695-3712. [DOI: 10.1039/d1cp04922d] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 11/21/2022]
Abstract
Second-order Møller-Plesset perturbation theory (MP2) provides a valuable alternative to density functional theory for modeling problems in organic and biological chemistry. However, MP2 suffers from known limitations in the description...
|
31
|
Lobanov MY, Pereyaslavets LB, Likhachev IV, Matkarimov BT, Galzitskaya OV. Is there an advantageous arrangement of aromatic residues in proteins? Statistical analysis of aromatic interactions in globular proteins. Comput Struct Biotechnol J 2021; 19:5960-5968. [PMID: 34849200 PMCID: PMC8604681 DOI: 10.1016/j.csbj.2021.10.036] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/09/2021] [Revised: 10/11/2021] [Accepted: 10/28/2021] [Indexed: 11/18/2022]
Abstract
The aim of this study was to evaluate the favorability of different conformations of aromatic residues in proteins by analysing the occurrence of particular conformations. The clustering of protein structures from the Protein Data Bank (PDB) was performed. Conformations of interacting aromatic residues were analyzed for 511 282 pairs in 35 493 protein structures sharing less than 50% identity. Pairs with a parallel arrangement of aromatic residues made up 6.2% of all possible ones, which was twice as much as expected. Pairs with a perpendicular arrangement of aromatic residues made up 25%. We demonstrate that the most favorable arrangement was at an angle of 60° between the interacting aromatic residues. Among all possible aromatic pairs, the His-His pair was twice as frequent as expected, and the His-Phe pair was less frequent than expected. A server (CARP - Contacts of Aromatic Residues in Proteins) has been created for calculating essential structural features of interacting aromatic residues in proteins: http://bioproteom.protres.ru/arom_q_prog/.
Affiliation(s)
- Mikhail Yu. Lobanov
  - Institute of Protein Research, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
- Leonid B. Pereyaslavets
  - Institute of Protein Research, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
- Ilya V. Likhachev
  - Institute of Protein Research, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
  - Institute of Mathematical Problems of Biology, Russian Academy of Sciences, Keldysh Institute of Applied Mathematics, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
- Oxana V. Galzitskaya
  - Institute of Protein Research, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
  - Institute of Theoretical and Experimental Biophysics, Russian Academy of Sciences, 142290 Pushchino, Moscow Region, Russia
  - Corresponding author at: Laboratory of Bioinformatics and Proteomics, Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow Region, Russia.
|
32
|
Christensen AS, Sirumalla SK, Qiao Z, O'Connor MB, Smith DGA, Ding F, Bygrave PJ, Anandkumar A, Welborn M, Manby FR, Miller TF. OrbNet Denali: A machine learning potential for biological and organic chemistry with semi-empirical cost and DFT accuracy. J Chem Phys 2021; 155:204103. [PMID: 34852495 DOI: 10.1063/5.0061990] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Indexed: 12/19/2022]
Abstract
We present OrbNet Denali, a machine learning model for an electronic structure that is designed as a drop-in replacement for ground-state density functional theory (DFT) energy calculations. The model is a message-passing graph neural network that uses symmetry-adapted atomic orbital features from a low-cost quantum calculation to predict the energy of a molecule. OrbNet Denali is trained on a vast dataset of 2.3 × 10⁶ DFT calculations on molecules and geometries. This dataset covers the most common elements in biochemistry and organic chemistry (H, Li, B, C, N, O, F, Na, Mg, Si, P, S, Cl, K, Ca, Br, and I) and charged molecules. OrbNet Denali is demonstrated on several well-established benchmark datasets, and we find that it provides accuracy that is on par with modern DFT methods while offering a speedup of up to three orders of magnitude. For the GMTKN55 benchmark set, OrbNet Denali achieves WTMAD-1 and WTMAD-2 scores of 7.19 and 9.84, on par with modern DFT functionals. For several GMTKN55 subsets, which contain chemical problems that are not present in the training set, OrbNet Denali produces a mean absolute error comparable to those of DFT methods. For the Hutchison conformer benchmark set, OrbNet Denali has a median correlation coefficient of R² = 0.90 compared to the reference DLPNO-CCSD(T) calculation and R² = 0.97 compared to the method used to generate the training data (ωB97X-D3/def2-TZVP), exceeding the performance of any other method with a similar cost. Similarly, the model reaches chemical accuracy for non-covalent interactions in the S66x10 dataset. For torsional profiles, OrbNet Denali reproduces the torsion profiles of ωB97X-D3/def2-TZVP with an average mean absolute error of 0.12 kcal/mol for the potential energy surfaces of the diverse fragments in the TorsionNet500 dataset.
Affiliation(s)
- Zhuoran Qiao
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
- Feizhi Ding
  - Entos, Inc., Los Angeles, California 90027, USA
- Animashree Anandkumar
  - Division of Engineering and Applied Sciences, California Institute of Technology, Pasadena, California 91125, USA
|
33
|
Sparrow ZM, Ernst BG, Joo PT, Lao KU, DiStasio RA. NENCI-2021. I. A large benchmark database of non-equilibrium non-covalent interactions emphasizing close intermolecular contacts. J Chem Phys 2021; 155:184303. [PMID: 34773949 DOI: 10.1063/5.0068862] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 11/14/2022]
Abstract
In this work, we present NENCI-2021, a benchmark database of ∼8000 Non-Equilibrium Non-Covalent Interaction energies for a large and diverse selection of intermolecular complexes of biological and chemical relevance. To meet the growing demand for large and high-quality quantum mechanical data in the chemical sciences, NENCI-2021 starts with the 101 molecular dimers in the widely used S66 and S101 databases and extends the scope of these works by (i) including 40 cation-π and anion-π complexes, a fundamentally important class of non-covalent interactions that are found throughout nature and pose a substantial challenge to theory, and (ii) systematically sampling all 141 intermolecular potential energy surfaces (PESs) by simultaneously varying the intermolecular distance and intermolecular angle in each dimer. Designed with an emphasis on close contacts, the complexes in NENCI-2021 were generated by sampling seven intermolecular distances along each PES (ranging from 0.7× to 1.1× the equilibrium separation) and nine intermolecular angles per distance (five for each ion-π complex), yielding an extensive database of 7763 benchmark intermolecular interaction energies (E_int) obtained at the coupled-cluster with singles, doubles, and perturbative triples/complete basis set [CCSD(T)/CBS] level of theory. The E_int values in NENCI-2021 span a total of 225.3 kcal/mol, ranging from -38.5 to +186.8 kcal/mol, with a mean (median) E_int value of -1.06 kcal/mol (-2.39 kcal/mol). In addition, a wide range of intermolecular atom-pair distances are also present in NENCI-2021, where close intermolecular contacts involving atoms that are located within the so-called van der Waals envelope are prevalent; these interactions, in particular, pose an enormous challenge for molecular modeling and are observed in many important chemical and biological systems.
A detailed symmetry-adapted perturbation theory (SAPT)-based energy decomposition analysis also confirms the diverse and comprehensive nature of the intermolecular binding motifs present in NENCI-2021, which now includes a significant number of primarily induction-bound dimers (e.g., cation-π complexes). NENCI-2021 thus spans all regions of the SAPT ternary diagram, thereby warranting a new four-category classification scheme that includes complexes primarily bound by electrostatics (3499), induction (700), dispersion (1372), or mixtures thereof (2192). A critical error analysis performed on a representative set of intermolecular complexes in NENCI-2021 demonstrates that the E_int values provided herein have an average error of ±0.1 kcal/mol, even for complexes with strongly repulsive E_int values, and maximum errors of ±0.2-0.3 kcal/mol (i.e., ∼±1.0 kJ/mol) for the most challenging cases. For these reasons, we expect that NENCI-2021 will play an important role in the testing, training, and development of next-generation classical and polarizable force fields, density functional theory approximations, wavefunction theory methods, and machine learning based intra- and inter-molecular potentials.
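The database size quoted in this abstract follows directly from the stated sampling design, and the four SAPT categories add up to the same total. The short Python check below is only an illustrative sketch: the variable and function names are hypothetical, and the only inputs are the counts given in the abstract.

```python
# Counts taken from the abstract; all names here are illustrative, not from the paper.
NEUTRAL_DIMERS = 101      # dimers from the S66 and S101 databases
ION_PI_COMPLEXES = 40     # added cation-pi and anion-pi complexes
DISTANCES_PER_PES = 7     # distances sampled along each PES
ANGLES_NEUTRAL = 9        # angles per distance for the 101 dimers
ANGLES_ION_PI = 5         # angles per distance for each ion-pi complex


def count_geometries(n_complexes: int, n_distances: int, n_angles: int) -> int:
    """Number of sampled geometries on a regular distance x angle grid."""
    return n_complexes * n_distances * n_angles


neutral_points = count_geometries(NEUTRAL_DIMERS, DISTANCES_PER_PES, ANGLES_NEUTRAL)
ion_pi_points = count_geometries(ION_PI_COMPLEXES, DISTANCES_PER_PES, ANGLES_ION_PI)
total_points = neutral_points + ion_pi_points

# Four-category SAPT classification counts quoted in the abstract.
sapt_categories = {
    "electrostatics": 3499,
    "induction": 700,
    "dispersion": 1372,
    "mixed": 2192,
}

print(neutral_points, ion_pi_points, total_points)  # 6363 1400 7763
print(sum(sapt_categories.values()))                # 7763
```

Both routes give 7763, matching the number of benchmark interaction energies reported for the database.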
Affiliation(s)
- Zachary M Sparrow
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, USA
- Brian G Ernst
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, USA
- Paul T Joo
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, USA
- Ka Un Lao
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, USA
- Robert A DiStasio
  - Department of Chemistry and Chemical Biology, Cornell University, Ithaca, New York 14853, USA
|
34
|
Loipersberger M, Bertels LW, Lee J, Head-Gordon M. Exploring the Limits of Second- and Third-Order Møller-Plesset Perturbation Theories for Noncovalent Interactions: Revisiting MP2.5 and Assessing the Importance of Regularization and Reference Orbitals. J Chem Theory Comput 2021; 17:5582-5599. [PMID: 34382394 PMCID: PMC9948597 DOI: 10.1021/acs.jctc.1c00469] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 02/04/2023]
Abstract
This work systematically assesses the influence of reference orbitals, regularization, and scaling on the performance of second- and third-order Møller-Plesset perturbation theory wave function methods for noncovalent interactions (NCIs). Testing on 19 data sets (A24, DS14, HB15, HSG, S22, X40, HW30, NC15, S66, AlkBind12, CO2Nitrogen16, HB49, Ionic43, TA13, XB18, Bauza30, CT20, XB51, and Orel26rad) covers a wide range of different NCIs including hydrogen bonding, dispersion, and halogen bonding. Inclusion of potential energy surfaces from different hydrogen bonds and dispersion-bound complexes gauges accuracy for nonequilibrium geometries. Fifteen methods are tested. In notation where nonstandard choices of orbitals are denoted as methods:orbitals, these are MP2, κ-MP2, SCS-MP2, OOMP2, κ-OOMP2, MP3, MP2.5, MP3:OOMP2, MP2.5:OOMP2, MP3:κ-OOMP2, MP2.5:κ-OOMP2, κ-MP3:κ-OOMP2, κ-MP2.5:κ-OOMP2, MP3:ωB97X-V, and MP2.5:ωB97X-V. Furthermore, we compare these methods to the ωB97M-V and B3LYP-D3 density functionals, as well as CCSD. We find that the κ-regularization (κ = 1.45 au was used throughout) improves the energetics in almost all data sets for both MP2 (in 17 out of 19 data sets) and OOMP2 (16 out of 19). The improvement is significant (e.g., the root-mean-square deviation (RMSD) for the S66 data set is 0.29 kcal/mol for κ-OOMP2 versus 0.67 kcal/mol for MP2) and for interactions between stable closed-shell molecules, not strongly dependent on the reference orbitals. Scaled MP3 (with a factor of 0.5) using κ-OOMP2 reference orbitals (MP2.5:κ-OOMP2) provides significantly more accurate results for NCIs across all data sets with noniterative O(N⁶) scaling (S66 data set RMSD: 0.10 kcal/mol). Across the entire data set of 356 points, the improvement over standard MP2.5 is approximately a factor of 2: RMSD for MP3:κ-OOMP2 is 0.25 vs 0.50 kcal/mol for MP2.5. 
The use of high-quality density functional reference orbitals (ωB97X-V) also significantly improves the results of MP2.5 for NCI over a Hartree-Fock orbital reference. All our assessments and conclusions are based on the use of the medium-sized aug-cc-pVTZ basis to yield results that are directly compared against complete basis set limit reference values.
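The "scaled MP3 (with a factor of 0.5)" in this abstract refers to adding half of the third-order correction to the MP2 energy, and the methods are compared through root-mean-square deviations. The Python sketch below illustrates both ideas under stated assumptions: the function names are hypothetical, the energy components are made-up placeholder numbers rather than data from the paper, and no regularization or orbital optimization is modeled.

```python
import math
from typing import Sequence


def scaled_mp3_energy(e_ref: float, e2: float, e3: float, c3: float = 0.5) -> float:
    """Reference energy plus E(2) plus a scaled E(3).

    c3 = 0.0 gives MP2, c3 = 1.0 gives MP3, c3 = 0.5 gives MP2.5.
    """
    return e_ref + e2 + c3 * e3


def rmsd(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Root-mean-square deviation between two equally long sequences."""
    if len(predicted) != len(reference):
        raise ValueError("sequences must have the same length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Placeholder energy components in hartree (hypothetical, for illustration only).
e_ref, e2, e3 = -76.0, -0.30, -0.01

mp2 = scaled_mp3_energy(e_ref, e2, e3, c3=0.0)
mp25 = scaled_mp3_energy(e_ref, e2, e3)
mp3 = scaled_mp3_energy(e_ref, e2, e3, c3=1.0)

# MP2.5 is the arithmetic mean of the MP2 and MP3 energies.
print(math.isclose(mp25, 0.5 * (mp2 + mp3)))
```

In a benchmark of the kind described, `rmsd` would be applied to computed versus reference interaction energies for each data set.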
Affiliation(s)
- Luke W. Bertels
  - Department of Chemistry, University of California, Berkeley, California 94720, USA
  - Present Address: Department of Chemistry, Virginia Tech, Blacksburg, VA 24061, USA
- Joonho Lee
  - Department of Chemistry, University of California, Berkeley, California 94720, USA
  - Present Address: Department of Chemistry, Columbia University, NY
- Martin Head-Gordon
  - Department of Chemistry, University of California, Berkeley, California 94720, USA
  - Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
|
35
|
Musil F, Grisafi A, Bartók AP, Ortner C, Csányi G, Ceriotti M. Physics-Inspired Structural Representations for Molecules and Materials. Chem Rev 2021; 121:9759-9815. [PMID: 34310133 DOI: 10.1021/acs.chemrev.1c00021] [Citation(s) in RCA: 135] [Impact Index Per Article: 45.0] [Indexed: 12/11/2022]
Abstract
The first step in the construction of a regression model or a data-driven analysis, aiming to predict or elucidate the relationship between the atomic-scale structure of matter and its properties, involves transforming the Cartesian coordinates of the atoms into a suitable representation. The development of atomic-scale representations has played, and continues to play, a central role in the success of machine-learning methods for chemistry and materials science. This review summarizes the current understanding of the nature and characteristics of the most commonly used structural and chemical descriptions of atomistic structures, highlighting the deep underlying connections between different frameworks and the ideas that lead to computationally efficient and universally applicable models. It emphasizes the link between properties, structures, their physical chemistry, and their mathematical description, provides examples of recent applications to a diverse set of chemical and materials science problems, and outlines the open questions and the most promising research directions in the field.
Affiliation(s)
- Felix Musil
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Andrea Grisafi
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Albert P Bartók
  - Department of Physics and Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry CV4 7AL, United Kingdom
- Christoph Ortner
  - University of British Columbia, Vancouver, British Columbia V6T 1Z2, Canada
- Gábor Csányi
  - Engineering Laboratory, University of Cambridge, Trumpington Street, Cambridge CB2 1PZ, United Kingdom
- Michele Ceriotti
  - Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
  - National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
|
36
|
Briling KR, Fabrizio A, Corminboeuf C. Impact of quantum-chemical metrics on the machine learning prediction of electron density. J Chem Phys 2021; 155:024107. [PMID: 34266253 DOI: 10.1063/5.0055393] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/14/2022]
Abstract
Machine learning (ML) algorithms have undergone an explosive development impacting every aspect of computational chemistry. To obtain reliable predictions, one needs to maintain a proper balance between the black-box nature of ML frameworks and the physics of the target properties. One of the most appealing quantum-chemical properties for regression models is the electron density, and some of us recently proposed a transferable and scalable model based on the decomposition of the density onto an atom-centered basis set. The decomposition, as well as the training of the model, is at its core a minimization of some loss function, which can be arbitrarily chosen and may lead to results of different quality. Well-studied in the context of density fitting (DF), the impact of the metric on the performance of ML models has not been analyzed yet. In this work, we compare predictions obtained using the overlap and the Coulomb-repulsion metrics for both decomposition and training. As expected, the Coulomb metric used as both the DF and ML loss functions leads to the best results for the electrostatic potential and dipole moments. The origin of this difference lies in the fact that the model is not constrained to predict densities that integrate to the exact number of electrons N. Since an a posteriori correction for the number of electrons decreases the errors, we proposed a modification of the model, where N is included directly into the kernel function, which allowed lowering of the errors on the test and out-of-sample sets.
Affiliation(s)
- Ksenia R Briling
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Alberto Fabrizio
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Clemence Corminboeuf
  - Laboratory for Computational Molecular Design, Institute of Chemical Sciences and Engineering, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
|
37
|
Schriber JB, Sirianni DA, Smith DGA, Burns LA, Sitkoff D, Cheney DL, Sherrill CD. Optimized damping parameters for empirical dispersion corrections to symmetry-adapted perturbation theory. J Chem Phys 2021; 154:234107. [PMID: 34241276 DOI: 10.1063/5.0049745] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/14/2022]
Abstract
Symmetry-adapted perturbation theory (SAPT) has become an invaluable tool for studying the fundamental nature of non-covalent interactions by directly computing the electrostatics, exchange (steric) repulsion, induction (polarization), and London dispersion contributions to the interaction energy using quantum mechanics. Further application of SAPT is primarily limited by its computational expense, where even its most affordable variant (SAPT0) scales as the fifth power of system size [O(N⁵)] due to the dispersion terms. The algorithmic scaling of SAPT0 is reduced from O(N⁵) to O(N⁴) by replacing these terms with the empirical D3 dispersion correction of Grimme and co-workers, forming a method that may be termed SAPT0-D3. Here, we optimize the damping parameters for the -D3 terms in SAPT0-D3 using a much larger training set than has previously been considered, namely, 8299 interaction energies computed at the complete-basis-set limit of coupled cluster through perturbative triples [CCSD(T)/CBS]. Perhaps surprisingly, with only three fitted parameters, SAPT0-D3 improves on the accuracy of SAPT0, reducing mean absolute errors from 0.61 to 0.49 kcal mol⁻¹ over the full set of complexes. Additionally, SAPT0-D3 exhibits a nearly 2.5× speedup over conventional SAPT0 for systems with ∼300 atoms and is applied here to systems with up to 459 atoms. Finally, we have also implemented a functional group partitioning of the approach (F-SAPT0-D3) and applied it to determine important contacts in the binding of salbutamol to G-protein coupled β1-adrenergic receptor in both active and inactive forms. SAPT0-D3 capabilities have been added to the open-source Psi4 software.
Affiliation(s)
- Jeffrey B Schriber
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
- Dominic A Sirianni
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
- Daniel G A Smith
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
- Lori A Burns
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
- Doree Sitkoff
  - Molecular Structure and Design, Bristol-Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
- Daniel L Cheney
  - Molecular Structure and Design, Bristol-Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
- C David Sherrill
  - Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
|
38
|
Schriber JB, Nascimento DR, Koutsoukas A, Spronk SA, Cheney DL, Sherrill CD. CLIFF: A component-based, machine-learned, intermolecular force field. J Chem Phys 2021; 154:184110. [PMID: 34241025 DOI: 10.1063/5.0042989] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 12/16/2022]
Abstract
Computation of intermolecular interactions is a challenge in drug discovery because accurate ab initio techniques are too computationally expensive to be routinely applied to drug-protein models. Classical force fields are more computationally feasible, and force fields designed to match symmetry adapted perturbation theory (SAPT) interaction energies can remain accurate in this context. Unfortunately, the application of such force fields is complicated by the laborious parameterization required for computations on new molecules. Here, we introduce the component-based machine-learned intermolecular force field (CLIFF), which combines accurate, physics-based equations for intermolecular interaction energies with machine-learning models to enable automatic parameterization. The CLIFF uses functional forms corresponding to electrostatic, exchange-repulsion, induction/polarization, and London dispersion components in SAPT. Molecule-independent parameters are fit with respect to SAPT2+(3)δMP2/aug-cc-pVTZ, and molecule-dependent atomic parameters (atomic widths, atomic multipoles, and Hirshfeld ratios) are obtained from machine learning models developed for C, N, O, H, S, F, Cl, and Br. The CLIFF achieves mean absolute errors (MAEs) no worse than 0.70 kcal mol⁻¹ in both total and component energies across a diverse dimer test set. For the side chain-side chain interaction database derived from protein fragments, the CLIFF produces total interaction energies with an MAE of 0.27 kcal mol⁻¹ with respect to reference data, outperforming similar and even more expensive methods. In applications to a set of model drug-protein interactions, the CLIFF is able to accurately rank-order ligand binding strengths and achieves less than 10% error with respect to SAPT reference values for most complexes.
Affiliation(s)
- Jeffrey B Schriber
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30318, USA
| | - Daniel R Nascimento
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30318, USA
| | - Alexios Koutsoukas
- Molecular Structure and Design, Bristol Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
| | - Steven A Spronk
- Molecular Structure and Design, Bristol Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
| | - Daniel L Cheney
- Molecular Structure and Design, Bristol Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
| | - C David Sherrill
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30318, USA
|
39
|
Kodrycka M, Patkowski K. Efficient Density-Fitted Explicitly Correlated Dispersion and Exchange Dispersion Energies. J Chem Theory Comput 2021; 17:1435-1456. [PMID: 33606539 DOI: 10.1021/acs.jctc.0c01158] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Abstract
The leading-order dispersion and exchange-dispersion terms in symmetry-adapted perturbation theory (SAPT), E_disp^(20) and E_exch-disp^(20), suffer from slow convergence to the complete basis set limit. To alleviate this problem, explicitly correlated variants of these corrections, E_disp^(20)-F12 and E_exch-disp^(20)-F12, have been proposed recently. However, the original formalism (Kodrycka, M.; et al. J. Chem. Theory Comput. 2019, 15, 5965-5986), while highly successful in terms of improving convergence, was not competitive with conventional orbital-based SAPT in terms of computational efficiency due to the need to manipulate several kinds of two-electron integrals. In this work, we eliminate this need by decomposing all types of two-electron integrals using robust density fitting. We demonstrate that the error of the density fitting approximation is negligible when standard auxiliary bases such as aug-cc-pVXZ/MP2FIT are employed. The new implementation allowed us to study all complexes in the A24 database in basis sets up to aug-cc-pV5Z, and the E_disp^(20)-F12 and E_exch-disp^(20)-F12 values exhibit vastly improved basis set convergence over their conventional counterparts. The well-converged E_disp^(20)-F12 and E_exch-disp^(20)-F12 numbers can be substituted for conventional E_disp^(20) and E_exch-disp^(20) ones in a calculation of the total SAPT interaction energy at any level (SAPT0, SAPT2+3, ...). We show that the addition of F12 terms does not improve the accuracy of low-level SAPT treatments. However, when the theory errors are minimized in high-level SAPT approaches such as SAPT2+3(CCD)δMP2, the reduction of basis set incompleteness errors thanks to the F12 treatment substantially improves the accuracy of small-basis calculations.
Affiliation(s)
- Monika Kodrycka
- Department of Chemistry and Biochemistry, Auburn University, Auburn, Alabama 36849, United States
| | - Konrad Patkowski
- Department of Chemistry and Biochemistry, Auburn University, Auburn, Alabama 36849, United States
|
40
|
Imbalzano G, Zhuang Y, Kapil V, Rossi K, Engel EA, Grasselli F, Ceriotti M. Uncertainty estimation for molecular dynamics and sampling. J Chem Phys 2021; 154:074102. [DOI: 10.1063/5.0036522] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 11/14/2022] Open
Affiliation(s)
- Giulio Imbalzano
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Yongbin Zhuang
- State Key Laboratory of Physical Chemistry of Solid Surfaces, Collaborative Innovation Center of Chemistry for Energy Materials, Xiamen University, Xiamen 361005, China
| | - Venkat Kapil
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
| | - Kevin Rossi
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Laboratory of Nanochemistry for Energy, ISIC, École Polytechnique Fédérale de Lausanne, 1950 Sion, Switzerland
| | - Edgar A. Engel
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Federico Grasselli
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Michele Ceriotti
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
|
41
|
Husch T, Sun J, Cheng L, Lee SJR, Miller TF. Improved accuracy and transferability of molecular-orbital-based machine learning: Organics, transition-metal complexes, non-covalent interactions, and transition states. J Chem Phys 2021; 154:064108. [DOI: 10.1063/5.0032362] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 02/02/2023] Open
Affiliation(s)
- Tamara Husch
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
| | - Jiace Sun
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
| | - Lixue Cheng
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
| | - Sebastian J. R. Lee
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
| | - Thomas F. Miller
- Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, California 91125, USA
|
42
|
Donchev AG, Taube AG, Decolvenaere E, Hargus C, McGibbon RT, Law KH, Gregersen BA, Li JL, Palmo K, Siva K, Bergdorf M, Klepeis JL, Shaw DE. Quantum chemical benchmark databases of gold-standard dimer interaction energies. Sci Data 2021; 8:55. [PMID: 33568655 PMCID: PMC7876112 DOI: 10.1038/s41597-021-00833-x] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 07/23/2020] [Accepted: 12/14/2020] [Indexed: 12/11/2022] Open
Abstract
Advances in computational chemistry create an ongoing need for larger and higher-quality datasets that characterize noncovalent molecular interactions. We present three benchmark collections of quantum mechanical data, covering approximately 3,700 distinct types of interacting molecule pairs. The first collection, which we refer to as DES370K, contains interaction energies for more than 370,000 dimer geometries. These were computed using the coupled-cluster method with single, double, and perturbative triple excitations [CCSD(T)], which is widely regarded as the gold-standard method in electronic structure theory. Our second benchmark collection, a core representative subset of DES370K called DES15K, is intended for more computationally demanding applications of the data. Finally, DES5M, our third collection, comprises interaction energies for nearly 5,000,000 dimer geometries; these were calculated using SNS-MP2, a machine learning approach that provides results with accuracy comparable to that of our coupled-cluster training data. These datasets may prove useful in the development of density functionals, empirically corrected wavefunction-based approaches, semi-empirical methods, force fields, and models trained using machine learning methods.
Affiliation(s)
| | | | | | - Cory Hargus
- D. E. Shaw Research, New York, NY, 10036, USA
| | | | - Ka-Hei Law
- D. E. Shaw Research, New York, NY, 10036, USA
| | | | - Je-Luen Li
- D. E. Shaw Research, New York, NY, 10036, USA
| | - Kim Palmo
- D. E. Shaw Research, New York, NY, 10036, USA
| | | | | | | | - David E Shaw
- D. E. Shaw Research, New York, NY, 10036, USA; Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY, 10032, USA
|
43
|
Takaya D, Watanabe C, Nagase S, Kamisaka K, Okiyama Y, Moriwaki H, Yuki H, Sato T, Kurita N, Yagi Y, Takagi T, Kawashita N, Takaba K, Ozawa T, Takimoto-Kamimura M, Tanaka S, Fukuzawa K, Honma T. FMODB: The World's First Database of Quantum Mechanical Calculations for Biomacromolecules Based on the Fragment Molecular Orbital Method. J Chem Inf Model 2021; 61:777-794. [PMID: 33511845 DOI: 10.1021/acs.jcim.0c01062] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Indexed: 12/20/2022]
Abstract
We developed the world's first web-based public database for the storage, management, and sharing of fragment molecular orbital (FMO) calculation data sets describing the complex interactions between biomacromolecules, named FMO Database (https://drugdesign.riken.jp/FMODB/). Each entry in the database contains relevant background information on how the data was compiled as well as the total energy of each molecular system and interfragment interaction energy (IFIE) and pair interaction energy decomposition analysis (PIEDA) values. Currently, the database contains more than 13,600 FMO calculation data sets, and a comprehensive search function is implemented at the front end. The procedure for selecting target proteins, preprocessing the experimental structures, and constructing the database, together with details of the database front end, is described. We then demonstrate a use of the FMODB by comparing IFIE value distributions of hydrogen bond, ion-pair, and XH/π interactions obtained by the FMO method with those obtained by a molecular mechanics approach. From the comparison, the statistical analysis of the data provided standard reference values for the three types of interactions that will be useful for determining whether each interaction in a given system is relatively strong or weak compared to the interactions contained within the data in the FMODB. In the final part, we demonstrate the use of the database to examine the contribution of halogen atoms to the binding affinity between human cathepsin L and its inhibitors. We found that the electrostatic term derived by PIEDA correlated strongly with the binding affinities of the halogen-containing cathepsin L inhibitors, indicating the importance of QM calculation for quantitative analysis of halogen interactions. Thus, the FMO calculation data in FMODB will be useful for conducting statistical analyses for drug discovery, for conducting molecular recognition studies in structural biology, and for other studies involving quantum mechanics-based interactions.
Affiliation(s)
- Daisuke Takaya
- RIKEN Center for Biosystems Dynamics Research, 1-7-22 Suehiro-cho Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Chiduru Watanabe
- RIKEN Center for Biosystems Dynamics Research, 1-7-22 Suehiro-cho Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; JST PRESTO, 4-1-8, Honcho, Kawaguchi, Saitama 332-0012, Japan
| | - Shunpei Nagase
- RIKEN Center for Biosystems Dynamics Research, 1-7-22 Suehiro-cho Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Kikuko Kamisaka
- RIKEN Center for Biosystems Dynamics Research, 1-7-22 Suehiro-cho Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Yoshio Okiyama
- RIKEN Center for Biosystems Dynamics Research, 1-7-22 Suehiro-cho Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan; Division of Medicinal Safety Science, National Institute of Health Sciences, 3-25-26 Tonomachi, Kawasaki-ku, Kawasaki, Kanagawa 210-9501, Japan
| | - Hirotomo Moriwaki
- RIKEN Center for Biosystems Dynamics Research, 1-7-22 Suehiro-cho Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Hitomi Yuki
- RIKEN Center for Biosystems Dynamics Research, 1-7-22 Suehiro-cho Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Tomohiro Sato
- RIKEN Center for Biosystems Dynamics Research, 1-7-22 Suehiro-cho Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
| | - Noriyuki Kurita
- Department of Computer Science and Engineering, Toyohashi University of Technology, 1-1 Hibarigaoka Tempaku-cho, Toyohashi, Aichi 441-8580, Japan
| | - Yoichiro Yagi
- Graduate School of Engineering, Okayama University of Science, Okayama, 1-1 Ridai-cho, Okayama 700-0005, Japan
| | - Tatsuya Takagi
- Graduate School of Pharmaceutical Sciences, Osaka University, 1-6 Yamadaoka, Suita, Osaka 565-0871, Japan
| | - Norihito Kawashita
- Faculty of Science and Engineering, Kindai University, 3-4-1 Kowakae, Higashiosaka, Osaka 577-8502, Japan
| | - Kenichiro Takaba
- Pharmaceutical Research Center, Laboratory for Medicinal Chemistry, Asahi Kasei Pharma Corporation, 632-1 Mifuku, Izunokuni, Shizuoka 410-2321, Japan
| | - Tomonaga Ozawa
- Kissei Pharmaceutical Co., LTD., Frontier Technology Research Lab., Research Div. 4365-1 Hotaka Kashiwabara, Azumino, Nagano 399-8304, Japan
| | - Midori Takimoto-Kamimura
- Teijin Institute for Biomedical Research, Teijin Pharma Ltd., 4-3-2 Asahigaoka, Hino, Tokyo 191-8512, Japan
| | - Shigenori Tanaka
- Graduate School of System Informatics, Department of Computational Science, Kobe University, 1-1 Rokkodai, Kobe, Hyogo 657-8501, Japan
| | - Kaori Fukuzawa
- School of Pharmacy and Pharmaceutical Sciences, Hoshi University, 2-4-41 Ebara, Shinagawa, Tokyo 142-8501, Japan; Department of Biomolecular Engineering, Graduate School of Engineering, Tohoku University, 6-6-11 Aoba, Aramaki, Sendai, Miyagi 980-8579, Japan
| | - Teruki Honma
- RIKEN Center for Biosystems Dynamics Research, 1-7-22 Suehiro-cho Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
|
44
|
Vennelakanti V, Qi HW, Mehmood R, Kulik HJ. When are two hydrogen bonds better than one? Accurate first-principles models explain the balance of hydrogen bond donors and acceptors found in proteins. Chem Sci 2021; 12:1147-1162. [PMID: 35382134 PMCID: PMC8908278 DOI: 10.1039/d0sc05084a] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 09/14/2020] [Accepted: 11/18/2020] [Indexed: 01/02/2023] Open
Abstract
Hydrogen bonds (HBs) play an essential role in the structure and catalytic action of enzymes, but a complete understanding of HBs in proteins challenges the resolution of modern structural (i.e., X-ray diffraction) techniques and mandates computationally demanding electronic structure methods from correlated wavefunction theory for predictive accuracy. Numerous amino acid sidechains contain functional groups (e.g., hydroxyls in Ser/Thr or Tyr and amides in Asn/Gln) that can act as either HB acceptors or donors (HBA/HBD) and even form simultaneous, ambifunctional HB interactions. To understand the relative energetic benefit of each interaction, we characterize the potential energy surfaces of representative model systems with accurate coupled cluster theory calculations. To reveal the relationship of these energetics to the balance of these interactions in proteins, we curate a set of 4000 HBs, of which >500 are ambifunctional HBs, in high-resolution protein structures. We show that our model systems accurately predict the favored HB structural properties. Differences are apparent in HBA/HBD preference for aromatic Tyr versus aliphatic Ser/Thr hydroxyls because Tyr forms significantly stronger O–H⋯O HBs than N–H⋯O HBs in contrast to comparable strengths of the two for Ser/Thr. Despite this residue-specific distinction, all models of residue pairs indicate an energetic benefit for simultaneous HBA and HBD interactions in an ambifunctional HB. Although the stabilization is less than the additive maximum due both to geometric constraints and many-body electronic effects, a wide range of ambifunctional HB geometries are more favorable than any single HB interaction. Correlated wavefunction theory predicts and high-resolution crystal structure analysis confirms the important, stabilizing effect of simultaneous hydrogen bond donor and acceptor interactions in proteins.
Affiliation(s)
- Vyshnavi Vennelakanti
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, USA
- Department of Chemistry
| | - Helena W. Qi
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, USA
- Department of Chemistry
| | - Rimsha Mehmood
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, USA
- Department of Chemistry
| | - Heather J. Kulik
- Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, USA
|
45
|
Metcalf DP, Jiang A, Spronk SA, Cheney DL, Sherrill CD. Electron-Passing Neural Networks for Atomic Charge Prediction in Systems with Arbitrary Molecular Charge. J Chem Inf Model 2020; 61:115-122. [PMID: 33326247 DOI: 10.1021/acs.jcim.0c01071] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Indexed: 01/22/2023]
Abstract
Atomic charges are critical quantities in molecular mechanics and molecular dynamics, but obtaining these quantities requires heuristic choices based on atom typing or relatively expensive quantum mechanical computations to generate a density to be partitioned. Most machine learning efforts in this domain ignore total molecular charges, relying on overfitting and arbitrary rescaling in order to match the total system charge. Here, we introduce the electron-passing neural network (EPNN), a fast, accurate neural network atomic charge partitioning model that conserves total molecular charge by construction. EPNNs predict atomic charges very similar to those obtained by partitioning quantum mechanical densities but at such a small fraction of the cost that they can be easily computed for large biomolecules. Charges from this method may be used directly for molecular mechanics, as features for cheminformatics, or as input to any neural network potential.
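The idea of conserving total charge "by construction" can be illustrated without any neural network: if charges are only ever updated by moving electron density from one atom to another, every transfer adds to one atom exactly what it removes from the other, so the sum cannot change. The sketch below shows only this bookkeeping principle; the function name and the transfer amounts are hypothetical, and in the actual model the transferred amounts would be predicted by the network rather than supplied by hand.

```python
def pass_electrons(initial_charges, transfer):
    """Update atomic charges by pairwise electron transfers.

    transfer[i][j] is the amount of electron density (in units of e)
    that atom i sends to atom j. Losing electrons makes an atom more
    positive; gaining them makes it more negative.
    """
    n = len(initial_charges)
    charges = list(initial_charges)
    for i in range(n):
        for j in range(n):
            if i != j:
                charges[i] += transfer[i][j]
                charges[j] -= transfer[i][j]
    return charges


# Toy three-atom example (O, H, H) with a neutral starting guess.
# The transfer values are made up for illustration.
start = [0.0, 0.0, 0.0]
moves = [
    [0.0, 0.0, 0.0],   # O sends nothing
    [0.4, 0.0, 0.0],   # first H sends 0.4 e to O
    [0.4, 0.0, 0.0],   # second H sends 0.4 e to O
]
final = pass_electrons(start, moves)
total_before = sum(start)
total_after = sum(final)
```

With this update rule the total is preserved for any transfer matrix, so no rescaling step is needed afterwards to match the molecular charge.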
Affiliation(s)
- Derek P Metcalf
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, United States
| | - Andy Jiang
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, United States
| | - Steven A Spronk
- Molecular Structure and Design, Bristol Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, United States
| | - Daniel L Cheney
- Molecular Structure and Design, Bristol Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, United States
| | - C David Sherrill
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, United States
|
46
|
Grisafi A, Nigam J, Ceriotti M. Multi-scale approach for the prediction of atomic scale properties. Chem Sci 2020; 12:2078-2090. [PMID: 34163971 PMCID: PMC8179303 DOI: 10.1039/d0sc04934d] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Indexed: 12/23/2022] Open
Abstract
Electronic nearsightedness is one of the fundamental principles that governs the behavior of condensed matter and supports its description in terms of local entities such as chemical bonds. Locality also underlies the tremendous success of machine-learning schemes that predict quantum mechanical observables - such as the cohesive energy, the electron density, or a variety of response properties - as a sum of atom-centred contributions, based on a short-range representation of atomic environments. One of the main shortcomings of these approaches is their inability to capture physical effects ranging from electrostatic interactions to quantum delocalization, which have a long-range nature. Here we show how to build a multi-scale scheme that combines in the same framework local and non-local information, overcoming such limitations. We show that the simplest version of such features can be put in formal correspondence with a multipole expansion of permanent electrostatics. The data-driven nature of the model construction, however, makes this simple form suitable to tackle also different types of delocalized and collective effects. We present several examples that range from molecular physics to surface science and biophysics, demonstrating the ability of this multi-scale approach to model interactions driven by electrostatics, polarization and dispersion, as well as the cooperative behavior of dielectric response functions.
Affiliation(s)
- Andrea Grisafi
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Jigyasa Nigam
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland; National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland; Indian Institute of Space Science and Technology, Thiruvananthapuram 695547, India
| | - Michele Ceriotti
- Laboratory of Computational Science and Modeling, IMX, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland; National Centre for Computational Design and Discovery of Novel Materials (MARVEL), École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
|
47
|
Mao Y, Loipersberger M, Kron KJ, Derrick JS, Chang CJ, Sharada SM, Head-Gordon M. Consistent inclusion of continuum solvation in energy decomposition analysis: theory and application to molecular CO2 reduction catalysts. Chem Sci 2020; 12:1398-1414. [PMID: 34163903 PMCID: PMC8179122 DOI: 10.1039/d0sc05327a] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Indexed: 12/13/2022] Open
Abstract
To facilitate computational investigation of intermolecular interactions in the solution phase, we report the development of ALMO-EDA(solv), a scheme that allows the application of continuum solvent models within the framework of energy decomposition analysis (EDA) based on absolutely localized molecular orbitals (ALMOs). In this scheme, all the quantum mechanical states involved in the variational EDA procedure are computed in the presence of the solvent environment so that solvation effects are incorporated in the evaluation of all its energy components. After validation on several model complexes, we employ ALMO-EDA(solv) to investigate substituent effects on two classes of complexes that are related to molecular CO2 reduction catalysis. For [FeTPP(CO2-κC)]²⁻ (TPP = tetraphenylporphyrin), we reveal that the two ortho substituents which yield the most favorable CO2 binding, -N(CH3)3⁺ (TMA) and -OH, stabilize the complex via through-structure and through-space mechanisms, respectively. The coulombic interaction between the positively charged TMA group and activated CO2 is found to be largely attenuated by the polar solvent. Furthermore, we also provide computational support for the design strategy of utilizing bulky, flexible ligands to stabilize activated CO2 via long-range Coulomb interactions, which creates biomimetic solvent-inaccessible "pockets" in which electrostatics is unscreened. For the reactant and product complexes associated with the electron transfer from the p-terphenyl radical anion to CO2, we demonstrate that the double terminal substitution of p-terphenyl by electron-withdrawing groups considerably strengthens the binding in the product state while moderately weakening that in the reactant state, both of which are dominated by the substituent tuning of the electrostatics component. These applications illustrate that this new extension of ALMO-EDA provides a valuable means to unravel the nature of intermolecular interactions and quantify their impacts on chemical reactivity in solution.
Affiliation(s)
- Yuezhi Mao
- Department of Chemistry, University of California at Berkeley, Berkeley, CA 94720, USA
| | | | - Kareesa J Kron
- Mork Family Department of Chemical Engineering and Material Science, University of Southern California, Los Angeles, CA 90089, USA
| | - Jeffrey S Derrick
- Department of Chemistry, University of California at Berkeley, Berkeley, CA 94720, USA
| | - Christopher J Chang
- Department of Chemistry, University of California at Berkeley, Berkeley, CA 94720, USA; Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA; Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, CA 94720, USA
| | - Shaama Mallikarjun Sharada
- Mork Family Department of Chemical Engineering and Material Science, University of Southern California, Los Angeles, CA 90089, USA; Department of Chemistry, University of Southern California, Los Angeles, CA 90089, USA
| | - Martin Head-Gordon
- Department of Chemistry, University of California at Berkeley, Berkeley, CA 94720, USA; Chemical Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
|
48
|
Chang YM, Wang YS, Chao SD. A minimum quantum chemistry CCSD(T)/CBS dataset of dimeric interaction energies for small organic functional groups. J Chem Phys 2020; 153:154301. [PMID: 33092384 DOI: 10.1063/5.0019392] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/14/2022] Open
Abstract
We have performed a quantum chemistry study on the bonding patterns and interaction energies for 31 dimers of small organic functional groups (dubbed the SOFG-31 dataset), including the alkane-alkene-alkyne (6 + 4 + 4 = 14, AAA) groups, alcohol-aldehyde-ketone (4 + 4 + 3 = 11, AAK) groups, and carboxylic acid-amide (3 + 3 = 6, CAA) groups. The basis set superposition error corrected super-molecule approach using the second order Møller-Plesset perturbation theory (MP2) with the Dunning's aug-cc-pVXZ (X = D, T, Q) basis sets has been employed in the geometry optimization and energy calculations. To calibrate the MP2 calculated interaction energies for these dimeric complexes, we perform single-point calculations with the coupled cluster with single, double, and perturbative triple excitations method at the complete basis set limit [CCSD(T)/CBS] using the well-tested extrapolation methods. In order to gain more physical insights, we also perform a parallel series of energy decomposition calculations based on the symmetry adapted perturbation theory (SAPT). The collection of these CCSD(T)/CBS interaction energy values can serve as a minimum quantum chemistry dataset for testing or training less accurate but more efficient calculation methods. As an application, we further propose a segmental SAPT model based on chemically recognizable segments in a specific functional group. These model interactions can be used to construct coarse-grained force fields for larger molecular systems.
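The two numerical steps named in this abstract, a counterpoise-style supermolecular interaction energy and a complete-basis-set extrapolation, can be sketched in a few lines. The abstract does not say which extrapolation formula was used, so the two-point inverse-cubic form below is an assumption (a commonly used choice for correlation energies), and the function names are illustrative only.

```python
def counterpoise_interaction(e_dimer, e_mono_a_in_dimer_basis, e_mono_b_in_dimer_basis):
    """Supermolecular interaction energy with basis set superposition
    error removed by evaluating both monomers in the full dimer basis."""
    return e_dimer - e_mono_a_in_dimer_basis - e_mono_b_in_dimer_basis


def extrapolate_two_point(e_small, e_large, x_small, x_large):
    """Two-point extrapolation assuming E(X) = E_CBS + A / X**3,
    where X is the basis set cardinal number (D=2, T=3, Q=4)."""
    xs3 = x_small**3
    xl3 = x_large**3
    return (xl3 * e_large - xs3 * e_small) / (xl3 - xs3)


# Synthetic check: build energies that follow the assumed X**-3 law
# exactly and confirm that the extrapolation recovers the limit.
assumed_limit = -1.0
assumed_slope = 0.5
e_triple = assumed_limit + assumed_slope / 3**3
e_quadruple = assumed_limit + assumed_slope / 4**3
recovered = extrapolate_two_point(e_triple, e_quadruple, 3, 4)
```

In practice the extrapolation is usually applied to the correlation energy only, with the Hartree-Fock part taken from the largest basis or extrapolated separately.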
Affiliation(s)
- Yu-Ming Chang
- Institute of Applied Mechanics, National Taiwan University, Taipei 10617, Taiwan
| | - Yi-Siang Wang
- Institute of Applied Mechanics, National Taiwan University, Taipei 10617, Taiwan
| | - Sheng D Chao
- Institute of Applied Mechanics, National Taiwan University, Taipei 10617, Taiwan
|
49
|
Smith DGA, Altarawy D, Burns LA, Welborn M, Naden LN, Ward L, Ellis S, Pritchard BP, Crawford TD. The MolSSI QCArchive project: An open-source platform to compute, organize, and share quantum chemistry data. Wiley Interdiscip Rev Comput Mol Sci 2020. [DOI: 10.1002/wcms.1491] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Indexed: 11/06/2022]
Affiliation(s)
| | - Doaa Altarawy
- Molecular Sciences Software Institute Blacksburg Virginia USA
- Department of Computer and Systems Engineering Alexandria University Alexandria Egypt
| | - Lori A. Burns
- Center for Computational Molecular Science and Technology School of Chemistry and Biochemistry, Georgia Institute of Technology Atlanta Georgia USA
| | - Matthew Welborn
- Molecular Sciences Software Institute Blacksburg Virginia USA
| | - Levi N. Naden
- Molecular Sciences Software Institute Blacksburg Virginia USA
| | - Logan Ward
- Data Science and Learning Division Argonne National Laboratory Lemont Illinois USA
| | - Sam Ellis
- Molecular Sciences Software Institute Blacksburg Virginia USA
| | | | - T. Daniel Crawford
- Molecular Sciences Software Institute Blacksburg Virginia USA
- Department of Chemistry Virginia Tech Blacksburg, Virginia USA
|
50
|
Glick ZL, Metcalf DP, Koutsoukas A, Spronk SA, Cheney DL, Sherrill CD. AP-Net: An atomic-pairwise neural network for smooth and transferable interaction potentials. J Chem Phys 2020; 153:044112. [DOI: 10.1063/5.0011521] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 12/20/2022] Open
Affiliation(s)
- Zachary L. Glick
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
| | - Derek P. Metcalf
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
| | - Alexios Koutsoukas
- Molecular Structure and Design, Bristol Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
| | - Steven A. Spronk
- Molecular Structure and Design, Bristol Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
| | - Daniel L. Cheney
- Molecular Structure and Design, Bristol Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
| | - C. David Sherrill
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, USA
|