1
Vennelakanti V, Kilic IB, Terrones GG, Duan C, Kulik HJ. Machine Learning Prediction of the Experimental Transition Temperature of Fe(II) Spin-Crossover Complexes. J Phys Chem A 2024; 128:204-216. [PMID: 38148525 DOI: 10.1021/acs.jpca.3c07104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023]
Abstract
Spin-crossover (SCO) complexes are materials that exhibit changes in the spin state in response to external stimuli, with potential applications in molecular electronics. It is challenging to know a priori how to design ligands to achieve the delicate balance of entropic and enthalpic contributions needed to tailor a transition temperature close to room temperature. We leverage the SCO complexes from the previously curated SCO-95 data set [Vennelakanti et al. J. Chem. Phys. 159, 024120 (2023)] to train three machine learning (ML) models for transition temperature (T1/2) prediction using graph-based revised autocorrelations as features. We perform feature selection using random forest-ranked recursive feature addition (RF-RFA) to identify the features essential to model transferability. Of the ML models considered, the full feature set RF and recursive feature addition RF models perform best, achieving moderate correlation to experimental T1/2 values. We then compare ML T1/2 predictions to those from three previously identified best-performing density functional approximations (DFAs) which accurately predict SCO behavior across SCO-95, finding that the ML models predict T1/2 more accurately than the best-performing DFAs. In addition, we study ML model predictions for a set of 18 SCO complexes for which only estimated T1/2 values are available. Upon excluding outliers from this set, the RF-RFA RF model shows a strong correlation to estimated T1/2 values with a Pearson's r of 0.82. In contrast, DFA-predicted T1/2 values have large errors and show no correlation to estimated T1/2 values over the same set of complexes. Overall, our study demonstrates slightly superior performance of ML models in comparison with some of the best-performing DFAs, and we expect ML models to improve further as larger data sets of SCO complexes are curated and become available for model training.
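The correlation metric quoted in this abstract (Pearson's r) can be computed as in the minimal sketch below. The function name and the numbers are hypothetical placeholders chosen for illustration; they are not values from the paper or from SCO-95.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Covariance and variances (unnormalized; the 1/n factors cancel in r).
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical transition temperatures in K (placeholders, not paper data).
reference_t_half = [150.0, 210.0, 260.0, 300.0, 340.0]
predicted_t_half = [170.0, 200.0, 250.0, 320.0, 330.0]
r = pearson_r(reference_t_half, predicted_t_half)
```

A value of r near 1 indicates that predictions rank and scale with the reference values; it says nothing about a constant offset, which is why such studies typically report an error metric alongside it.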
Affiliation(s)
- Vyshnavi Vennelakanti
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Irem B Kilic
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Gianmarco G Terrones
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Chenru Duan
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
- Heather J Kulik
  - Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
  - Department of Chemistry, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, United States
2
Ricard TC, Zhu X, Iyengar SS. Capturing Weak Interactions in Surface Adsorbate Systems at Coupled Cluster Accuracy: A Graph-Theoretic Molecular Fragmentation Approach Improved through Machine Learning. J Chem Theory Comput 2023. [PMID: 38019639 DOI: 10.1021/acs.jctc.3c00955] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/01/2023]
Abstract
The accurate and efficient study of the interactions of organic matter with the surface of water is critical to a wide range of applications. For example, environmental studies have found that acidic polyfluorinated alkyl substances, especially perfluorooctanoic acid (PFOA), have spread throughout the environment and bioaccumulate into human populations residing near contaminated watersheds, leading to many systemic maladies. Thus, the study of the interactions of PFOA with water surfaces became important for the mitigation of their activity as pollutants and threats to public health. However, theoretical study of the interactions of such organic adsorbates on the surface of water, and their bulk concerted properties, often necessitates the use of ab initio methods to properly incorporate the long-range electronic properties that govern these extended systems. Notable theoretical treatments of "on-water" reactions thus far have employed hybrid DFT and semilocal DFT, but the interactions involved are weak interactions that may be best described using post-Hartree-Fock theory. Here, we aim to demonstrate the utility of a graph-theoretic approach to molecular fragmentation that accurately captures the critical "weak" interactions while maintaining an efficient ab initio treatment of the long-range periodic interactions that underpin the physics of extended systems. We apply this graph-theoretical treatment to study PFOA on the surface of water as a model system for the study of weak interactions seen in the wide range of surface interactions and reactions. The approach divides a system into a set of vertices, that are then connected through edges, faces, and higher order graph theoretic objects known as simplexes, to represent a collection of locally interacting subsystems. These subsystems are then used to construct ab initio molecular dynamics simulations and for computing multidimensional potential energy surfaces. 
To further improve the computational efficiency of our graph theoretic fragmentation method, we use a recently developed transfer learning protocol to construct the full system potential energy from a family of neural networks, each designed to accurately model the behavior of individual simplexes. We use a unique multidimensional clustering algorithm, based on the k-means clustering methodology, to define our training space for each separate simplex. These models are used to extrapolate the energies for molecular dynamics trajectories at PFOA water interfaces at less than one-tenth the cost of a regular molecular fragmentation-based dynamics calculation, with excellent agreement with coupled-cluster-level full system potential energies.
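The abstract says only that the training space for each simplex is defined by a clustering algorithm based on k-means; the authors' multidimensional variant is not described here. As a rough illustration of the underlying idea, the sketch below runs plain Lloyd's k-means on a handful of sampled coordinates and then picks one representative sample per cluster. The function names, the toy points, and the representative-selection step are all assumptions made for this example.

```python
import math

def kmeans(points, centroids, n_iter=50):
    """Plain Lloyd's algorithm; `centroids` are caller-supplied initial centers."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(old)  # leave an empty cluster's center in place
        if new_centroids == centroids:
            break  # converged: assignments no longer move the centers
        centroids = new_centroids
    return centroids, labels

def representatives(points, centroids, labels):
    """Hypothetical selection rule: per cluster, the sample closest to its centroid."""
    reps = []
    for k, c in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == k]
        if members:
            reps.append(min(members, key=lambda p: math.dist(p, c)))
    return reps

# Toy 2-D "geometries" (placeholders, not data from the paper).
points = [(0.9, 1.0), (1.0, 1.1), (1.1, 0.9), (2.9, 3.0), (3.0, 3.1), (3.1, 2.9)]
centers, labels = kmeans(points, centroids=[(0.0, 0.0), (4.0, 4.0)])
training_set = representatives(points, centers, labels)
```

Selecting a few representatives per cluster is one common way to keep the number of expensive reference calculations small while still covering the sampled region; whether the paper does exactly this is not stated in the abstract.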
Affiliation(s)
- Timothy C Ricard
  - Department of Chemistry and Department of Physics, Indiana University, 800 E. Kirkwood Avenue, Bloomington, Indiana 47405, United States
- Xiao Zhu
  - Department of Chemistry and Department of Physics, Indiana University, 800 E. Kirkwood Avenue, Bloomington, Indiana 47405, United States
- Srinivasan S Iyengar
  - Department of Chemistry and Department of Physics, Indiana University, 800 E. Kirkwood Avenue, Bloomington, Indiana 47405, United States
3
Broderick DR, Herbert JM. Scalable generalized screening for high-order terms in the many-body expansion: Algorithm, open-source implementation, and demonstration. J Chem Phys 2023; 159:174801. [PMID: 37921253 DOI: 10.1063/5.0174293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Accepted: 10/16/2023] [Indexed: 11/04/2023] Open
Abstract
The many-body expansion lies at the heart of numerous fragment-based methods that are intended to sidestep the nonlinear scaling of ab initio quantum chemistry, making electronic structure calculations feasible in large systems. In principle, inclusion of higher-order n-body terms ought to improve the accuracy in a controllable way, but unfavorable combinatorics often defeats this in practice and applications with n ≥ 4 are rare. Here, we outline an algorithm to overcome this combinatorial bottleneck, based on a bottom-up approach to energy-based screening. This is implemented within a new open-source software application ("Fragme∩t"), which is integrated with a lightweight semi-empirical method that is used to cull subsystems, attenuating the combinatorial growth of higher-order terms in the graph that is used to manage the calculations. This facilitates applications of unprecedented size, and we report four-body calculations in (H2O)64 clusters that afford relative energies within 0.1 kcal/mol/monomer of the supersystem result using less than 10% of the unique subsystems. We also report n-body calculations in (H2O)20 clusters up to n = 8, at which point the expansion terminates naturally due to screening. These are the largest n-body calculations reported to date using ab initio electronic structure theory, and they confirm that high-order n-body terms are mostly artifacts of basis-set superposition error.
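To make the idea of a screened many-body expansion concrete, here is a toy sketch. It is not the Fragme∩t algorithm: the screening rule (skip an n-body term when any of its (n-1)-body sub-clusters fell below a threshold), the function names, and the pairwise toy energy are all assumptions for illustration. The `energy` callable stands in for whatever electronic structure method would be used.

```python
from itertools import combinations

def mbe_terms(monomers, energy, max_order, threshold=0.0):
    """Return {subset: n-body increment}, built bottom-up by inclusion-exclusion.

    A subset of size n > 2 is evaluated only if every (n-1)-body
    sub-cluster survived screening (|increment| >= threshold).
    """
    increments = {(i,): energy((i,)) for i in monomers}
    kept = {1: set(increments)}
    for n in range(2, max_order + 1):
        kept[n] = set()
        for subset in combinations(monomers, n):
            if n > 2 and any(sub not in kept[n - 1]
                             for sub in combinations(subset, n - 1)):
                continue  # screened out: a parent term was negligible
            # Increment = E(subset) minus all lower-order increments inside it;
            # terms that were skipped count as zero.
            lower = sum(increments.get(sub, 0.0)
                        for m in range(1, n)
                        for sub in combinations(subset, m))
            delta = energy(subset) - lower
            increments[subset] = delta
            if abs(delta) >= threshold:
                kept[n].add(subset)
    return increments

# Toy pairwise-additive "energy" on a 1-D chain (hypothetical placeholder).
positions = {0: 0.0, 1: 1.0, 2: 2.5, 3: 4.5}

def toy_energy(subset):
    return sum(-1.0 / abs(positions[a] - positions[b]) ** 6
               for a, b in combinations(subset, 2))

labels = sorted(positions)
full = mbe_terms(labels, toy_energy, max_order=3)
screened = mbe_terms(labels, toy_energy, max_order=3, threshold=1e-3)
```

Because the toy energy is strictly pairwise, the unscreened expansion is exact at second order and every three-body increment vanishes to rounding error. With the threshold applied, only triples built from non-negligible pairs are ever evaluated, which is the combinatorial saving the abstract refers to.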
Affiliation(s)
- Dustin R Broderick
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, USA
- John M Herbert
  - Department of Chemistry and Biochemistry, The Ohio State University, Columbus, Ohio 43210, USA
4
Tkachenko NV, Tkachenko AA, Nebgen B, Tretiak S, Boldyrev AI. Neural network atomistic potentials for global energy minima search in carbon clusters. Phys Chem Chem Phys 2023; 25:21173-21182. [PMID: 37490276 DOI: 10.1039/d3cp02317f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/26/2023]
Abstract
The global energy optimization problem is an acute and important problem in chemistry. It is crucial to know the geometry of the lowest energy isomer (global minimum, GM) of a given compound for the evaluation of its chemical and physical properties. This problem is especially relevant for atomic clusters. Due to the exponential growth of the number of local minima geometries with the increase of the number of atoms in the cluster, it is important to find a computationally efficient and reliable method to navigate the energy landscape and locate a true global minima structure. Newly developed neural network (NN) atomistic potentials offer a numerically efficient and relatively accurate approach for molecular structure optimization. An important question that needs to be answered is "Can NN potentials, trained on a given set, represent the potential energy surface (PES) of a neighboring domain?". In this work, we tested the applicability of ANI-1ccx and ANI-nr NN atomistic potentials for the global minima optimization of carbon clusters Cn (n = 3-10). We showed that with the introduction of the cluster connectivity restriction and consequent DFT or ab initio calculations, ANI-1ccx and ANI-nr can be considered as robust PES pre-samplers that can capture the GM structure even for large clusters such as C20.
Affiliation(s)
- Nikolay V Tkachenko
  - Department of Chemistry and Biochemistry, Utah State University, Logan, Utah 84322-0300, USA.
- Benjamin Nebgen
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Sergei Tretiak
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
- Alexander I Boldyrev
  - Department of Chemistry and Biochemistry, Utah State University, Logan, Utah 84322-0300, USA.
5
Akher FB, Shu Y, Varga Z, Bhaumik S, Truhlar DG. Parametrically Managed Activation Function for Fitting a Neural Network Potential with Physical Behavior Enforced by a Low-Dimensional Potential. J Phys Chem A 2023. [PMID: 37307218 DOI: 10.1021/acs.jpca.3c02627] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Machine-learned representations of potential energy surfaces generated in the output layer of a feedforward neural network are becoming increasingly popular. One difficulty with neural network output is that it is often unreliable in regions where training data is missing or sparse. Human-designed potentials often build in proper extrapolation behavior by choice of functional form. Because machine learning is very efficient, it is desirable to learn how to add human intelligence to machine-learned potentials in a convenient way. One example is the well-understood feature of interaction potentials that they vanish when subsystems are too far separated to interact. In this article, we present a way to add a new kind of activation function to a neural network to enforce low-dimensional constraints. In particular, the activation function depends parametrically on all of the input variables. We illustrate the use of this step by showing how it can force an interaction potential to go to zero at large subsystem separations without either inputting a specific functional form for the potential or adding data to the training set in the asymptotic region of geometries where the subsystems are separated. In the process of illustrating this, we present an improved set of potential energy surfaces for the 14 lowest 3A' states of O3. The method is more general than this example, and it may be used to add other low-dimensional knowledge or lower-level knowledge to machine-learned potentials. In addition to the O3 example, we present a greater-generality method called parametrically managed diabatization by deep neural network (PM-DDNN) that is an improvement on our previously presented permutationally restrained diabatization by deep neural network (PR-DDNN).
Affiliation(s)
- Farideh Badichi Akher
  - Department of Chemistry and Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455-0431, United States
- Yinan Shu
  - Department of Chemistry and Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455-0431, United States
- Zoltan Varga
  - Department of Chemistry and Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455-0431, United States
- Suman Bhaumik
  - Department of Chemistry and Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455-0431, United States
- Donald G Truhlar
  - Department of Chemistry and Supercomputing Institute, University of Minnesota, Minneapolis, Minnesota 55455-0431, United States
6
Chen WK, Wang SR, Liu XY, Fang WH, Cui G. Nonadiabatic Derivative Couplings Calculated Using Information of Potential Energy Surfaces without Wavefunctions: Ab Initio and Machine Learning Implementations. Molecules 2023; 28:4222. [PMID: 37241962 DOI: 10.3390/molecules28104222] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/20/2023] [Revised: 05/16/2023] [Accepted: 05/18/2023] [Indexed: 05/28/2023] Open
Abstract
In this work, we implemented an approximate algorithm for calculating nonadiabatic coupling matrix elements (NACMEs) of a polyatomic system with ab initio methods and machine learning (ML) models. Utilizing this algorithm, one can calculate NACMEs using only the information of potential energy surfaces (PESs), i.e., energies, and gradients as well as Hessian matrix elements. We used a realistic system, namely CH2NH, to compare NACMEs calculated by this approximate PES-based algorithm and the accurate wavefunction-based algorithm. Our results show that this approximate PES-based algorithm can give very accurate results comparable to the wavefunction-based algorithm except at energetically degenerate points, i.e., conical intersections. We also tested a machine learning (ML)-trained model with this approximate PES-based algorithm, which also supplied similarly accurate NACMEs but more efficiently. The advantage of this PES-based algorithm is its significant potential to combine with electronic structure methods that do not implement wavefunction-based algorithms, low-scaling energy-based fragment methods, etc., and in particular efficient ML models, to compute NACMEs. The present work could encourage further research on nonadiabatic processes of large systems simulated by ab initio nonadiabatic dynamics simulation methods in which NACMEs are always required.
Affiliation(s)
- Wen-Kai Chen
  - Hebei Key Laboratory of Inorganic Nano-Materials, College of Chemistry and Materials Science, Hebei Normal University, Shijiazhuang 050024, China
  - Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China
- Sheng-Rui Wang
  - Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China
- Xiang-Yang Liu
  - College of Chemistry and Material Science, Sichuan Normal University, Chengdu 610068, China
- Wei-Hai Fang
  - Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China
  - Hefei National Laboratory, Hefei 230088, China
- Ganglong Cui
  - Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China
  - Hefei National Laboratory, Hefei 230088, China
7
Yang Q, Jiang GD, He SG. Enhancing the Performance of Global Optimization of Platinum Cluster Structures by Transfer Learning in a Deep Neural Network. J Chem Theory Comput 2023; 19:1922-1930. [PMID: 36917066 DOI: 10.1021/acs.jctc.2c00923] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/16/2023]
Abstract
The global optimization of metal cluster structures is an important research field. The traditional deep neural network (T-DNN) global optimization method is a good way to find out the global minimum (GM) of metal cluster structures, but a large number of samples are required. We developed a new global optimization method which is the combination of the DNN and transfer learning (DNN-TL). The DNN-TL method transfers the DNN parameters of the small-sized cluster to the DNN of the large-sized cluster to greatly reduce the number of samples. For the global optimization of Pt9 and Pt13 clusters in this research, the T-DNN method requires about 3-10 times more samples than the DNN-TL method, and the DNN-TL method saves about 70-80% of time. We also found that the average amplitude of parameter changes in the T-DNN training is about 2 times larger than that in the DNN-TL training, which rationalizes the effectiveness of transfer learning. The average fitting errors of the DNN trained by the DNN-TL method can be even smaller than those by the T-DNN method because of the reliability of transfer learning. Finally, we successfully obtained the GM structures of Ptn (n = 8-14) clusters by the DNN-TL method.
Affiliation(s)
- Qi Yang
  - State Key Laboratory for Structural Chemistry of Unstable and Stable Species, Institute of Chemistry, Chinese Academy of Sciences, Beijing 100190, PR China
  - University of Chinese Academy of Sciences, Beijing 100049, PR China
  - Beijing National Laboratory for Molecular Sciences and CAS Research/Education Center of Excellence in Molecular Sciences, Beijing 100190, PR China
- Gui-Duo Jiang
  - State Key Laboratory for Structural Chemistry of Unstable and Stable Species, Institute of Chemistry, Chinese Academy of Sciences, Beijing 100190, PR China
  - University of Chinese Academy of Sciences, Beijing 100049, PR China
  - Beijing National Laboratory for Molecular Sciences and CAS Research/Education Center of Excellence in Molecular Sciences, Beijing 100190, PR China
- Sheng-Gui He
  - State Key Laboratory for Structural Chemistry of Unstable and Stable Species, Institute of Chemistry, Chinese Academy of Sciences, Beijing 100190, PR China
  - University of Chinese Academy of Sciences, Beijing 100049, PR China
  - Beijing National Laboratory for Molecular Sciences and CAS Research/Education Center of Excellence in Molecular Sciences, Beijing 100190, PR China
8
Kumar A, DeGregorio N, Ricard T, Iyengar SS. Graph-Theoretic Molecular Fragmentation for Potential Surfaces Leads Naturally to a Tensor Network Form and Allows Accurate and Efficient Quantum Nuclear Dynamics. J Chem Theory Comput 2022; 18:7243-7259. [PMID: 36332133 DOI: 10.1021/acs.jctc.2c00484] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/06/2022]
Abstract
Molecular fragmentation methods have revolutionized quantum chemistry. Here, we use a graph-theoretically generated molecular fragmentation method, to obtain accurate and efficient representations for multidimensional potential energy surfaces and the quantum time-evolution operator, which plays a critical role in quantum chemical dynamics. In doing so, we find that the graph-theoretic fragmentation approach naturally reduces the potential portion of the time-evolution operator into a tensor network that contains a stream of coupled lower-dimensional propagation steps to potentially achieve quantum dynamics with reduced complexity. Furthermore, the fragmentation approach used here has previously been shown to allow accurate and efficient computation of post-Hartree-Fock electronic potential energy surfaces, which in many cases has been shown to be at density functional theory cost. Thus, by combining the advantages of molecular fragmentation with the tensor network formalism, the approach yields an on-the-fly quantum dynamics scheme where both the electronic potential calculation and nuclear propagation portion are enormously simplified through a single stroke. The method is demonstrated by computing approximations to the propagator and to potential surfaces for a set of coupled nuclear dimensions within a protonated water wire problem exhibiting the Grotthuss mechanism of proton transport. In all cases, our approach has been shown to reduce the complexity of representing the quantum propagator, and by extension action of the propagator on an initial wavepacket, by several orders, with minimal loss in accuracy.
Affiliation(s)
- Anup Kumar
  - Department of Chemistry, and the Indiana University Quantum Science and Engineering Center (IU-QSEC), Indiana University, Bloomington, Indiana 47405, United States
- Nicole DeGregorio
  - Department of Chemistry, and the Indiana University Quantum Science and Engineering Center (IU-QSEC), Indiana University, Bloomington, Indiana 47405, United States
- Timothy Ricard
  - Department of Chemistry, and the Indiana University Quantum Science and Engineering Center (IU-QSEC), Indiana University, Bloomington, Indiana 47405, United States
- Srinivasan S Iyengar
  - Department of Chemistry, and the Indiana University Quantum Science and Engineering Center (IU-QSEC), Indiana University, Bloomington, Indiana 47405, United States
9
Chen LL, Xu YC, Yang Y, Li N, Zou HX, Wen HH, Yan X. Prediction of peptide-induced silica formation under a wide pH range by molecular descriptors. Colloids Surf A Physicochem Eng Asp 2022. [DOI: 10.1016/j.colsurfa.2022.130030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
10
Liao K, Dong S, Cheng Z, Li W, Li S. Combined fragment-based machine learning force field with classical force field and its application in the NMR calculations of macromolecules in solutions. Phys Chem Chem Phys 2022; 24:18559-18567. [PMID: 35916054 DOI: 10.1039/d2cp02192g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
We have developed a combined fragment-based machine learning (ML) force field and molecular mechanics (MM) force field for simulating the structures of macromolecules in solutions, and then compute its NMR chemical shifts with the generalized energy-based fragmentation (GEBF) approach at the level of density functional theory (DFT). In this work, we first construct Gaussian approximation potential based on GEBF subsystems of macromolecules for MD simulations and then a GEBF-based neural network (GEBF-NN) with deep potential model for the studied macromolecule. Then, we develop a GEBF-NN/MM force field for macromolecules in solutions by combining the GEBF-NN force field for the solute molecule and ff14SB force field for solvent molecules. Using the GEBF-NN/MM MD simulation to generate snapshot structures of solute/solvent clusters, we then perform the NMR calculations with the GEBF approach at the DFT level to calculate NMR chemical shifts of the solute molecule. Taking a heptamer of oligopyridine-dicarboxamides in chloroform solution as an example, our results show that the GEBF-NN force field is quite accurate for this heptamer by comparing with the reference DFT results. For this heptamer in chloroform solution, both the GEBF-NN/MM and classical MD simulations could lead to helical structures from the same initial extended structure. The GEBF-DFT NMR results indicate that the GEBF-NN/MM force field could lead to more accurate NMR chemical shifts on hydrogen atoms by comparing with the experimental NMR results. Therefore, the GEBF-NN/MM force field could be employed for predicting more accurate dynamical behaviors than the classical force field for complex systems in solutions.
Affiliation(s)
- Kang Liao
  - School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China.
- Shiyu Dong
  - School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China.
- Zheng Cheng
  - School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China.
- Wei Li
  - School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China.
- Shuhua Li
  - School of Chemistry and Chemical Engineering, Key Laboratory of Mesoscopic Chemistry of Ministry of Education, Institute of Theoretical and Computational Chemistry, Nanjing University, Nanjing, 210023, P. R. China.
11
Mato J, Tzeli D, Xantheas SS. The Many-Body Expansion for Metals I: The Alkaline Earth metals Be, Mg, and Ca. J Chem Phys 2022; 157:084313. [DOI: 10.1063/5.0094598] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Abstract
We examine the Many-Body Expansion (MBE) for alkaline earth metal clusters, Be_n, Mg_n, Ca_n (n = 4, 5, 6) at the MP2, CCSD(T), MRPT2, and MRCI levels of theory. The magnitude of each term in the MBE is evaluated for several geometrical configurations. We find that the behavior of the MBE for these clusters depends strongly on the geometrical arrangement, and, to a lesser extent, on the level of theory used. Another factor that affects the MBE is the in situ (ground or excited) electronic state of the individual atoms in the cluster. For most geometries, the three-body term is the largest, followed by a steady decrease in absolute energy for subsequent terms. Though these systems exhibit non-negligible multi-reference effects, there was little qualitative difference in the MBE when employing single vs. multi-reference methods. Useful insights into the connectivity and stability of these clusters have been drawn from the respective potential energy surfaces and Quasi-Atomic orbitals for the various dimers, trimers, and tetramers. Through these analyses we investigate the similarities and differences in the binding energies of different size clusters for these metals.
Affiliation(s)
- Joani Mato
  - Chemical Physics, Pacific Northwest National Laboratory, United States of America
- Demeter Tzeli
  - Department of Chemistry, National and Kapodistrian University of Athens, Greece
12
Xiouras C, Cameli F, Quilló GL, Kavousanakis ME, Vlachos DG, Stefanidis GD. Applications of Artificial Intelligence and Machine Learning Algorithms to Crystallization. Chem Rev 2022; 122:13006-13042. [PMID: 35759465 DOI: 10.1021/acs.chemrev.2c00141] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/28/2022]
Abstract
Artificial intelligence and specifically machine learning applications are nowadays used in a variety of scientific applications and cutting-edge technologies, where they have a transformative impact. Such an assembly of statistical and linear algebra methods making use of large data sets is becoming more and more integrated into chemistry and crystallization research workflows. This review aims to present, for the first time, a holistic overview of machine learning and cheminformatics applications as a novel, powerful means to accelerate the discovery of new crystal structures, predict key properties of organic crystalline materials, simulate, understand, and control the dynamics of complex crystallization process systems, as well as contribute to high throughput automation of chemical process development involving crystalline materials. We critically review the advances in these new, rapidly emerging research areas, raising awareness in issues such as the bridging of machine learning models with first-principles mechanistic models, data set size, structure, and quality, as well as the selection of appropriate descriptors. At the same time, we propose future research at the interface of applied mathematics, chemistry, and crystallography. Overall, this review aims to increase the adoption of such methods and tools by chemists and scientists across industry and academia.
Affiliation(s)
- Christos Xiouras
- Chemical Process R&D, Crystallization Technology Unit, Janssen R&D, Turnhoutseweg 30, 2340 Beerse, Belgium
| | - Fabio Cameli
- Department of Chemical and Biomolecular Engineering, University of Delaware, 150 Academy Street, Newark, Delaware 19716, United States
| | - Gustavo Lunardon Quilló
- Chemical Process R&D, Crystallization Technology Unit, Janssen R&D, Turnhoutseweg 30, 2340 Beerse, Belgium.,Chemical and BioProcess Technology and Control, Department of Chemical Engineering, Faculty of Engineering Technology, KU Leuven, Gebroeders de Smetstraat 1, 9000 Ghent, Belgium
| | - Mihail E Kavousanakis
- School of Chemical Engineering, National Technical University of Athens, Heroon Polytechniou 9, 15780 Zografou, Greece
| | - Dionisios G Vlachos
- Department of Chemical and Biomolecular Engineering, University of Delaware, 150 Academy Street, Newark, Delaware 19716, United States
| | - Georgios D Stefanidis
- School of Chemical Engineering, National Technical University of Athens, Heroon Polytechniou 9, 15780 Zografou, Greece.,Laboratory for Chemical Technology, Ghent University; Tech Lane Ghent Science Park 125, B-9052 Ghent, Belgium
13
Kwon T, Song HW, Woo SY, Kim J, Sung BJ. The accurate estimation of the third virial coefficients for helium using three-body neural network potentials. Bull Korean Chem Soc 2022. [DOI: 10.1002/bkcs.12497] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]
Affiliation(s)
- Taejin Kwon
- Department of Chemistry and Research Institute for Basic Science Sogang University Seoul South Korea
- Han Wook Song
- Center for Mechanical Metrology Korea Research Institute of Standards and Science (KRISS) Daejeon South Korea
- Sam Yong Woo
- Center for Mechanical Metrology Korea Research Institute of Standards and Science (KRISS) Daejeon South Korea
- Jong-Ho Kim
- Center for Mechanical Metrology Korea Research Institute of Standards and Science (KRISS) Daejeon South Korea
- Bong June Sung
- Department of Chemistry and Research Institute for Basic Science Sogang University Seoul South Korea
14
On the Investigation of Effective Factors on Electronic Structure Properties of Transition Metal Complexes: Robust Modeling Using GPR Approach. Int J Chem Eng 2022. [DOI: 10.1155/2022/8264297] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Materials discovery is usually done using high-throughput computational screening. Costly and complex direct density functional theory (DFT) simulations have commonly been used to determine subtle trends in spin-state ordering and bonding in inorganic materials and, more generally, to predict the electronic structure properties of transition metal complexes. A Gaussian process regression (GPR) framework consisting of four kernel functions is introduced for spin-state splitting estimation through inorganic chemistry-appropriate empirical inputs. To this end, the present study reviewed an extensive range of data values from earlier works. According to statistical analysis, the GPR model showed very good performance. The coefficients of determination were calculated to be 0.986 for the exponential and Matérn kernel functions, suggesting that these kernels have the highest predictive power. Moreover, the sensitivity of the output to the inputs was measured. Artificial intelligence (AI) helped accurately predict the target values across various input ranges.
15
Jindal S, Hsu PJ, Phan HT, Tsou PK, Kuo JL. Capturing the potential energy landscape of large size molecular clusters from atomic interactions up to a 4-body system using deep learning. Phys Chem Chem Phys 2022; 24:27263-27276. [DOI: 10.1039/d2cp04441b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/07/2022]
Abstract
We propose a new method that utilizes the database of stable conformers and borrows the fragmentation concept of many-body-expansion (MBE) methods used in ab initio calculations to train a deep machine learning (ML) model using SchNet.
Affiliation(s)
- Shweta Jindal
- Institute of Atomic and Molecular Sciences, Academia Sinica, No. 1 Roosevelt Road, Section 4, Daan District, Taipei City 10617, Taiwan
- Po-Jen Hsu
- Institute of Atomic and Molecular Sciences, Academia Sinica, No. 1 Roosevelt Road, Section 4, Daan District, Taipei City 10617, Taiwan
- Huu Trong Phan
- Institute of Atomic and Molecular Sciences, Academia Sinica, No. 1 Roosevelt Road, Section 4, Daan District, Taipei City 10617, Taiwan
- Molecular Science and Technology Program, Taiwan International Graduate Program, Academia Sinica, Taipei, 11529, Taiwan
- Department of Chemistry, National Tsing Hua University, Hsinchu 30013, Taiwan
- Pei-Kang Tsou
- Institute of Atomic and Molecular Sciences, Academia Sinica, No. 1 Roosevelt Road, Section 4, Daan District, Taipei City 10617, Taiwan
- Jer-Lai Kuo
- Institute of Atomic and Molecular Sciences, Academia Sinica, No. 1 Roosevelt Road, Section 4, Daan District, Taipei City 10617, Taiwan
- Department of Chemistry, National Tsing Hua University, Hsinchu 30013, Taiwan
- Molecular Science and Technology, National Taiwan University, Section 4, Daan District, Taipei City 10617, Taiwan
16
Yan W, Zhu YF, Xie WY, Song HW, Zhang CY, Yang MH. A new many-body expansion scheme for atomic clusters: Application to nitrogen clusters. Chin J Chem Phys 2021. [DOI: 10.1063/1674-0068/cjcp2109173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022]
Affiliation(s)
- Wei Yan
- State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430071, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Yong-fa Zhu
- State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430071, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- Wei-yu Xie
- Institute of Chemical Materials, China Academy of Engineering Physics, Mianyang 621900, China
- Hong-wei Song
- State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430071, China
- Chao-yang Zhang
- Institute of Chemical Materials, China Academy of Engineering Physics, Mianyang 621900, China
- Ming-hui Yang
- State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430071, China
- Wuhan National Laboratory for Optoelectronics, Huazhong University of Science and Technology, Wuhan 430071, China
17
Kumar A, DeGregorio N, Iyengar SS. Graph-Theory-Based Molecular Fragmentation for Efficient and Accurate Potential Surface Calculations in Multiple Dimensions. J Chem Theory Comput 2021; 17:6671-6690. [PMID: 34623129 DOI: 10.1021/acs.jctc.1c00065] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/16/2022]
Abstract
We present a multitopology molecular fragmentation approach, based on graph theory, to calculate multidimensional potential energy surfaces in agreement with post-Hartree-Fock levels of theory but at the density functional theory cost. A molecular assembly is coarse-grained into a set of graph-theoretic nodes that are then connected with edges to represent a collection of locally interacting subsystems up to an arbitrary order. Each of the subsystems is treated at two levels of electronic structure theory, the result being used to construct many-body expansions that are embedded within an ONIOM scheme. These expansions converge rapidly with the many-body order (or graphical rank) of subsystems and capture many-body interactions accurately and efficiently. However, multiple graphs, and hence multiple fragmentation topologies, may be defined in molecular configuration space that may arise during conformational sampling or from reactive, bond breaking and bond formation, events. Obtaining the resultant potential surfaces is an exponential scaling proposition, given the number of electronic structure computations needed. We utilize a family of graph-theoretic representations within a variational scheme to obtain multidimensional potential surfaces at a reduced cost. The fast convergence of the graph-theoretic expansion with increasing order of many-body interactions alleviates the exponential scaling cost for computing potential surfaces, with the need to only use molecular fragments that contain a fewer number of quantum nuclear degrees of freedom compared to the full system. This is because the dimensionality of the conformational space sampled by the fragment subsystems is much smaller than the full molecular configurational space. Additionally, we also introduce a multidimensional clustering algorithm, based on physically defined criteria, to reduce the number of energy calculations by orders of magnitude. 
The molecular systems benchmarked include coupled proton motion in protonated water wires. The potential energy surfaces and multidimensional nuclear eigenstates obtained are shown to be in very good agreement with those from explicit post-Hartree-Fock calculations that become prohibitive as the number of quantum nuclear dimensions grows. The developments here provide a rigorous and efficient alternative to this important chemical physics problem.
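The many-body expansion that this abstract builds on can be illustrated in generic form. The following is a minimal sketch only, not the authors' graph-theoretic, ONIOM-embedded scheme: `toy_energy` is a hypothetical energy function with one-, two-, and three-body terms acting on 1D fragment positions, and `many_body_expansion` is a hypothetical helper that sums energy increments up to a chosen order by inclusion-exclusion.

```python
import math
from itertools import combinations


def toy_energy(points):
    """Hypothetical toy energy with 1-, 2-, and 3-body terms (illustration only)."""
    e = sum(0.1 * x * x for x in points)
    e += sum(-math.exp(-abs(a - b)) for a, b in combinations(points, 2))
    e += sum(
        0.05 * math.exp(-(abs(a - b) + abs(b - c) + abs(a - c)))
        for a, b, c in combinations(points, 3)
    )
    return e


def many_body_expansion(fragments, energy, max_order):
    """Sum the increments dE(S) over all fragment subsets S with |S| <= max_order."""
    increments = {}
    total = 0.0
    for order in range(1, max_order + 1):
        for subset in combinations(range(len(fragments)), order):
            # Energy of the sub-cluster minus the increments of all proper subsets.
            inc = energy([fragments[i] for i in subset])
            for k in range(1, order):
                for sub in combinations(subset, k):
                    inc -= increments[sub]
            increments[subset] = inc
            total += inc
    return total


fragments = [0.0, 0.9, 1.7, 2.8]  # assumed 1D fragment positions
full = toy_energy(fragments)
for order in (1, 2, 3, 4):
    approx = many_body_expansion(fragments, toy_energy, order)
    print(f"order {order}: error = {abs(approx - full):.2e}")
```

Because the toy energy contains nothing beyond three-body terms, the expansion is exact (to rounding) from third order onward, which mirrors the rapid convergence with many-body order that the abstract describes. The number of sub-cluster evaluations still grows combinatorially with the order, which is the cost that fragmentation schemes try to control.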
Affiliation(s)
- Anup Kumar
- Department of Chemistry and Department of Physics, Indiana University, 800 E. Kirkwood Avenue, Bloomington, Indiana 47405, United States
- Nicole DeGregorio
- Department of Chemistry and Department of Physics, Indiana University, 800 E. Kirkwood Avenue, Bloomington, Indiana 47405, United States
- Srinivasan S Iyengar
- Department of Chemistry and Department of Physics, Indiana University, 800 E. Kirkwood Avenue, Bloomington, Indiana 47405, United States
18
Xu M, Zhu T, Zhang JZH. Automatically Constructed Neural Network Potentials for Molecular Dynamics Simulation of Zinc Proteins. Front Chem 2021; 9:692200. [PMID: 34222200 PMCID: PMC8249736 DOI: 10.3389/fchem.2021.692200] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/07/2021] [Accepted: 05/10/2021] [Indexed: 11/13/2022] Open
Abstract
The development of accurate and efficient potential energy functions for the molecular dynamics simulation of metalloproteins has long been a great challenge for the theoretical chemistry community. An artificial neural network provides the possibility to develop potential energy functions with both the efficiency of the classical force fields and the accuracy of the quantum chemical methods. In this work, neural network potentials were automatically constructed by using the ESOINN-DP method for typical zinc proteins. For the four most common zinc coordination modes in proteins, the potential energy, atomic forces, and atomic charges predicted by neural network models show great agreement with quantum mechanics calculations and the neural network potential can maintain the coordination geometry correctly. In addition, MD simulation and energy optimization with the neural network potential can be readily used for structural refinement. The neural network potential is not limited by the function form and complex parameterization process, and important quantum effects such as polarization and charge transfer can be accurately considered. The algorithm proposed in this work can also be directly applied to proteins containing other metal ions.
Affiliation(s)
- Mingyuan Xu
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Key Laboratory of Green Chemistry and Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, China
- Tong Zhu
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Key Laboratory of Green Chemistry and Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, China
- John Z. H. Zhang
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Key Laboratory of Green Chemistry and Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai, China
- Department of Chemistry, New York University, New York, NY, United States
- Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, China
19
Vassilev-Galindo V, Fonseca G, Poltavsky I, Tkatchenko A. Challenges for machine learning force fields in reproducing potential energy surfaces of flexible molecules. J Chem Phys 2021; 154:094119. [DOI: 10.1063/5.0038516] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 12/12/2022] Open
Affiliation(s)
- Valentin Vassilev-Galindo
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Gregory Fonseca
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Igor Poltavsky
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
- Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
20
Abstract
We present a Perspective on what the future holds for full configuration interaction (FCI) theory, with an emphasis on conceptual rather than technical details. Upon revisiting the early history of FCI, a number of its key contemporary approximations are compared on as equal a footing as possible, using a recent blind challenge on the benzene molecule as a testbed [Eriksen et al., J. Phys. Chem. Lett., 2020, 11, 8922]. In the process, we review the scope of applications for which FCI continues to prove indispensable, and we discuss the traits that its modern approximations must satisfy in terms of robustness, efficacy, and reliability. We close by conveying a number of general observations on the merits offered by the state-of-the-art alongside some of the challenges still faced to this day. While the field has altogether seen immense progress over the years, the past decade in particular, it remains clear that our community as a whole has a substantial way to go in enhancing the overall applicability of near-exact electronic structure theory for systems of general composition and increasing size.
Affiliation(s)
- Janus J Eriksen
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, United Kingdom
21
Bogojeski M, Vogt-Maranto L, Tuckerman ME, Müller KR, Burke K. Quantum chemical accuracy from density functional approximations via machine learning. Nat Commun 2020; 11:5223. [PMID: 33067479 PMCID: PMC7567867 DOI: 10.1038/s41467-020-19093-1] [Citation(s) in RCA: 121] [Impact Index Per Article: 30.3] [Received: 06/01/2020] [Accepted: 09/24/2020] [Indexed: 12/21/2022] Open
Abstract
Kohn-Sham density functional theory (DFT) is a standard tool in most branches of chemistry, but accuracies for many molecules are limited to 2-3 kcal⋅mol⁻¹ with presently-available functionals. Ab initio methods, such as coupled-cluster, routinely produce much higher accuracy, but computational costs limit their application to small molecules. In this paper, we leverage machine learning to calculate coupled-cluster energies from DFT densities, reaching quantum chemical accuracy (errors below 1 kcal⋅mol⁻¹) on test data. Moreover, density-based Δ-learning (learning only the correction to a standard DFT calculation, termed Δ-DFT) significantly reduces the amount of training data required, particularly when molecular symmetries are included. The robustness of Δ-DFT is highlighted by correcting "on the fly" DFT-based molecular dynamics (MD) simulations of resorcinol (C₆H₄(OH)₂) to obtain MD trajectories with coupled-cluster accuracy. We conclude, therefore, that Δ-DFT facilitates running gas-phase MD simulations with quantum chemical accuracy, even for strained geometries and conformer changes where standard DFT fails.
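The Δ-learning idea in this abstract (fit only the difference between an expensive and a cheap method, then add it back to the cheap result) can be sketched on a toy problem. Everything below is an assumption made for illustration: `e_low` and `e_high` are hypothetical 1D stand-ins for DFT and coupled-cluster energies, the input is a single coordinate rather than a DFT density, and a simple Gaussian kernel smoother is swapped in for the regressor used in the paper.

```python
import math


def e_low(x):
    """Toy stand-in for a cheap, DFT-like energy (illustration only)."""
    return 0.5 * x * x


def e_high(x):
    """Toy stand-in for an expensive, coupled-cluster-like energy."""
    return 0.5 * x * x + 0.1 * math.sin(3.0 * x)


def fit_kernel_smoother(xs, ys, bandwidth=0.2):
    """Return a Nadaraya-Watson predictor: a Gaussian-weighted average of ys."""
    def predict(x):
        weights = [math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in xs]
        return sum(w * y for w, y in zip(weights, ys)) / sum(weights)
    return predict


# Train on the difference between the two levels, not on the total energy.
x_train = [-2.0 + 4.0 * i / 14 for i in range(15)]
delta_model = fit_kernel_smoother(
    x_train, [e_high(x) - e_low(x) for x in x_train]
)


def e_delta(x):
    """Cheap energy plus the learned correction."""
    return e_low(x) + delta_model(x)


x_test = [-1.9 + 3.8 * i / 49 for i in range(50)]
mae_low = sum(abs(e_low(x) - e_high(x)) for x in x_test) / len(x_test)
mae_delta = sum(abs(e_delta(x) - e_high(x)) for x in x_test) / len(x_test)
print(f"MAE, low level only:          {mae_low:.4f}")
print(f"MAE, with learned correction: {mae_delta:.4f}")
```

The design point is that the correction is smoother and smaller in magnitude than the total energy, so a modest number of high-level reference points may suffice to learn it; how well this carries over to real molecules depends on the descriptors and reference data.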
Affiliation(s)
- Mihail Bogojeski
- Machine Learning Group, Technische Universität Berlin, Marchstr. 23, 10587, Berlin, Germany
- Mark E Tuckerman
- Department of Chemistry, New York University, New York, NY, 10003, USA
- Courant Institute of Mathematical Science, New York University, New York, NY, 10012, USA
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, 3663 Zhongshan Road North, Shanghai, 200062, China
- Klaus-Robert Müller
- Machine Learning Group, Technische Universität Berlin, Marchstr. 23, 10587, Berlin, Germany
- Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul, 02841, Korea
- Max-Planck-Institut für Informatik, Stuhlsatzenhausweg, 66123, Saarbrücken, Germany
- Kieron Burke
- Department of Physics and Astronomy, University of California, Irvine, CA, 92697, USA
- Department of Chemistry, University of California, Irvine, CA, 92697, USA
22
Manzhos S, Carrington T. Neural Network Potential Energy Surfaces for Small Molecules and Reactions. Chem Rev 2020; 121:10187-10217. [PMID: 33021368 DOI: 10.1021/acs.chemrev.0c00665] [Citation(s) in RCA: 110] [Impact Index Per Article: 27.5] [Indexed: 12/16/2022]
Abstract
We review progress in neural network (NN)-based methods for the construction of interatomic potentials from discrete samples (such as ab initio energies) for applications in classical and quantum dynamics including reaction dynamics and computational spectroscopy. The main focus is on methods for building molecular potential energy surfaces (PES) in internal coordinates that explicitly include all many-body contributions, even though some of the methods we review limit the degree of coupling, due either to a desire to limit computational cost or to limited data. Explicit and direct treatment of all many-body contributions is only practical for sufficiently small molecules, which are therefore our primary focus. This includes small molecules on surfaces. We consider direct, single NN PES fitting as well as more complex methods that impose structure (such as a multibody representation) on the PES function, either through the architecture of one NN or by using multiple NNs. We show how NNs are effective in building representations with low-dimensional functions including dimensionality reduction. We consider NN-based approaches to build PESs in the sums-of-product form important for quantum dynamics, ways to treat symmetry, and issues related to sampling data distributions and the relation between PES errors and errors in observables. We highlight combinations of NNs with other ideas such as permutationally invariant polynomials or sums of environment-dependent atomic contributions, which have recently emerged as powerful tools for building highly accurate PESs for relatively large molecular and reactive systems.
Affiliation(s)
- Sergei Manzhos
- Centre Énergie Matériaux Télécommunications, Institut National de la Recherche Scientifique, 1650, Boulevard Lionel-Boulet, Varennes, Québec City, Québec J3X 1S2, Canada
- Tucker Carrington
- Chemistry Department, Queen's University, Kingston Ontario K7L 3N6, Canada
23
Sauceda HE, Gastegger M, Chmiela S, Müller KR, Tkatchenko A. Molecular force fields with gradient-domain machine learning (GDML): Comparison and synergies with classical force fields. J Chem Phys 2020; 153:124109. [DOI: 10.1063/5.0023005] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 12/12/2022] Open
Affiliation(s)
- Huziel E. Sauceda
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg, Luxembourg
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- BASLEARN, BASF-TU Joint Lab, Technische Universität Berlin, 10587 Berlin, Germany
- Michael Gastegger
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- BASLEARN, BASF-TU Joint Lab, Technische Universität Berlin, 10587 Berlin, Germany
- DFG Cluster of Excellence “Unifying Systems in Catalysis” (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
- Stefan Chmiela
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Klaus-Robert Müller
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul 136-713, South Korea
- Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123 Saarbrücken, Germany
- Google Research, Brain Team, Berlin, Germany
- Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg, Luxembourg
24
Sugisawa H, Ida T, Krems RV. Gaussian process model of 51-dimensional potential energy surface for protonated imidazole dimer. J Chem Phys 2020; 153:114101. [DOI: 10.1063/5.0023492] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/30/2022] Open
Affiliation(s)
- Hiroki Sugisawa
- Department of Chemistry, University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
- Division of Material Chemistry, Graduate School of Natural Science and Technology, Kanazawa University, Kakuma, Kanazawa 920-1192, Japan
- Tomonori Ida
- Division of Material Chemistry, Graduate School of Natural Science and Technology, Kanazawa University, Kakuma, Kanazawa 920-1192, Japan
- R. V. Krems
- Department of Chemistry, University of British Columbia, Vancouver, British Columbia V6T 1Z1, Canada
- Stewart Blusson Quantum Matter Institute, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada
25
Xie X, Persson KA, Small DW. Incorporating Electronic Information into Machine Learning Potential Energy Surfaces via Approaching the Ground-State Electronic Energy as a Function of Atom-Based Electronic Populations. J Chem Theory Comput 2020; 16:4256-4270. [PMID: 32502350 DOI: 10.1021/acs.jctc.0c00217] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Indexed: 11/29/2022]
Abstract
Machine learning (ML) approximations to density functional theory (DFT) potential energy surfaces (PESs) are showing great promise for reducing the computational cost of accurate molecular simulations, but at present, they are not applicable to varying electronic states, and in particular, they are not well suited for molecular systems in which the local electronic structure is sensitive to the medium to long-range electronic environment. With this issue as the focal point, we present a new machine learning approach called "BpopNN" for obtaining efficient approximations to DFT PESs. Conceptually, the methodology is based on approaching the true DFT energy as a function of electron populations on atoms; in practice, this is realized with available density functionals and constrained DFT (CDFT). The new approach creates approximations to this function with neural networks. These approximations thereby incorporate electronic information naturally into an ML approach, and optimizing the model energy with respect to populations allows the electronic terms to self-consistently adapt to the environment, as in DFT. We confirm the effectiveness of this approach with a variety of calculations on LiₙHₙ clusters.
Affiliation(s)
- Xiaowei Xie
- Department of Chemistry, University of California, Berkeley, California 94720, United States
- Energy Technologies Area, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
- Kristin A Persson
- Energy Technologies Area, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
- Department of Materials Science and Engineering, University of California, Berkeley, California 94720, United States
- David W Small
- Department of Chemistry, University of California, Berkeley, California 94720, United States
- Molecular Graphics and Computation Facility, College of Chemistry, University of California, Berkeley, California 94720, United States
26
Yanes-Rodríguez R, Arismendi-Arrieta DJ, Prosmiti R. He Inclusion in Ice-like and Clathrate-like Frameworks: A Benchmark Quantum Chemistry Study of Guest-Host Interactions. J Chem Inf Model 2020; 60:3043-3056. [PMID: 32469514 DOI: 10.1021/acs.jcim.0c00349] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/24/2022]
Abstract
Energetics and structural properties of selected type and size He@hydrate frameworks, e.g., from regular structured ice channels to clathrate-like cages, are presented from first-principles quantum chemistry methods. The scarcity of information on He@hydrates makes such complexes challenging targets, while their computational study entails an interesting and arduous task. Some of them have been synthesized in the laboratory, which motivates further investigations on their stability. Hence, the main focus is to examine the performance and accuracy of different wave function-based electronic structure methods, such as MP2, CCSD(T), their explicitly correlated (F12) and domain-based local pair-natural orbital (DLPNO) analogs, as well as modern and conventional density functional theory (DFT) approaches, and analytical model potentials available. Different structures are considered, starting from the "simplest system" formed by a noble gas atom (such as He) and one water molecule, followed by the study of the "fundamental units" present in all ice-like and clathrate-like frameworks (such as pentamers and hexamers) and finally the description of interactions in the "building blocks" of three-dimensional (3D) ice channels (e.g., horizontal and perpendicular ice II and Ih) and clathrate-like cages, such as the 5¹² present in the most common sI, sII, and sH clathrate-hydrate structures. The idea is to provide well-converged DLPNO-CCSD(T) and DFMP2/CBS reference datasets that in turn are used to validate how DFT functionals (in total, 29 approaches from generalized-gradient approximation (GGA), meta-GGA, to hybrid and range-separated functionals, including dispersion correction treatments, were checked) and analytical semiempirical/ab initio-based potentials perform compared with high-level alternatives. Within all tested approaches, those best-performing were identified and classified.
Most of the DFT/DFT-D functionals, as well as available analytical pairwise model potentials, face difficulties in describing both hydrogen-bonded water frameworks and dispersion-bound He-water interactions. Including dispersion corrections yields an overall well-balanced performance for LCωPBE-D3BJ and PBE0-D4 functionals. Such benchmark datasets can benefit research into the development of new cheminformatics models, as they can serve to guide and cross-check methodologies, lending increased predictive power to future molecular simulations for investigating the role of structures and phase transitions from nanoscale clusters to macroscopic crystalline structures.
Affiliation(s)
- Daniel J Arismendi-Arrieta
- Institute of Fundamental Physics (IFF-CSIC), CSIC, Serrano 123, 28006 Madrid, Spain
- Donostia International Physics Center (DIPC), Paseo Manuel de Lardizabal 4, Gipuzkoa, 20018 Donostia-San Sebastián, Spain
- Rita Prosmiti
- Institute of Fundamental Physics (IFF-CSIC), CSIC, Serrano 123, 28006 Madrid, Spain
27
Muratov EN, Bajorath J, Sheridan RP, Tetko IV, Filimonov D, Poroikov V, Oprea TI, Baskin II, Varnek A, Roitberg A, Isayev O, Curtarolo S, Fourches D, Cohen Y, Aspuru-Guzik A, Winkler DA, Agrafiotis D, Cherkasov A, Tropsha A. QSAR without borders. Chem Soc Rev 2020; 49:3525-3564. [PMID: 32356548 PMCID: PMC8008490 DOI: 10.1039/d0cs00098a] [Citation(s) in RCA: 312] [Impact Index Per Article: 78.0] [Indexed: 12/15/2022]
Abstract
Prediction of chemical bioactivity and physical properties has been one of the most important applications of statistical and more recently, machine learning and artificial intelligence methods in chemical sciences. This field of research, broadly known as quantitative structure-activity relationships (QSAR) modeling, has developed many important algorithms and has found a broad range of applications in physical organic and medicinal chemistry in the past 55+ years. This Perspective summarizes recent technological advances in QSAR modeling but it also highlights the applicability of algorithms, modeling methods, and validation practices developed in QSAR to a wide range of research areas outside of traditional QSAR boundaries including synthesis planning, nanotechnology, materials science, biomaterials, and clinical informatics. As modern research methods generate rapidly increasing amounts of data, the knowledge of robust data-driven modelling methods professed within the QSAR field can become essential for scientists working both within and outside of chemical research. We hope that this contribution highlighting the generalizable components of QSAR modeling will serve to address this challenge.
Affiliation(s)
- Eugene N Muratov
- UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, USA.
28
Cheng Z, Zhao D, Ma J, Li W, Li S. An On-the-Fly Approach to Construct Generalized Energy-Based Fragmentation Machine Learning Force Fields of Complex Systems. J Phys Chem A 2020; 124:5007-5014. [DOI: 10.1021/acs.jpca.0c04526] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 01/25/2023]
Affiliation(s)
- Zheng Cheng
- Institute of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, People’s Republic of China
- Dongbo Zhao
- Institute of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, People’s Republic of China
- Kuang Yaming Honors School, Nanjing University, Nanjing 210023, People’s Republic of China
- Jing Ma
- Institute of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, People’s Republic of China
- Wei Li
- Institute of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, People’s Republic of China
- Shuhua Li
- Institute of Theoretical and Computational Chemistry, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing 210023, People’s Republic of China
29
Whitfield TW, Ragland DA, Zeldovich KB, Schiffer CA. Characterizing Protein-Ligand Binding Using Atomistic Simulation and Machine Learning: Application to Drug Resistance in HIV-1 Protease. J Chem Theory Comput 2020; 16:1284-1299. [PMID: 31877249 DOI: 10.1021/acs.jctc.9b00781] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 01/24/2023]
Abstract
Over the past several decades, atomistic simulations of biomolecules, whether carried out using molecular dynamics or Monte Carlo techniques, have provided detailed insights into their function. Comparing the results of such simulations for a few closely related systems has guided our understanding of the mechanisms by which changes such as ligand binding or mutation can alter the function. The general problem of detecting and interpreting such mechanisms from simulations of many related systems, however, remains a challenge. This problem is addressed here by applying supervised and unsupervised machine learning techniques to a variety of thermodynamic observables extracted from molecular dynamics simulations of different systems. As an important test case, these methods are applied to understand the evasion by human immunodeficiency virus type-1 (HIV-1) protease of darunavir, a potent inhibitor to which resistance can develop via the simultaneous mutation of multiple amino acids. Complex mutational patterns have been observed among resistant strains, presenting a challenge to developing a mechanistic picture of resistance in the protease. In order to dissect these patterns and gain mechanistic insight into the role of specific mutations, molecular dynamics simulations were carried out on a collection of HIV-1 protease variants, chosen to include highly resistant strains and susceptible controls, in complex with darunavir. Using a machine learning approach that takes advantage of the hierarchical nature in the relationships among the sequence, structure, and function, an integrative analysis of these trajectories reveals key details of the resistance mechanism, including changes in the protein structure, hydrogen bonding, and protein-ligand contacts.
Affiliation(s)
- Troy W Whitfield
- Department of Medicine, University of Massachusetts Medical School, Worcester, Massachusetts 01605, United States
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, United States
| | - Debra A Ragland
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, United States
| | - Konstantin B Zeldovich
- Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, United States
| | - Celia A Schiffer
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, United States
| |
|
30
|
Building Nonparametric n-Body Force Fields Using Gaussian Process Regression. MACHINE LEARNING MEETS QUANTUM PHYSICS 2020. [DOI: 10.1007/978-3-030-40245-7_5] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 01/03/2023]
|
31
|
Sauceda HE, Chmiela S, Poltavsky I, Müller KR, Tkatchenko A. Construction of Machine Learned Force Fields with Quantum Chemical Accuracy: Applications and Chemical Insights. MACHINE LEARNING MEETS QUANTUM PHYSICS 2020. [DOI: 10.1007/978-3-030-40245-7_14] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 01/02/2023]
|
32
|
Accurate Molecular Dynamics Enabled by Efficient Physically Constrained Machine Learning Approaches. MACHINE LEARNING MEETS QUANTUM PHYSICS 2020. [DOI: 10.1007/978-3-030-40245-7_7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/03/2022]
|
33
|
Brown SE. From ab initio data to high-dimensional potential energy surfaces: A critical overview and assessment of the development of permutationally invariant polynomial potential energy surfaces for single molecules. J Chem Phys 2019; 151:194111. [PMID: 31757150 DOI: 10.1063/1.5123999] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022]
Abstract
The representation of high-dimensional potential energy surfaces by way of the many-body expansion and permutationally invariant polynomials has become a well-established tool for improving the resolution and extending the scope of molecular simulations. The high level of accuracy that can be attained by these potential energy functions (PEFs) is due in large part to their specificity: for each term in the many-body expansion, a species-specific training set must be generated at the desired level of theory and a number of fits attempted in order to obtain a robust and reliable PEF. In this work, we attempt to characterize the numerical aspects of the fitting problem, addressing questions which are of simultaneous practical and fundamental importance. These include concrete illustrations of the nonconvexity of the problem, the ill-conditionedness of the linear system to be solved and possible need for regularization, the sensitivity of the solutions to the characteristics of the training set, and limitations of the approach with respect to accuracy and the types of molecules that can be treated. In addition, we introduce a general approach to the generation of training set configurations based on the familiar harmonic approximation and evaluate the possible benefits to the use of quasirandom sequences for sampling configuration space in this context. Using sulfate as a case study, the findings are largely generalizable and expected to ultimately facilitate the efficient development of PIP-based many-body PEFs for general systems via automation.
Affiliation(s)
- Sandra E Brown
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
| |
|
34
|
Application of Computational Biology and Artificial Intelligence Technologies in Cancer Precision Drug Discovery. BIOMED RESEARCH INTERNATIONAL 2019; 2019:8427042. [PMID: 31886259 PMCID: PMC6925679 DOI: 10.1155/2019/8427042] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Received: 08/30/2019] [Accepted: 10/14/2019] [Indexed: 02/08/2023]
Abstract
Artificial intelligence (AI) proves to have enormous potential in many areas of healthcare including research and chemical discoveries. Using large amounts of aggregated data, the AI can discover and learn further transforming these data into “usable” knowledge. Being well aware of this, the world's leading pharmaceutical companies have already begun to use artificial intelligence to improve their research regarding new drugs. The goal is to exploit modern computational biology and machine learning systems to predict the molecular behaviour and the likelihood of getting a useful drug, thus saving time and money on unnecessary tests. Clinical studies, electronic medical records, high-resolution medical images, and genomic profiles can be used as resources to aid drug development. Pharmaceutical and medical researchers have extensive data sets that can be analyzed by strong AI systems. This review focused on how computational biology and artificial intelligence technologies can be implemented by integrating the knowledge of cancer drugs, drug resistance, next-generation sequencing, genetic variants, and structural biology in the cancer precision drug discovery.
|
35
|
Song Q, Zhang Q, Meng Q. Neural-network potential energy surface with small database and high precision: A benchmark of the H + H2 system. J Chem Phys 2019; 151:114302. [PMID: 31542037 DOI: 10.1063/1.5118692] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 11/14/2022]
Affiliation(s)
- Qingfei Song
- Department of Applied Chemistry, Northwestern Polytechnical University, West Youyi Road 127, 710072 Xi’an, China
- Ministry-of-Education Key Laboratory of Materials Physics and Chemistry under Extraordinary Conditions, Northwestern Polytechnical University, West Youyi Road 127, 710072 Xi’an, China
| | - Qiuyu Zhang
- Department of Applied Chemistry, Northwestern Polytechnical University, West Youyi Road 127, 710072 Xi’an, China
- Ministry-of-Education Key Laboratory of Materials Physics and Chemistry under Extraordinary Conditions, Northwestern Polytechnical University, West Youyi Road 127, 710072 Xi’an, China
| | - Qingyong Meng
- Department of Applied Chemistry, Northwestern Polytechnical University, West Youyi Road 127, 710072 Xi’an, China
- Ministry-of-Education Key Laboratory of Materials Physics and Chemistry under Extraordinary Conditions, Northwestern Polytechnical University, West Youyi Road 127, 710072 Xi’an, China
| |
|
36
|
Herr JE, Koh K, Yao K, Parkhill J. Compressing physics with an autoencoder: Creating an atomic species representation to improve machine learning models in the chemical sciences. J Chem Phys 2019; 151:084103. [DOI: 10.1063/1.5108803] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Indexed: 02/06/2023]
Affiliation(s)
- John E. Herr
- Department of Chemistry and Biochemistry, The University of Notre Dame du Lac, 251 Nieuwland Science Hall, Notre Dame, Indiana 46556, USA
| | - Kevin Koh
- Department of Chemistry and Biochemistry, The University of Notre Dame du Lac, 251 Nieuwland Science Hall, Notre Dame, Indiana 46556, USA
| | - Kun Yao
- Department of Chemistry and Biochemistry, The University of Notre Dame du Lac, 251 Nieuwland Science Hall, Notre Dame, Indiana 46556, USA
| | - John Parkhill
- Department of Chemistry and Biochemistry, The University of Notre Dame du Lac, 251 Nieuwland Science Hall, Notre Dame, Indiana 46556, USA
| |
|
37
|
Jinich A, Sanchez-Lengeling B, Ren H, Harman R, Aspuru-Guzik A. A Mixed Quantum Chemistry/Machine Learning Approach for the Fast and Accurate Prediction of Biochemical Redox Potentials and Its Large-Scale Application to 315 000 Redox Reactions. ACS CENTRAL SCIENCE 2019; 5:1199-1210. [PMID: 31404220 PMCID: PMC6661861 DOI: 10.1021/acscentsci.9b00297] [Citation(s) in RCA: 41] [Impact Index Per Article: 8.2] [Received: 03/24/2019] [Indexed: 05/05/2023]
Abstract
A quantitative understanding of the thermodynamics of biochemical reactions is essential for accurately modeling metabolism. The group contribution method (GCM) is one of the most widely used approaches to estimate standard Gibbs energies and redox potentials of reactions for which no experimental measurements exist. Previous work has shown that quantum chemical predictions of biochemical thermodynamics are a promising approach to overcome the limitations of GCM. However, the quantum chemistry approach is significantly more expensive. Here, we use a combination of quantum chemistry and machine learning to obtain a fast and accurate method for predicting the thermodynamics of biochemical redox reactions. We focus on predicting the redox potentials of carbonyl functional group reductions to alcohols and amines, two of the most ubiquitous carbon redox transformations in biology. Our method relies on semiempirical quantum chemistry calculations calibrated with Gaussian process (GP) regression against available experimental data and results in higher predictive power than the GCM at low computational cost. Direct calibration of GCM and fingerprint-based predictions (without quantum chemistry) with GP regression also results in significant improvements in prediction accuracy, demonstrating the versatility of the approach. We design and implement a network expansion algorithm that iteratively reduces and oxidizes a set of natural seed metabolites and demonstrate the high-throughput applicability of our method by predicting the standard potentials of more than 315 000 redox reactions involving approximately 70 000 compounds. Additionally, we developed a novel fingerprint-based framework for detecting molecular environment motifs that are enriched or depleted across different regions of the redox potential landscape. We provide open access to all source code and data generated.
Affiliation(s)
- Adrian Jinich
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States
- Division of Infectious Diseases, Weill Department of Medicine, Weill-Cornell Medical College, New York, New York 10065, United States
| | - Benjamin Sanchez-Lengeling
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States
| | - Haniu Ren
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States
| | - Rebecca Harman
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138, United States
| | - Alán Aspuru-Guzik
- Department of Chemistry and Department of Computer Science, University of Toronto, 80 St. George Street, Toronto, Ontario M5S 3H6, Canada
- Vector Institute, Toronto, Ontario M5G 1M1, Canada
- Biologically-Inspired Solar Energy Program, Canadian Institute for Advanced Research (CIFAR), Toronto, Ontario M5S 1M1, Canada
| |
|
38
|
Xu M, Zhu T, Zhang JZH. Molecular Dynamics Simulation of Zinc Ion in Water with an ab Initio Based Neural Network Potential. J Phys Chem A 2019; 123:6587-6595. [DOI: 10.1021/acs.jpca.9b04087] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Indexed: 11/30/2022]
Affiliation(s)
- Mingyuan Xu
- State Key Lab of Precision Spectroscopy, Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
| | - Tong Zhu
- State Key Lab of Precision Spectroscopy, Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
| | - John Z. H. Zhang
- State Key Lab of Precision Spectroscopy, Shanghai Engineering Research Center of Molecular Therapeutics & New Drug Development, Shanghai Key Laboratory of Green Chemistry & Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
- Department of Chemistry, New York University, New York City, New York 10003, United States
- Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi 030006, China
| |
|
39
|
Yang X, Wang Y, Byrne R, Schneider G, Yang S. Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery. Chem Rev 2019; 119:10520-10594. [PMID: 31294972 DOI: 10.1021/acs.chemrev.8b00728] [Citation(s) in RCA: 343] [Impact Index Per Article: 68.6] [Indexed: 02/05/2023]
Abstract
Artificial intelligence (AI), and, in particular, deep learning as a subcategory of AI, provides opportunities for the discovery and development of innovative drugs. Various machine learning approaches have recently (re)emerged, some of which may be considered instances of domain-specific AI which have been successfully employed for drug discovery and design. This review provides a comprehensive portrayal of these machine learning techniques and of their applications in medicinal chemistry. After introducing the basic principles, alongside some application notes, of the various machine learning algorithms, the current state-of-the art of AI-assisted pharmaceutical discovery is discussed, including applications in structure- and ligand-based virtual screening, de novo drug design, physicochemical and pharmacokinetic property prediction, drug repurposing, and related aspects. Finally, several challenges and limitations of the current methods are summarized, with a view to potential future directions for AI-assisted drug discovery and design.
Affiliation(s)
- Xin Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
| | - Yifei Wang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
| | - Ryan Byrne
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
| | - Gisbert Schneider
- ETH Zurich, Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland
| | - Shengyong Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, Chengdu, Sichuan 610041, China
| |
|
40
|
Willatt MJ, Musil F, Ceriotti M. Atom-density representations for machine learning. J Chem Phys 2019; 150:154110. [DOI: 10.1063/1.5090481] [Citation(s) in RCA: 91] [Impact Index Per Article: 18.2] [Indexed: 12/20/2022]
Affiliation(s)
- Michael J. Willatt
- Laboratory of Computational Science and Modeling, Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Félix Musil
- Laboratory of Computational Science and Modeling, Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- National Center for Computational Design and Discovery of Novel Materials (MARVEL), Lausanne, Switzerland
| | - Michele Ceriotti
- Laboratory of Computational Science and Modeling, Institute of Materials, École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| |
|
41
|
Sauceda HE, Chmiela S, Poltavsky I, Müller KR, Tkatchenko A. Molecular force fields with gradient-domain machine learning: Construction and application to dynamics of small molecules with coupled cluster forces. J Chem Phys 2019; 150:114102. [DOI: 10.1063/1.5078687] [Citation(s) in RCA: 56] [Impact Index Per Article: 11.2] [Indexed: 11/14/2022]
Affiliation(s)
- Huziel E. Sauceda
- Fritz-Haber-Institut der Max-Planck-Gesellschaft, 14195 Berlin, Germany
| | - Stefan Chmiela
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
| | - Igor Poltavsky
- Physics and Materials Science Research Unit, University of Luxembourg, L-1511 Luxembourg, Luxembourg
| | - Klaus-Robert Müller
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Department of Brain and Cognitive Engineering, Korea University, Anam-dong, Seongbuk-gu, Seoul 02841, South Korea
- Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123 Saarbrücken, Germany
| | - Alexandre Tkatchenko
- Physics and Materials Science Research Unit, University of Luxembourg, L-1511 Luxembourg, Luxembourg
| |
|
42
|
Desgranges C, Delhommelle J. Determination of mixture properties via a combined Expanded Wang-Landau simulations-Machine Learning approach. Chem Phys Lett 2019. [DOI: 10.1016/j.cplett.2018.11.009] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 10/27/2022]
|
43
|
Nakata H, Fedorov DG. Simulations of infrared and Raman spectra in solution using the fragment molecular orbital method. Phys Chem Chem Phys 2019; 21:13641-13652. [DOI: 10.1039/c9cp00940j] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 01/17/2023]
Abstract
Calculation of IR and Raman spectra in solution for large molecular systems made possible with analytic FMO/PCM Hessians.
Affiliation(s)
| | - Dmitri G. Fedorov
- Research Center for Computational Design of Advanced Functional Materials (CD-FMat), National Institute of Advanced Industrial Science and Technology (AIST), Tsukuba, Japan
| |
|
44
|
Elton DC, Fritz M, Fernández-Serra M. Using a monomer potential energy surface to perform approximate path integral molecular dynamics simulation of ab initio water at near-zero added cost. Phys Chem Chem Phys 2018; 21:409-417. [PMID: 30534683 DOI: 10.1039/c8cp06077k] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 01/07/2023]
Abstract
It is now established that nuclear quantum motion plays an important role in determining water's hydrogen bonding, structure, and dynamics. Such effects are important to include in density functional theory (DFT) based molecular dynamics simulation of water. The standard way of treating nuclear quantum effects, path integral molecular dynamics (PIMD), multiplies the number of energy/force calculations by the number of beads required. In this work we introduce a method whereby PIMD can be incorporated into a DFT simulation with little extra cost and little loss in accuracy. The method is based on the many body expansion of the energy and has the benefit of including a monomer level correction to the DFT energy. Our method calculates intramolecular forces using the highly accurate monomer potential energy surface developed by Partridge-Schwenke, which is cheap to evaluate. Intermolecular forces and energies are calculated with DFT only once per timestep using the centroid positions. We show how our method may be used in conjunction with a multiple time step algorithm for an additional speedup and how it relates to ring polymer contraction and other schemes that have been introduced recently to speed up PIMD simulations. We show that our method, which we call "monomer PIMD", correctly captures changes in the structure of water found in a full PIMD simulation but at much lower computational cost.
Affiliation(s)
- Daniel C Elton
- Department of Physics and Astronomy, Stony Brook University, Stony Brook, New York 11794-3800, USA.
|
45
|
Chen WK, Liu XY, Fang WH, Dral PO, Cui G. Deep Learning for Nonadiabatic Excited-State Dynamics. J Phys Chem Lett 2018; 9:6702-6708. [PMID: 30403870 DOI: 10.1021/acs.jpclett.8b03026] [Citation(s) in RCA: 97] [Impact Index Per Article: 16.2] [Indexed: 05/06/2023]
Abstract
In this work we show that deep learning (DL) can be used for exploring complex and highly nonlinear multistate potential energy surfaces of polyatomic molecules and related nonadiabatic dynamics. Our DL is based on deep neural networks (DNNs), which are used as accurate representations of the CASSCF ground- and excited-state potential energy surfaces (PESs) of CH2NH. After geometries near conical intersection are included in the training set, the DNN models accurately reproduce excited-state topological structures; photoisomerization paths; and, importantly, conical intersections. We have also demonstrated that the results from nonadiabatic dynamics run with the DNN models are very close to those from the dynamics run with the pure ab initio method. The present work should encourage further studies of using machine learning methods to explore excited-state potential energy surfaces and nonadiabatic dynamics of polyatomic molecules.
Affiliation(s)
- Wen-Kai Chen
- Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China
| | - Xiang-Yang Liu
- Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China
| | - Wei-Hai Fang
- Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China
| | - Pavlo O Dral
- Max-Planck-Institut für Kohlenforschung, Kaiser-Wilhelm-Platz 1, 45470 Mülheim an der Ruhr, Germany
| | - Ganglong Cui
- Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China
| |
|
46
|
Krylov A, Windus TL, Barnes T, Marin-Rimoldi E, Nash JA, Pritchard B, Smith DGA, Altarawy D, Saxe P, Clementi C, Crawford TD, Harrison RJ, Jha S, Pande VS, Head-Gordon T. Perspective: Computational chemistry software and its advancement as illustrated through three grand challenge cases for molecular science. J Chem Phys 2018; 149:180901. [DOI: 10.1063/1.5052551] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Indexed: 11/14/2022]
Affiliation(s)
- Anna Krylov
- Department of Chemistry, University of Southern California, Los Angeles, California 90089, USA
| | - Theresa L. Windus
- Department of Chemistry, Iowa State University, Ames, Iowa 50011, USA
| | - Taylor Barnes
- Molecular Sciences Software Institute, Blacksburg, Virginia 24061, USA
| | | | - Jessica A. Nash
- Molecular Sciences Software Institute, Blacksburg, Virginia 24061, USA
| | | | | | - Doaa Altarawy
- Molecular Sciences Software Institute, Blacksburg, Virginia 24061, USA
| | - Paul Saxe
- Molecular Sciences Software Institute, Blacksburg, Virginia 24061, USA
| | - Cecilia Clementi
- Department of Chemistry and Center for Theoretical Biological Physics, Rice University, 6100 Main Street, Houston, Texas 77005, USA
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 6, 14195 Berlin, Germany
| | | | - Robert J. Harrison
- Institute for Advanced Computational Science, Stony Brook University, Stony Brook, New York 11794, USA
| | - Shantenu Jha
- Electrical and Computer Engineering, Rutgers The State University of New Jersey, Piscataway, New Jersey 08854, USA
| | - Vijay S. Pande
- Department of Bioengineering, Stanford University, Stanford, California 94305, USA
| | - Teresa Head-Gordon
- Department of Chemistry, Department of Bioengineering, Department of Chemical and Biomolecular Engineering, Pitzer Center for Theoretical Chemistry, University of California, Berkeley, California 94720, USA
| |
|
47
|
Shang C, Huang SD, Liu ZP. Massively parallelization strategy for material simulation using high-dimensional neural network potential. J Comput Chem 2018; 40:1091-1096. [DOI: 10.1002/jcc.25636] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 07/08/2018] [Revised: 08/28/2018] [Accepted: 09/09/2018] [Indexed: 01/15/2023]
Affiliation(s)
- Cheng Shang
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science (Ministry of Education), Department of Chemistry; Fudan University; Shanghai 200433 China
| | - Si-Da Huang
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science (Ministry of Education), Department of Chemistry; Fudan University; Shanghai 200433 China
| | - Zhi-Pan Liu
- Collaborative Innovation Center of Chemistry for Energy Material, Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science (Ministry of Education), Department of Chemistry; Fudan University; Shanghai 200433 China
| |
|
48
|
Nakata H, Fedorov DG. Analytic second derivatives for the efficient electrostatic embedding in the fragment molecular orbital method. J Comput Chem 2018; 39:2039-2050. [DOI: 10.1002/jcc.25360] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 03/20/2018] [Revised: 04/27/2018] [Accepted: 04/29/2018] [Indexed: 01/09/2023]
Affiliation(s)
- Hiroya Nakata
- Department of Fundamental Technology Research; Research and Development Center Kagoshima, Kyocera, 1-4 Kokubu Yamashita-cho; Kirishima-shi Kagoshima, 899-4312 Japan
| | - Dmitri G. Fedorov
- Research Center for Computational Design of Advanced Functional Materials (CD-FMat), National Institute of Advanced Industrial Science and Technology (AIST), 1-1-1 Umezono; Tsukuba Ibaraki, 305-8568 Japan
| |
|
49
|
Habiboglu MG, Coskuner-Weber O. Quantum Chemistry Meets Deep Learning for Complex Carbohydrate and Glycopeptide Species I. Z PHYS CHEM 2018. [DOI: 10.1515/zpch-2018-1251] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 11/15/2022]
Abstract
Carbohydrate complexes are crucial in many various biological and medicinal processes. The impacts of N-acetyl on the glycosidic linkage flexibility of methyl β-D-glucopyranose, and of the glycoamino acid β-D-glucopyranose-asparagine are poorly understood at the electronic level. Furthermore, the effect of D- and L-isomers of asparagine in the complexes of N-acetyl-β-D-glucopyranose-(L)-asparagine and N-acetyl-β-D-glucopyranose-(D)-asparagine is unknown. In this study, we performed density functional theory calculations of methyl β-D-glucopyranose, methyl N-acetyl-β-D-glucopyranose, and of glycoamino acids β-D-glucopyranose-asparagine, N-acetyl-β-D-glucopyranose-(L)-asparagine and N-acetyl-β-D-glucopyranose-(D)-asparagine for studying their linkage flexibilities, total solvated energies, thermochemical properties and intra-molecular hydrogen bond formations in an aqueous solution environment using the COnductor-like Screening MOdel (COSMO) for water. We linked these density functional theory calculations to deep learning via estimating the total solvated energy of each linkage torsional angle value. Our results show that deep learning methods accurately estimate the total solvated energies of complex carbohydrate and glycopeptide species and provide linkage flexibility trends for methyl β-D-glucopyranose, methyl N-acetyl-β-D-glucopyranose, and of glycoamino acids β-D-glucopyranose-asparagine, N-acetyl-β-D-glucopyranose-(L)-asparagine and N-acetyl-β-D-glucopyranose-(D)-asparagine in agreement with density functional theory results. To the best of our knowledge, this study represents the first application of density functional theory along with deep learning for complex carbohydrate and glycopeptide species in an aqueous solution medium. In addition, this study shows that a few thousands of optimization frames from DFT calculations are enough for accurate estimations by deep learning tools.
Affiliation(s)
- M. Gokhan Habiboglu
- Türkisch-Deutsche Universität, Electrical and Electronics Engineering Department, Sahinkaya Caddesi, No. 86, Beykoz, Istanbul 34820, Turkey
| | - Orkid Coskuner-Weber
- Türkisch-Deutsche Universität, Molecular Biotechnology, Sahinkaya Caddesi, No. 86, Beykoz, Istanbul 34820, Turkey
- National Institute of Standards and Technology, Biochemical Reference Data Division, 100 Bureau Drive, Gaithersburg, MD 20899, USA
| |
|
50
|
Towards exact molecular dynamics simulations with machine-learned force fields. Nat Commun 2018; 9:3887. [PMID: 30250077 PMCID: PMC6155327 DOI: 10.1038/s41467-018-06169-2] [Citation(s) in RCA: 318] [Impact Index Per Article: 53.0] [Received: 03/24/2018] [Accepted: 08/22/2018] [Indexed: 12/25/2022] Open
Abstract
Molecular dynamics (MD) simulations employing classical force fields constitute the cornerstone of contemporary atomistic modeling in chemistry, biology, and materials science. However, the predictive power of these simulations is only as good as the underlying interatomic potential. Classical potentials often fail to faithfully capture key quantum effects in molecules and materials. Here we enable the direct construction of flexible molecular force fields from high-level ab initio calculations by incorporating spatial and temporal physical symmetries into a gradient-domain machine learning (sGDML) model in an automatic data-driven way. The developed sGDML approach faithfully reproduces global force fields at the quantum-chemical CCSD(T) level of accuracy and allows converged molecular dynamics simulations with fully quantized electrons and nuclei. We present MD simulations for flexible molecules with up to a few dozen atoms and provide insights into the dynamical behavior of these molecules. Our approach provides the key missing ingredient for achieving spectroscopic accuracy in molecular simulations. Simultaneous accurate and efficient prediction of molecular properties relies on combined quantum mechanics and machine learning approaches. Here the authors develop a flexible machine-learning force field with high-level accuracy for molecular dynamics simulations.
|