1. Li ZL, Pei S, Chen Z, Huang TY, Wang XD, Shen L, Chen X, Wang QQ, Wang DX, Ao YF. Machine learning-assisted amidase-catalytic enantioselectivity prediction and rational design of variants for improving enantioselectivity. Nat Commun 2024; 15:8778. [PMID: 39389964 PMCID: PMC11467325 DOI: 10.1038/s41467-024-53048-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2024] [Accepted: 09/30/2024] [Indexed: 10/12/2024] Open
Abstract
Biocatalysis is an attractive approach for the synthesis of chiral pharmaceuticals and fine chemicals, but assessing and/or improving the enantioselectivity of a biocatalyst towards target substrates is often time- and resource-intensive. Although machine learning has been used to reveal the underlying relationship between protein sequences and biocatalytic enantioselectivity, the establishment of substrate fitness space is usually disregarded by chemists and remains a challenge. Using 240 datasets collected in our previous works, we adopt chemistry and geometry descriptors and build random forest classification models for predicting the enantioselectivity of amidase towards new substrates. We further propose a heuristic strategy based on these models, by which rational protein engineering can be performed efficiently to synthesize chiral compounds with higher ee values; the optimized variant gives a 53-fold higher E-value than the wild-type amidase. This data-driven methodology is expected to broaden the application of machine learning in biocatalysis research.
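For readers unfamiliar with the two selectivity measures named in this abstract, the following is a minimal sketch of how an enantiomeric excess (ee) and an enantiomeric ratio (E-value) are commonly computed for an irreversible kinetic resolution. The abstract does not say how the authors obtained their E-values, so the relation used here (E from conversion and product ee) is an assumption, and the function names and input numbers are purely illustrative rather than data from the paper.

```python
import math


def enantiomeric_excess(major: float, minor: float) -> float:
    """Return ee as a fraction: (major - minor) / (major + minor)."""
    total = major + minor
    if total <= 0:
        raise ValueError("enantiomer amounts must sum to a positive value")
    return (major - minor) / total


def e_value(conversion: float, ee_product: float) -> float:
    """Return E = ln[1 - c(1 + ee_p)] / ln[1 - c(1 - ee_p)].

    Both arguments are fractions strictly between 0 and 1.
    This is an assumed, commonly used relation, not one quoted from the paper.
    """
    if not (0.0 < conversion < 1.0 and 0.0 < ee_product < 1.0):
        raise ValueError("conversion and ee_product must lie strictly between 0 and 1")
    upper = 1.0 - conversion * (1.0 + ee_product)
    lower = 1.0 - conversion * (1.0 - ee_product)
    if upper <= 0.0:
        raise ValueError("c * (1 + ee_p) must be below 1 for this relation")
    return math.log(upper) / math.log(lower)


# Illustrative numbers only (hypothetical, not taken from the paper).
ee_example = enantiomeric_excess(95.0, 5.0)
e_example = e_value(0.40, ee_example)
print(f"ee = {ee_example:.2f}, E = {e_example:.1f}")
```

At a fixed conversion, E rises steeply as the product ee approaches 1, which is why a modest gain in ee can correspond to a large fold change in E.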
Affiliation(s)
- Zi-Lin Li
  - Beijing National Laboratory for Molecular Sciences, CAS Key Laboratory of Molecular Recognition and Function, Institute of Chemistry, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Shuxin Pei
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing, China
- Ziying Chen
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing, China
- Teng-Yu Huang
  - Beijing National Laboratory for Molecular Sciences, CAS Key Laboratory of Molecular Recognition and Function, Institute of Chemistry, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Xu-Dong Wang
  - Beijing National Laboratory for Molecular Sciences, CAS Key Laboratory of Molecular Recognition and Function, Institute of Chemistry, Chinese Academy of Sciences, Beijing, China
- Lin Shen
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing, China
  - Yantai-Jingshi Institute of Material Genome Engineering, Yantai, China
- Xuebo Chen
  - Key Laboratory of Theoretical and Computational Photochemistry of Ministry of Education, College of Chemistry, Beijing Normal University, Beijing, China
  - Yantai-Jingshi Institute of Material Genome Engineering, Yantai, China
  - Shandong Laboratory of Yantai Advanced Materials and Green Manufacturing, Yantai, China
- Qi-Qiang Wang
  - Beijing National Laboratory for Molecular Sciences, CAS Key Laboratory of Molecular Recognition and Function, Institute of Chemistry, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- De-Xian Wang
  - Beijing National Laboratory for Molecular Sciences, CAS Key Laboratory of Molecular Recognition and Function, Institute of Chemistry, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Yu-Fei Ao
  - Beijing National Laboratory for Molecular Sciences, CAS Key Laboratory of Molecular Recognition and Function, Institute of Chemistry, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
2. Shirani H, Hashemianzadeh SM. Quantum-level machine learning calculations of Levodopa. Comput Biol Chem 2024; 112:108146. [PMID: 39067350 DOI: 10.1016/j.compbiolchem.2024.108146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2024] [Revised: 06/20/2024] [Accepted: 07/08/2024] [Indexed: 07/30/2024]
Abstract
Many drug molecules contain functional groups, resulting in a torsional barrier corresponding to rotation around the bond linking the fragments. In medicinal chemistry and pharmaceutical sciences, including drug design studies, accurate calculation of the potential energy surface (PES) of these molecular torsions is extremely important. Machine learning (ML), including deep learning (DL), is currently one of the most rapidly evolving tools in computer-aided drug discovery and molecular simulations. In this work, we used the ANI-1x neural network potential as a quantum-level ML method to predict the PESs of the antiparkinsonian drug molecule L-3,4-dihydroxyphenylalanine (Levodopa). The electronic energies and structural parameters calculated by density functional theory (DFT) using the wB97X functional and all possible Pople basis sets indicated that the 6-31G(d) basis set, when used with the wB97X functional, exhibits behavior similar to that of the ANI-1x model. The vibrational frequency investigation showed a linear correlation between DFT and ML data. All ANI-1x calculations were completed in a very short computing time. From this perspective, we expect the ANI-1x dataset applied in this work to be appreciably efficient and effective in computational structure-based drug design studies.
Affiliation(s)
- Hossein Shirani
  - Molecular Simulation Research Laboratory, Department of Chemistry, Iran University of Science and Technology, P.O. Box 16846-13114, Tehran, Iran
- Seyed Majid Hashemianzadeh
  - Molecular Simulation Research Laboratory, Department of Chemistry, Iran University of Science and Technology, P.O. Box 16846-13114, Tehran, Iran
3. Zhu Y, Peng J, Xu C, Lan Z. Unsupervised Machine Learning in the Analysis of Nonadiabatic Molecular Dynamics Simulation. J Phys Chem Lett 2024; 15:9601-9619. [PMID: 39270134 DOI: 10.1021/acs.jpclett.4c01751] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/15/2024]
Abstract
All-atom, full-dimensional simulations of nonadiabatic molecular dynamics (NAMD) in large realistic systems have received high research interest in recent years. However, such NAMD simulations normally generate an enormous amount of time-dependent high-dimensional data, which makes the analysis of results a significant challenge. Considerable effort has been devoted to developing novel and easy-to-use analysis tools, based on unsupervised machine learning (ML) methods, for identifying photoinduced reaction channels and comprehensively understanding complicated molecular motions in NAMD simulations. Here, we survey recent advances in this field, focusing particularly on how to use unsupervised ML methods to analyze trajectory-based NAMD simulation results. Our purpose is to offer a comprehensive discussion of several essential components of this analysis protocol, including the selection of ML methods, the construction of molecular descriptors, the establishment of analytical frameworks, their advantages and limitations, and persistent challenges.
Affiliation(s)
- Yifei Zhu
  - MOE Key Laboratory of Environmental Theoretical Chemistry, SCNU Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety, School of Environment, South China Normal University, Guangzhou 510006, P. R. China
- Jiawei Peng
  - MOE Key Laboratory of Environmental Theoretical Chemistry, SCNU Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety, School of Environment, South China Normal University, Guangzhou 510006, P. R. China
- Chao Xu
  - MOE Key Laboratory of Environmental Theoretical Chemistry, SCNU Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety, School of Environment, South China Normal University, Guangzhou 510006, P. R. China
- Zhenggang Lan
  - MOE Key Laboratory of Environmental Theoretical Chemistry, SCNU Environmental Research Institute, Guangdong Provincial Key Laboratory of Chemical Pollution and Environmental Safety, School of Environment, South China Normal University, Guangzhou 510006, P. R. China
4. Middleton C, Curchod BFE, Penfold TJ. Partial density of states representation for accurate deep neural network predictions of X-ray spectra. Phys Chem Chem Phys 2024; 26:24477-24487. [PMID: 39264269 DOI: 10.1039/d4cp01368a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/13/2024]
Abstract
The performance of a machine learning (ML) algorithm for chemistry is highly contingent upon the architect's choice of input representation. This work introduces the partial density of states (p-DOS) descriptor: a novel, quantum-inspired structural representation which encodes relevant electronic information for machine learning models seeking to simulate X-ray spectroscopy. p-DOS uses a minimal basis set in conjunction with a guess (non-optimised) electronic configuration to extract and then discretise the density of states (DOS) of the absorbing atom to form the input vector. We demonstrate that while the electronically-focused p-DOS performs well in isolation, optimal performance is achieved when supplemented with nuclear structural information imparted via a geometric representation. p-DOS provides a description of the key electronic properties of a system which is not only concise and computationally efficient, but also independent of molecular size or choice of basis set. It can be rapidly generated, facilitating its application with large training sets. Its performance is demonstrated using a wide variety of examples at the sulphur K-edge, including the prediction of the ultrafast X-ray spectroscopic signal associated with photoexcited 2(5H)-thiophenone. These results highlight the potential for ML models developed using p-DOS to contribute to the interpretation and prediction of experimental results, e.g. in operando measurements of batteries and/or catalysts and femtosecond time-resolved studies, especially those made possible by emergent cutting-edge technologies such as X-ray free electron lasers.
Affiliation(s)
- Clelia Middleton
  - Chemistry, School of Natural and Environmental Sciences, Newcastle University, Great North Road, Newcastle upon Tyne, NE1 7RU, UK
- Basile F E Curchod
  - Centre for Computational Chemistry, School of Chemistry, Cantock's Close, University of Bristol, Bristol, BS8 1TS, UK
- Thomas J Penfold
  - Chemistry, School of Natural and Environmental Sciences, Newcastle University, Great North Road, Newcastle upon Tyne, NE1 7RU, UK
5. Taylor CR, Butler PWV, Day GM. Predictive crystallography at scale: mapping, validating, and learning from 1000 crystal energy landscapes. Faraday Discuss 2024. [PMID: 39301753 DOI: 10.1039/d4fd00105b] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/22/2024]
Abstract
Computational crystal structure prediction (CSP) is an increasingly powerful technique in materials discovery, due to its ability to reveal trends and permit insight across the possibility space of crystal structures of a candidate molecule, beyond simply the observed structure(s). In this work, we demonstrate the reliability and scalability of CSP methods for small, rigid organic molecules by performing in-depth CSP investigations for over 1000 such compounds, the largest survey of its kind to date. We show that this highly efficient force-field-based CSP approach is superbly predictive, locating 99.4% of observed experimental structures, and ranking a large majority of these (74%) as among the most stable possible structures (to within uncertainty due to thermal effects). We present two examples of insights such large predicted datasets can permit, examining the space group preferences of organic molecular crystals and rationalising empirical rules concerning the spontaneous resolution of chiral molecules. Finally, we exploit this large and diverse dataset for developing transferable machine-learned energy potentials for the organic solid state, training a neural network lattice energy correction to force field energies that offers substantial improvements to the already impressive energy rankings, and a MACE equivariant message-passing neural network for crystal structure re-optimisation. We conclude that the excellent performance and reliability of the CSP workflow enables the creation of very large datasets of broad utility and explanatory power in materials design.
Affiliation(s)
- Patrick W V Butler
  - School of Chemistry, University of Southampton, Southampton, SO17 1BJ, UK
- Graeme M Day
  - School of Chemistry, University of Southampton, Southampton, SO17 1BJ, UK
6. Chen PY, Shibata K, Hagita K, Miyata T, Mizoguchi T. Predicting ELNES/XANES spectra by machine learning with an atomic coordinate-independent descriptor and its application to ground-state electronic structures. Micron 2024; 187:103723. [PMID: 39342916 DOI: 10.1016/j.micron.2024.103723] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Revised: 09/17/2024] [Accepted: 09/18/2024] [Indexed: 10/01/2024]
Abstract
ELNES/XANES spectra can be observed using TEM or synchrotron radiation and can elucidate the unoccupied-state electronic structures of excited states. Computing their features usually demands substantial computational resources because of the requisite structure optimization and electronic structure calculations. Herein, we leverage a machine learning technique alongside an atomic-coordinate-independent descriptor, SMILES, to yield the ELNES/XANES spectra directly and with heightened precision. Moreover, our approach extends to obtaining the ground-state electronic structure, namely the PDOS of both occupied and unoccupied ground states, underscoring its viability for ground-state spectroscopy. Our study revealed that incorporating long-SMILES molecules into the training dataset enhances prediction accuracy for such molecular structures. This study's direct derivation of spectroscopy from SMILES strings holds promise for expediting spectroscopic inquiries.
Affiliation(s)
- Po-Yen Chen
  - Department of Materials Engineering, the University of Tokyo, Tokyo, Japan
- Kiyou Shibata
  - Department of Materials Engineering, the University of Tokyo, Tokyo, Japan; Institute of Industrial Science, the University of Tokyo, Tokyo, Japan
- Katsumi Hagita
  - Department of Applied Physics, National Defense Academy, Yokosuka, Japan
- Tomohiro Miyata
  - Institute of Multidisciplinary Research for Advanced Materials, Tohoku University, Sendai, Japan
- Teruyasu Mizoguchi
  - Department of Materials Engineering, the University of Tokyo, Tokyo, Japan; Institute of Industrial Science, the University of Tokyo, Tokyo, Japan
7. Luo J, Said OB, Xie P, Gibaldi M, Burner J, Pereira C, Woo TK. MEPO-ML: a robust graph attention network model for rapid generation of partial atomic charges in metal-organic frameworks. npj Computational Materials 2024; 10:224. [PMID: 39309403 PMCID: PMC11412901 DOI: 10.1038/s41524-024-01413-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2024] [Accepted: 08/30/2024] [Indexed: 09/25/2024]
Abstract
Accurate computation of the gas adsorption properties of MOFs is usually bottlenecked by the DFT calculations required to generate partial atomic charges. Therefore, large virtual screenings of MOFs often use the QEq method, which is rapid but of limited accuracy. Recently, machine learning (ML) models have been trained to generate charges in much better agreement with DFT-derived charges than the QEq models. Previous ML charge models for MOFs have all used training sets of fewer than 3000 MOFs obtained from the CoRE MOF database, which has recently been shown to have high structural error rates. In this work, we developed a graph attention network model for predicting DFT-derived charges in MOFs, using the ARC-MOF database, which contains 279,632 MOFs and over 40 million charges. This model, which we call MEPO-ML, predicts charges with a mean absolute error of 0.025e on our test set of over 27 K MOFs. Other ML models reported in the literature were also trained using the same dataset and descriptors, and MEPO-ML was shown to give the lowest errors. The gas adsorption properties evaluated using MEPO-ML charges agree significantly better with those obtained from the reference DFT-derived charges than do those obtained with empirical charges, for both polar and non-polar gases. Using only a single CPU core on our benchmark computer, MEPO-ML charges can be generated in less than two seconds on average (including all computations required to apply the model) for MOFs in the test set of 27 K MOFs.
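The error metric quoted in this abstract, the mean absolute error (MAE) between predicted and reference partial charges, can be sketched as follows. This is only an illustration of the metric itself; the charge values are invented for the example and are not taken from ARC-MOF or from MEPO-ML output.

```python
from typing import Sequence


def mean_absolute_error(predicted: Sequence[float], reference: Sequence[float]) -> float:
    """Average of |predicted_i - reference_i| over all atoms."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not predicted:
        raise ValueError("at least one charge is required")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Hypothetical partial charges in units of e (made up for illustration).
reference_charges = [0.80, -0.45, 0.12, -0.47]
predicted_charges = [0.79, -0.43, 0.15, -0.49]

mae = mean_absolute_error(predicted_charges, reference_charges)
print(f"MAE = {mae:.3f} e")
```

Because the MAE averages over atoms, a low value does not by itself rule out a few large errors on individual atoms.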
Affiliation(s)
- Jun Luo
  - Department of Chemistry and Biomolecular Science, University of Ottawa, 10 Marie Curie Private, Ottawa, K1N 6N5 Canada
- Peigen Xie
  - TotalEnergies OneTech SE, Palaiseau, France
- Marco Gibaldi
  - Department of Chemistry and Biomolecular Science, University of Ottawa, 10 Marie Curie Private, Ottawa, K1N 6N5 Canada
- Jake Burner
  - Department of Chemistry and Biomolecular Science, University of Ottawa, 10 Marie Curie Private, Ottawa, K1N 6N5 Canada
- Tom K. Woo
  - Department of Chemistry and Biomolecular Science, University of Ottawa, 10 Marie Curie Private, Ottawa, K1N 6N5 Canada
8. Warren MT, Biggs CI, Bissoyi A, Gibson MI, Sosso GC. Data-driven discovery of potent small molecule ice recrystallisation inhibitors. Nat Commun 2024; 15:8082. [PMID: 39278938 PMCID: PMC11402961 DOI: 10.1038/s41467-024-52266-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Accepted: 08/27/2024] [Indexed: 09/18/2024] Open
Abstract
Controlling the formation and growth of ice is essential to successfully cryopreserve cells, tissues and biologics. Current efforts to identify materials capable of modulating ice growth are guided by iterative changes and human intuition, with a major focus on proteins and polymers. With limited data, the discovery pipeline is constrained by a poor understanding of the mechanisms and the underlying structure-activity relationships. In this work, this barrier is overcome by constructing machine learning models capable of predicting the ice recrystallisation inhibition activity of small molecules. We generate a new dataset via experimental measurements of ice growth, then harness predictive models combining state-of-the-art descriptors with domain-specific features derived from molecular simulations. The models accurately identify potent small molecule ice recrystallisation inhibitors within a commercial compound library. Identified hits can also mitigate cellular damage during transient warming events in cryopreserved red blood cells, demonstrating how data-driven approaches can be used to discover innovative cryoprotectants and enable next-generation cryopreservation solutions for the cold chain.
Affiliation(s)
- Matthew T Warren
  - Department of Chemistry, University of Warwick, Coventry, UK
  - Warwick Medical School, University of Warwick, Coventry, UK
  - Institute of Cancer Research, London, UK
- Akalabya Bissoyi
  - Manchester Institute of Biotechnology, University of Manchester, Manchester, UK
  - Department of Chemistry, University of Manchester, Manchester, UK
- Matthew I Gibson
  - Department of Chemistry, University of Warwick, Coventry, UK
  - Warwick Medical School, University of Warwick, Coventry, UK
  - Manchester Institute of Biotechnology, University of Manchester, Manchester, UK
  - Department of Chemistry, University of Manchester, Manchester, UK
9. Wang J, Wang Y, Zhang H, Yang Z, Liang Z, Shi J, Wang HT, Xing D, Sun J. E(n)-Equivariant cartesian tensor message passing interatomic potential. Nat Commun 2024; 15:7607. [PMID: 39218987 PMCID: PMC11366765 DOI: 10.1038/s41467-024-51886-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2024] [Accepted: 08/16/2024] [Indexed: 09/04/2024] Open
Abstract
Machine learning potentials (MLPs) have been a popular topic in recent years for their capability to replace expensive first-principles calculations in some large systems. Meanwhile, message passing networks have gained significant attention due to their remarkable accuracy, and a wave of message passing networks based on Cartesian coordinates has emerged. However, the node information in these models is usually limited to scalars and vectors. In this work, we propose the High-order Tensor message Passing interatomic Potential (HotPP), an E(n)-equivariant message passing neural network that extends the node embedding and message to tensors of arbitrary order. By performing some basic equivariant operations, high-order tensors can be coupled very simply, and thus the model can make direct predictions of high-order tensors such as dipole moments and polarizabilities without any modifications. Tests on several datasets show that HotPP not only achieves high accuracy in predicting target properties, but also successfully performs tasks such as calculating phonon spectra, infrared spectra, and Raman spectra, demonstrating its potential as a tool for future research.
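The idea of coupling Cartesian tensors through "basic equivariant operations" can be illustrated with a small numerical check. The sketch below is not the HotPP implementation; it only verifies, for one arbitrary rotation, that the outer product of two vectors is rotation-equivariant, i.e. rotating the inputs first gives the same rank-2 tensor as rotating the output. All function names and the test vectors are illustrative assumptions.

```python
import math


def rotation_z(theta):
    """3x3 rotation matrix about the z axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matvec(m, v):
    """Matrix-vector product for a 3x3 matrix and a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def outer(u, v):
    """Rank-2 Cartesian tensor T_ij = u_i * v_j."""
    return [[u[i] * v[j] for j in range(3)] for i in range(3)]


def rotate_rank2(r, t):
    """Rotate a rank-2 tensor: T'_ij = sum_kl R_ik R_jl T_kl."""
    return [
        [
            sum(r[i][k] * r[j][l] * t[k][l] for k in range(3) for l in range(3))
            for j in range(3)
        ]
        for i in range(3)
    ]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two 3x3 tensors."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))


# Arbitrary example vectors and rotation angle (illustrative only).
u = [1.0, 2.0, -0.5]
v = [0.3, -1.0, 2.0]
R = rotation_z(0.7)

rotate_then_couple = outer(matvec(R, u), matvec(R, v))
couple_then_rotate = rotate_rank2(R, outer(u, v))
equivariance_error = max_abs_diff(rotate_then_couple, couple_then_rotate)
print(f"max deviation = {equivariance_error:.2e}")
```

The check covers a single rotation only; full E(n) equivariance also involves reflections and translations, which this sketch does not test.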
Affiliation(s)
- Junjie Wang
  - National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing, 210093, China
- Yong Wang
  - National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing, 210093, China
  - Department of Chemistry, Princeton University, Princeton, NJ, 08544, USA
- Haoting Zhang
  - National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing, 210093, China
- Ziyang Yang
  - National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing, 210093, China
- Zhixin Liang
  - National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing, 210093, China
- Jiuyang Shi
  - National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing, 210093, China
- Hui-Tian Wang
  - National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing, 210093, China
- Dingyu Xing
  - National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing, 210093, China
- Jian Sun
  - National Laboratory of Solid State Microstructures, School of Physics and Collaborative Innovation Center of Advanced Microstructures, Nanjing University, Nanjing, 210093, China
10. Nakajima Y, Ohmura T, Seino J. Using atomic clustering based on structural and electronic descriptors that consider surrounding environment to evaluate local properties of DFT functionals. J Comput Chem 2024; 45:1870-1879. [PMID: 38686778 DOI: 10.1002/jcc.27375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/12/2023] [Revised: 04/01/2024] [Accepted: 04/03/2024] [Indexed: 05/02/2024]
Abstract
We developed a method for evaluating the accuracies of the local properties of DFT functionals in detail using a clustering method based on machine learning and structural/electronic descriptors. We generated 36 clusters consistent with human intuition using 30,436 carbon atoms from the QM9 dataset. The results were used to evaluate 13C NMR chemical shifts calculated using 84 DFT functionals. Carbon atoms were grouped based on their similar environments, reducing errors within these groups. This enables a more accurate assessment of the accuracy of a specific DFT functional. Therefore, the present atomic clustering provides more detailed insight into accuracy verification.
Affiliation(s)
- Yuya Nakajima
  - Waseda Research Institute for Science and Engineering, Tokyo, Japan
- Takuto Ohmura
  - Department of Chemistry and Biochemistry, School of Advanced Science and Engineering, Waseda University, Tokyo, Japan
- Junji Seino
  - Waseda Research Institute for Science and Engineering, Tokyo, Japan
  - Department of Chemistry and Biochemistry, School of Advanced Science and Engineering, Waseda University, Tokyo, Japan
11. Martire S, Decherchi S, Cavalli A. OBIWAN: An Element-Wise Scalable Feed-Forward Neural Network Potential. J Chem Theory Comput 2024; 20:6287-6302. [PMID: 38978155 DOI: 10.1021/acs.jctc.4c00342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
Estimating the potential energy of a molecular system at a quantum level of theory is a task of paramount importance in computational chemistry. The often-employed density functional theory approach allows one to accomplish this task, yet most often at significant computational cost. This prompted the community to develop so-called machine learning potentials to achieve near-quantum accuracy at molecular mechanics computational cost. In this paper, we introduce OBIWAN, a feed-forward neural network that bears some relevant structural properties that also led to the definition of a new kind of general-purpose neural network layer. Its featurization process scales efficiently with newly added atomic species. This allows one to seamlessly add new atom types without changing the topology of the network. It also allows one to train on new data sets by leveraging a previously trained OBIWAN, hence converging very quickly. This avoids training from scratch and renders the approach more compliant with a green computing perspective.
Affiliation(s)
- Stefano Martire
  - Department of Pharmacy and Biotechnology, University of Bologna, Via Belmeloro 6, Bologna 40126, Italy
  - Computational and Chemical Biology, Fondazione Istituto Italiano di Tecnologia, Via Morego 30, Genoa 16163, Italy
- Sergio Decherchi
  - Data Science and Computation Facility, Fondazione Istituto Italiano di Tecnologia, Via Morego 30, Genoa 16163, Italy
- Andrea Cavalli
  - Computational and Chemical Biology, Fondazione Istituto Italiano di Tecnologia, Via Morego 30, Genoa 16163, Italy
  - Centre Européen de Calcul Atomique et Moléculaire, Ecole Polytechnique Fédérale de Lausanne, Avenue de Forel 3, Lausanne 1015, Switzerland
12. Yang Y, Zhang S, Ranasinghe KD, Isayev O, Roitberg AE. Machine Learning of Reactive Potentials. Annu Rev Phys Chem 2024; 75:371-395. [PMID: 38941524 DOI: 10.1146/annurev-physchem-062123-024417] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/30/2024]
Abstract
In the past two decades, machine learning potentials (MLPs) have driven significant developments in chemical, biological, and material sciences. The construction and training of MLPs enable fast and accurate simulations and analysis of thermodynamic and kinetic properties. This review focuses on the application of MLPs to reaction systems with consideration of bond breaking and formation. We review the development of MLP models, primarily with neural network and kernel-based algorithms, and recent applications of reactive MLPs (RMLPs) to systems at different scales. We show how RMLPs are constructed, how they speed up the calculation of reactive dynamics, and how they facilitate the study of reaction trajectories, reaction rates, free energy calculations, and many other calculations. Different data sampling strategies applied in building RMLPs are also discussed with a focus on how to collect structures for rare events and how to further improve their performance with active learning.
Affiliation(s)
- Yinuo Yang
  - Department of Chemistry, University of Florida, Gainesville, Florida
- Shuhao Zhang
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Olexandr Isayev
  - Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Adrian E Roitberg
  - Department of Chemistry, University of Florida, Gainesville, Florida
13. Wang G, Wang C, Zhang X, Li Z, Zhou J, Sun Z. Machine learning interatomic potential: Bridge the gap between small-scale models and realistic device-scale simulations. iScience 2024; 27:109673. [PMID: 38646181 PMCID: PMC11033164 DOI: 10.1016/j.isci.2024.109673] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024] Open
Abstract
Machine learning interatomic potential (MLIP) overcomes the challenges of high computational costs in density-functional theory and the relatively low accuracy in classical large-scale molecular dynamics, facilitating more efficient and precise simulations in materials research and design. In this review, the current state of the four essential stages of MLIP is discussed, including data generation methods, material structure descriptors, six unique machine learning algorithms, and available software. Furthermore, the applications of MLIP in various fields are investigated, notably in phase-change memory materials, structure searching, material property prediction, and pre-trained universal models. Finally, future perspectives, consisting of standard datasets, transferability, generalization, and the trade-off between accuracy and complexity in MLIPs, are reported.
Affiliation(s)
- Guanjie Wang
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
  - School of Integrated Circuit Science and Engineering, Beihang University, Beijing 100191, China
- Changrui Wang
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- Xuanguang Zhang
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- Zefeng Li
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- Jian Zhou
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- Zhimei Sun
  - School of Materials Science and Engineering, Beihang University, Beijing 100191, China
14
|
Wan K, He J, Shi X. Construction of High Accuracy Machine Learning Interatomic Potential for Surface/Interface of Nanomaterials-A Review. Adv Mater 2024; 36:e2305758. [PMID: 37640376 DOI: 10.1002/adma.202305758] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Revised: 08/24/2023] [Indexed: 08/31/2023]
Abstract
The inherent discontinuity and unique dimensional attributes of nanomaterial surfaces and interfaces bestow them with various exceptional properties. These properties, however, also introduce difficulties for both experimental and computational studies. The advent of machine learning interatomic potential (MLIP) addresses some of the limitations associated with empirical force fields, presenting a valuable avenue for accurate simulations of these surfaces/interfaces of nanomaterials. Central to this approach is the idea of capturing the relationship between system configuration and potential energy, leveraging the proficiency of machine learning (ML) to precisely approximate high-dimensional functions. This review offers an in-depth examination of MLIP principles and their execution and elaborates on their applications in the realm of nanomaterial surface and interface systems. The prevailing challenges faced by this potent methodology are also discussed.
Affiliation(s)
- Kaiwei Wan
- Laboratory of Theoretical and Computational Nanoscience, National Center for Nanoscience and Technology, Chinese Academy of Sciences, Beijing, 100190, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Jianxin He
- Laboratory of Theoretical and Computational Nanoscience, National Center for Nanoscience and Technology, Chinese Academy of Sciences, Beijing, 100190, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Xinghua Shi
- Laboratory of Theoretical and Computational Nanoscience, National Center for Nanoscience and Technology, Chinese Academy of Sciences, Beijing, 100190, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
|
15
|
del Rio BG, González LE. Exploring Challenging Properties of Liquid Metallic Systems through Machine Learning: Liquid La and Li4Pb Systems. J Chem Theory Comput 2024; 20:3285-3297. [PMID: 38557035 PMCID: PMC11044274 DOI: 10.1021/acs.jctc.4c00049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2024] [Revised: 03/15/2024] [Accepted: 03/18/2024] [Indexed: 04/04/2024]
Abstract
In this machine learning (ML) study, we delved into the unique properties of liquid lanthanum and the Li4Pb alloy, revealing some unexpected features and also firmly establishing some of the debated characteristics. Leveraging interatomic potentials derived from ab initio calculations, our investigation achieved a level of precision comparable to first-principles methods while at the same time entering the hydrodynamic regime. We compared the structure factors and pair distribution functions to experimental data and unearthed distinctive collective excitations with intriguing features. Liquid lanthanum unveiled two transverse collective excitation branches, each closely tied to specific peaks in the velocity autocorrelation function spectrum. Furthermore, the analysis of the generalized specific heat ratio in the hydrodynamic regime investigated with the ML molecular dynamics simulations uncovered a peculiar behavior, impossible to discern with only ab initio simulations. Liquid Li4Pb, on the other hand, challenged existing claims by showcasing a rich array of branches in its longitudinal dispersion relation, including a high-frequency LiLi mode with a nonhydrodynamic optical character that maintains a finite value as q → 0. Additionally, we conducted an in-depth analysis of various transport coefficients, expanding our understanding of these liquid metallic systems. In summary, our ML approach yielded precise results, offering new and captivating insights into the structural and dynamic aspects of these materials.
Affiliation(s)
- Beatriz G. del Rio
- Departamento de Física Teórica, Atómica y Óptica, Universidad de Valladolid, 47011 Valladolid, Spain
- Luis E. González
- Departamento de Física Teórica, Atómica y Óptica, Universidad de Valladolid, 47011 Valladolid, Spain
|
16
|
Gallegos M, Isamura BK, Popelier PLA, Martín Pendás Á. An Unsupervised Machine Learning Approach for the Automatic Construction of Local Chemical Descriptors. J Chem Inf Model 2024; 64:3059-3079. [PMID: 38498942 PMCID: PMC11040729 DOI: 10.1021/acs.jcim.3c01906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/27/2023] [Revised: 03/06/2024] [Accepted: 03/07/2024] [Indexed: 03/20/2024]
Abstract
Condensing the many physical variables defining a chemical system into a fixed-size array poses a significant challenge in the development of chemical Machine Learning (ML). Atom Centered Symmetry Functions (ACSFs) offer an intuitive featurization approach by means of a tedious and labor-intensive selection of tunable parameters. In this work, we implement an unsupervised ML strategy relying on a Gaussian Mixture Model (GMM) to automatically optimize the ACSF parameters. GMMs effortlessly decompose the vastness of the chemical and conformational spaces into well-defined radial and angular clusters, which are then used to build tailor-made ACSFs. The unsupervised exploration of the space has demonstrated general applicability across a diverse range of systems, spanning from various unimolecular landscapes to heterogeneous databases. The impact of the sampling technique and temperature on space exploration is also addressed, highlighting the particularly advantageous role of high-temperature Molecular Dynamics (MD) simulations. The reliability of the resulting features is assessed through the estimation of the atomic charges of a prototypical capped amino acid and a heterogeneous collection of CHON molecules. The automatically constructed ACSFs serve as high-quality descriptors, consistently yielding typical prediction errors below 0.010 electrons bound for the reported atomic charges. Altering the spatial distribution of the functions with respect to the cluster highlights the critical role of symmetry rupture in achieving significantly improved features. More specifically, using two separate functions to describe the lower and upper tails of the cluster results in the best performing models with errors as low as 0.006 electrons. 
Finally, the effectiveness of finely tuned features was checked across different architectures, unveiling the superior performance of Gaussian Process (GP) models over Feed Forward Neural Networks (FFNNs), particularly in low-data regimes, with nearly a 2-fold increase in prediction quality. Altogether, this approach paves the way toward an easier construction of local chemical descriptors, while providing valuable insights into how radial and angular spaces should be mapped. Finally, this work opens the possibility of encoding many-body information beyond angular terms into upcoming ML features.
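For readers unfamiliar with ACSFs, the radial term can be illustrated with a minimal sketch. It uses the widely used Behler-type form G2 = Σ_j exp(-η (r_ij - r_s)²) f_c(r_ij) with a cosine cutoff. The geometry, the `eta`, `r_s`, and `r_c` values, and the function names are illustrative assumptions, not taken from the paper; how the GMM cluster centres and widths are mapped onto `r_s` and `eta` is not specified in the abstract and is not reproduced here.

```python
import math

def cosine_cutoff(r, r_c):
    """Cosine cutoff: decays smoothly from 1 at r = 0 to 0 at r = r_c."""
    if r >= r_c:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_c) + 1.0)

def radial_acsf(center, neighbors, eta, r_s, r_c):
    """Radial symmetry function G2 for one central atom.

    G2 = sum_j exp(-eta * (r_ij - r_s)**2) * f_c(r_ij)
    """
    total = 0.0
    for pos in neighbors:
        r_ij = math.dist(center, pos)
        total += math.exp(-eta * (r_ij - r_s) ** 2) * cosine_cutoff(r_ij, r_c)
    return total

# Hypothetical toy geometry in angstrom (not from the paper).
center = (0.0, 0.0, 0.0)
neighbors = [(1.0, 0.0, 0.0), (0.0, 1.5, 0.0), (0.0, 0.0, 7.0)]

# Made-up (eta, r_s) pairs; in a clustering-based scheme each pair would be
# derived from one radial cluster.
params = [(4.0, 1.0), (4.0, 1.5)]
features = [radial_acsf(center, neighbors, eta, r_s, r_c=6.0) for eta, r_s in params]
print(features)
```

Because the sum runs over neighbors, the feature vector has a fixed size and does not depend on atom ordering; the atom at 7 Å lies outside the 6 Å cutoff and contributes nothing.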
Affiliation(s)
- Miguel Gallegos
- Department of Analytical and Physical Chemistry, University of Oviedo, Oviedo E-33006, Spain
- Paul L. A. Popelier
- Department of Chemistry, The University of Manchester, Oxford Road, Manchester M13 9PL, U.K.
- Ángel Martín Pendás
- Department of Analytical and Physical Chemistry, University of Oviedo, Oviedo E-33006, Spain
|
17
|
Pan X, Snyder R, Wang JN, Lander C, Wickizer C, Van R, Chesney A, Xue Y, Mao Y, Mei Y, Pu J, Shao Y. Training machine learning potentials for reactive systems: A Colab tutorial on basic models. J Comput Chem 2024; 45:638-647. [PMID: 38082539 PMCID: PMC10923003 DOI: 10.1002/jcc.27269] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Revised: 11/10/2023] [Accepted: 11/11/2023] [Indexed: 01/18/2024]
Abstract
In the last several years, there has been a surge in the development of machine learning potential (MLP) models for describing molecular systems. We are interested in a particular area of this field - the training of system-specific MLPs for reactive systems - with the goal of using these MLPs to accelerate free energy simulations of chemical and enzyme reactions. To help new members in our labs become familiar with the basic techniques, we have put together a self-guided Colab tutorial (https://cc-ats.github.io/mlp_tutorial/), which we expect to be also useful to other young researchers in the community. Our tutorial begins with the introduction of simple feedforward neural network (FNN) and kernel-based (using Gaussian process regression, GPR) models by fitting the two-dimensional Müller-Brown potential. Subsequently, two simple descriptors are presented for extracting features of molecular systems: symmetry functions (including the ANI variant) and embedding neural networks (such as DeepPot-SE). Lastly, these features will be fed into FNN and GPR models to reproduce the energies and forces for the molecular configurations in a Claisen rearrangement reaction.
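The two-dimensional Müller-Brown potential mentioned above is a sum of four Gaussian-like terms, and a short sketch shows how reference energies and forces for such a fit could be generated. The parameters below are the standard literature values for this surface; the grid bounds, spacing, and function names are assumptions made for illustration and are not taken from the tutorial itself.

```python
import math

# Standard Müller-Brown parameters (four terms).
A = (-200.0, -100.0, -170.0, 15.0)
a = (-1.0, -1.0, -6.5, 0.7)
b = (0.0, 0.0, 11.0, 0.6)
c = (-10.0, -10.0, -6.5, 0.7)
X0 = (1.0, 0.0, -0.5, -1.0)
Y0 = (0.0, 0.5, 1.5, 1.0)

def mueller_brown(x, y):
    """V(x, y) = sum_k A_k exp(a_k dx^2 + b_k dx dy + c_k dy^2)."""
    v = 0.0
    for k in range(4):
        dx, dy = x - X0[k], y - Y0[k]
        v += A[k] * math.exp(a[k] * dx * dx + b[k] * dx * dy + c[k] * dy * dy)
    return v

def mueller_brown_gradient(x, y):
    """Analytic gradient (dV/dx, dV/dy); forces are the negative of this."""
    gx = gy = 0.0
    for k in range(4):
        dx, dy = x - X0[k], y - Y0[k]
        e = A[k] * math.exp(a[k] * dx * dx + b[k] * dx * dy + c[k] * dy * dy)
        gx += e * (2.0 * a[k] * dx + b[k] * dy)
        gy += e * (b[k] * dx + 2.0 * c[k] * dy)
    return gx, gy

# Hypothetical training grid: x in [-1.5, 1.0], y in [-0.5, 2.0], step 0.1.
grid = [(-1.5 + 0.1 * i, -0.5 + 0.1 * j) for i in range(26) for j in range(26)]
energies = [mueller_brown(x, y) for x, y in grid]
forces = [tuple(-g for g in mueller_brown_gradient(x, y)) for x, y in grid]
```

The `energies` and `forces` lists can then serve as regression targets for any model that maps (x, y) to an energy, for example a small feedforward network or a Gaussian process.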
Affiliation(s)
- Xiaoliang Pan
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Ryan Snyder
- Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Jia-Ning Wang
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
- Chance Lander
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Carly Wickizer
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Richard Van
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD 20824, USA
- Andrew Chesney
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
- Yuanfei Xue
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
- Yuezhi Mao
- Department of Chemistry and Biochemistry, San Diego State University, San Diego, CA 92182, USA
- Ye Mei
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University, Shanghai 200241, China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai, Shanghai 200062, China
- Collaborative Innovation Center of Extreme Optics, Shanxi University, Taiyuan, Shanxi 030006, China
- Jingzhi Pu
- Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis, Indianapolis, IN 46202, USA
- Yihan Shao
- Department of Chemistry and Biochemistry, University of Oklahoma, Norman, OK 73019, USA
|
18
|
Xi B, Chan MK, Bao K, Zhao W, Chan HM, Chen H, Zhu J. Parameter-Free and Electron Counting Satisfied Material Representation for Machine Learning Potential Energy and Force Fields. J Phys Chem Lett 2024; 15:1636-1643. [PMID: 38306617 PMCID: PMC10875669 DOI: 10.1021/acs.jpclett.3c03250] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/18/2023] [Revised: 01/28/2024] [Accepted: 01/29/2024] [Indexed: 02/04/2024]
Abstract
We proposed a parameter-free volume element representation that satisfies the electron counting model and obtains accurate machine learning potential energy and direct force fitting of randomly perturbed hexagonal BN. Our method preserves permutational, translational, and rotational invariance and can be extended to three-dimensional systems, verified by a system of bulk Si. As a result, we obtained 0.57 meV/atom potential energy root mean squared error (RMSE) and 59 meV/Å force RMSE for perturbed bulk BN systems and 0.43 meV/atom potential energy RMSE and 36 meV/Å force RMSE for perturbed Si systems. In addition, an unbiased perturbation-based data set construction scheme is introduced and a continuous population distribution is obtained with a training data set of 4500, which is about 1 order of magnitude smaller than standard methods based on first-principles molecular dynamics simulations and saves a large amount of computing resources. General validity of our model is verified by structure optimization, molecular dynamics simulations, and extrapolations.
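The two error metrics quoted above (energy RMSE in meV/atom and force RMSE in meV/Å) can be sketched as follows. This assumes the common convention of dividing each structure's total-energy error by its atom count and pooling all Cartesian force components; the abstract does not state the exact normalization used, and the function names and sample numbers are hypothetical.

```python
import math

def energy_rmse_per_atom(e_pred, e_ref, n_atoms):
    """RMSE of per-atom energy errors.

    e_pred, e_ref: total energies per structure; n_atoms: atoms per structure.
    """
    sq = [((p - r) / n) ** 2 for p, r, n in zip(e_pred, e_ref, n_atoms)]
    return math.sqrt(sum(sq) / len(sq))

def force_rmse(f_pred, f_ref):
    """RMSE pooled over every Cartesian force component of every atom.

    f_pred, f_ref: list of structures, each a list of (fx, fy, fz) tuples.
    """
    sq = [
        (cp - cr) ** 2
        for struct_p, struct_r in zip(f_pred, f_ref)
        for atom_p, atom_r in zip(struct_p, struct_r)
        for cp, cr in zip(atom_p, atom_r)
    ]
    return math.sqrt(sum(sq) / len(sq))

# Made-up numbers purely to exercise the functions (units left abstract).
e_err = energy_rmse_per_atom([10.0, 20.0], [10.2, 19.6], [2, 4])
f_err = force_rmse([[(1.0, 0.0, 0.0)]], [[(0.0, 0.0, 0.0)]])
print(e_err, f_err)
```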
Affiliation(s)
- Bin Xi
- Department of Physics, The Chinese University of Hong Kong, Shatin, New Territory, Hong Kong SAR 999077, P.R. China
- Man Kit Chan
- Department of Physics, The Chinese University of Hong Kong, Shatin, New Territory, Hong Kong SAR 999077, P.R. China
- Kejie Bao
- Department of Physics, The Chinese University of Hong Kong, Shatin, New Territory, Hong Kong SAR 999077, P.R. China
- Wenjing Zhao
- Department of Physics, The Chinese University of Hong Kong, Shatin, New Territory, Hong Kong SAR 999077, P.R. China
- Ho Ming Chan
- Department of Physics, The Chinese University of Hong Kong, Shatin, New Territory, Hong Kong SAR 999077, P.R. China
- Hang Chen
- Department of Physics, The Chinese University of Hong Kong, Shatin, New Territory, Hong Kong SAR 999077, P.R. China
- Junyi Zhu
- Department of Physics, The Chinese University of Hong Kong, Shatin, New Territory, Hong Kong SAR 999077, P.R. China
|
19
|
Kanada R, Tokuhisa A, Nagasaka Y, Okuno S, Amemiya K, Chiba S, Bekker GJ, Kamiya N, Kato K, Okuno Y. Enhanced Coarse-Grained Molecular Dynamics Simulation with a Smoothed Hybrid Potential Using a Neural Network Model. J Chem Theory Comput 2024; 20:7-17. [PMID: 38148034 DOI: 10.1021/acs.jctc.3c00889] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/28/2023]
Abstract
In all-atom (AA) molecular dynamics (MD) simulations, the rugged energy profile of the force field makes it challenging to reproduce spontaneous structural changes in biomolecules within a reasonable calculation time. Existing coarse-grained (CG) models, in which the energy profile is set to a global minimum around the initial structure, are unsuitable for exploring, without any bias, the structural dynamics between metastable states far from the initial structure. In this study, we developed a new hybrid potential composed of an artificial intelligence (AI) potential and a minimal CG potential related to the statistical bond length and excluded volume interactions to accelerate the transition dynamics while maintaining the protein character. The AI potential is trained by energy matching using a diverse structural ensemble sampled via multicanonical (Mc) MD simulation and the corresponding AA force field energy, the profile of which is smoothed by energy minimization. By applying the new methodology to chignolin and TrpCage, we showed that the AI potential can predict the AA energy with high accuracy, as indicated by a correlation coefficient (R-value) between the true and predicted energies exceeding 0.89. In addition, we demonstrated that CGMD simulation based on the smoothed hybrid potential can significantly enhance the transition dynamics between various metastable states while preserving protein properties, compared to those obtained with conventional CGMD and AAMD.
Affiliation(s)
- Ryo Kanada
- RIKEN Center for Computational Science, Kobe 650-0047, Japan
- Shuntaro Chiba
- RIKEN Center for Computational Science, Kobe 650-0047, Japan
- Gert-Jan Bekker
- Institute for Protein Research, Osaka University, Suita, Osaka 565-0871, Japan
- Narutoshi Kamiya
- Graduate School of Information Science, University of Hyogo, Kobe, Hyogo 650-0047, Japan
- Koichiro Kato
- Graduate School of Engineering, Kyushu University, Fukuoka 819-0395, Japan
- Center for Molecular System, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan
- Yasushi Okuno
- RIKEN Center for Computational Science, Kobe 650-0047, Japan
- Graduate School of Medicine, Kyoto University, Kyoto 606-8507, Japan
|
20
|
Zhang F, Zhang J, Fang D, Zhang Y, Wang D. Unusual magnetic interaction in CrTe: insights from machine-learning and empirical models. J Phys Condens Matter 2023; 36:135804. [PMID: 38091625 DOI: 10.1088/1361-648x/ad154f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2023] [Accepted: 12/13/2023] [Indexed: 12/28/2023]
Abstract
Chromium telluride (CrTe) has received much attention due to its small magnetic anisotropy, which hosts the potential for complex magnetic structures. However, its magnetic properties have been relatively unexplored with numerical simulations, as the magnetic interactions inside are quite unusual. In this study, we employ both a machine-learning model and an empirical model to investigate the magnetic phase transitions of bulk and monolayer CrTe, revealing the existence of unusual magnetic interaction, which can be captured by the machine-learning model but not the simple empirical model. Furthermore, our results also demonstrate that magnetic moments further apart exhibit stronger interactions than those in closer proximity, deviating from typical behavior.
Affiliation(s)
- F Zhang
- School of Microelectronics & State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
- Key Lab of Micro-Nano Electronics and System Integration of Xi'an City, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
- J Zhang
- School of Microelectronics & State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
- Key Lab of Micro-Nano Electronics and System Integration of Xi'an City, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
- D Fang
- MOE Key Laboratory for Nonequilibrium Synthesis and Modulation of Condensed Matter, School of Physics, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
- Y Zhang
- School of Physics, Henan Normal University, Xinxiang 453007, People's Republic of China
- D Wang
- School of Microelectronics & State Key Laboratory for Mechanical Behavior of Materials, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
- Key Lab of Micro-Nano Electronics and System Integration of Xi'an City, Xi'an Jiaotong University, Xi'an 710049, People's Republic of China
|
21
|
Xia J, Zhang Y, Jiang B. Accuracy Assessment of Atomistic Neural Network Potentials: The Impact of Cutoff Radius and Message Passing. J Phys Chem A 2023; 127:9874-9883. [PMID: 37943102 DOI: 10.1021/acs.jpca.3c06024] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/10/2023]
Abstract
Atomistic neural network potentials have achieved great success in accelerating atomistic simulations in complicated systems in recent years. They are typically based on the atomic decomposition of total properties, truncating the interatomic correlations to a local environment within a given cutoff radius. A more recently developed message passing (MP) neural network framework can, in principle, incorporate nonlocal effects through iteratively correlating some atoms outside the cutoff sphere with atoms inside, a process referred to as MP. However, how the model accuracy depends on the cutoff radius and the MP process has rarely been discussed. In this work, we investigate this dependence using a recursively embedded atom neural network method that possesses both local and MP features, in two representative systems: liquid H2O and solid Al2O3. We focus on how these settings influence predictions for structural and vibrational properties, namely, radial distribution functions (RDFs) and vibrational density of states (VDOSs). We find that while MP lowers test errors of energy and forces in general, it may not improve the prediction for RDFs and/or VDOSs if direct interatomic correlations in the local environment are insufficiently described. A cutoff radius exceeding the first neighbor shell is necessary, beyond which involving MP quickly enhances the model accuracy until convergence. This is a potentially more efficient way to increase the model accuracy than directly increasing the cutoff radius, especially with more memory savings in the GPU implementation. Our findings also suggest that using the mean test error as the measure of the model accuracy alone is inadequate.
Affiliation(s)
- Junfan Xia
- Key Laboratory of Precision and Intelligent Chemistry, Department of Chemical Physics, University of Science and Technology of China, Hefei, Anhui 230026, China
- Yaolong Zhang
- École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
- Bin Jiang
- Key Laboratory of Precision and Intelligent Chemistry, Department of Chemical Physics, University of Science and Technology of China, Hefei, Anhui 230026, China
|
22
|
Watson L, Pope T, Jay RM, Banerjee A, Wernet P, Penfold TJ. A Δ-learning strategy for interpretation of spectroscopic observables. Struct Dyn 2023; 10:064101. [PMID: 37941993 PMCID: PMC10629969 DOI: 10.1063/4.0000215] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/19/2023] [Accepted: 10/17/2023] [Indexed: 11/10/2023]
Abstract
Accurate computations of experimental observables are essential for interpreting the high information content held within x-ray spectra. However, for complicated systems this can be difficult, a challenge compounded when dynamics becomes important owing to the large number of calculations required to capture the time-evolving observable. While machine learning architectures have been shown to represent a promising approach for rapidly predicting spectral lineshapes, achieving simultaneously accurate and sufficiently comprehensive training data is challenging. Herein, we introduce Δ-learning for x-ray spectroscopy. Instead of directly learning the structure-spectrum relationship, the Δ-model learns the structure dependent difference between a higher and lower level of theory. Consequently, once developed these models can be used to translate spectral shapes obtained from lower levels of theory to mimic those corresponding to higher levels of theory. Ultimately, this achieves accurate simulations with a much reduced computational burden as only the lower level of theory is computed, while the model can instantaneously transform this to a spectrum equivalent to a higher level of theory. Our present model, demonstrated herein, learns the difference between TDDFT(BLYP) and TDDFT(B3LYP) spectra. Its effectiveness is illustrated using simulations of Rh L3-edge spectra tracking the C-H activation of octane by a cyclopentadienyl rhodium carbonyl complex.
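The Δ-learning idea described above (learn the difference between two levels of theory, then add it to a cheap calculation) can be illustrated with a deliberately small sketch. Everything here is a stand-in: `low_level` and `high_level` are synthetic scalar functions rather than TDDFT spectra, and an ordinary least-squares line replaces the paper's machine learning model, so this shows only the structure of the approach.

```python
import math

def low_level(x):
    """Stand-in for a cheap, lower level of theory (synthetic)."""
    return math.sin(x)

def high_level(x):
    """Stand-in for an expensive, higher level of theory (synthetic)."""
    return math.sin(x) + 0.5 * x + 0.1

def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Training: both levels are evaluated only on a few reference points.
train_x = [0.0, 1.0, 2.0, 3.0]
deltas = [high_level(x) - low_level(x) for x in train_x]
slope, intercept = fit_line(train_x, deltas)

def delta_corrected(x):
    """Prediction: cheap calculation plus the learned correction."""
    return low_level(x) + slope * x + intercept

print(delta_corrected(1.7), high_level(1.7))
```

At prediction time only `low_level` is evaluated; the learned Δ supplies the correction. The difference is often smoother than either quantity, which is why it can be easier to learn than the target itself.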
Affiliation(s)
- Luke Watson
- Chemistry, School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom
- Thomas Pope
- Chemistry, School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom
- Raphael M. Jay
- Department of Physics and Astronomy, Uppsala University, 751 20 Uppsala, Sweden
- Ambar Banerjee
- Department of Physics and Astronomy, Uppsala University, 751 20 Uppsala, Sweden
- Philippe Wernet
- Department of Physics and Astronomy, Uppsala University, 751 20 Uppsala, Sweden
- Thomas J. Penfold
- Chemistry, School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom
|
23
|
Tokita AM, Behler J. How to train a neural network potential. J Chem Phys 2023; 159:121501. [PMID: 38127396 DOI: 10.1063/5.0160326] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/31/2023] [Accepted: 07/24/2023] [Indexed: 12/23/2023] Open
Abstract
The introduction of modern Machine Learning Potentials (MLPs) has led to a paradigm change in the development of potential energy surfaces for atomistic simulations. By providing efficient access to energies and forces, they allow us to perform large-scale simulations of extended systems, which are not directly accessible by demanding first-principles methods. In these simulations, MLPs can reach the accuracy of electronic structure calculations, provided that they have been properly trained and validated using a suitable set of reference data. Due to their highly flexible functional form, the construction of MLPs has to be done with great care. In this Tutorial, we describe the necessary key steps for training reliable MLPs, from data generation via training to final validation. The procedure, which is illustrated for the example of a high-dimensional neural network potential, is general and applicable to many types of MLPs.
Affiliation(s)
- Alea Miako Tokita
- Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany and Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
- Jörg Behler
- Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany and Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
|
24
|
Kývala L, Dellago C. Optimizing the architecture of Behler-Parrinello neural network potentials. J Chem Phys 2023; 159:094105. [PMID: 37655764 DOI: 10.1063/5.0167260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 08/10/2023] [Indexed: 09/02/2023] Open
Abstract
The architecture of neural network potentials is typically optimized at the beginning of the training process and remains unchanged throughout. Here, we investigate the accuracy of Behler-Parrinello neural network potentials for varying training set sizes. Using the QM9 and 3BPA datasets, we show that adjusting the network architecture according to the training set size improves the accuracy significantly. We demonstrate that both an insufficient and an excessive number of fitting parameters can have a detrimental impact on the accuracy of the neural network potential. Furthermore, we investigate the influences of descriptor complexity, neural network depth, and activation function on the model's performance. We find that for the neural network potentials studied here, two hidden layers yield the best accuracy and that unbounded activation functions outperform bounded ones.
Affiliation(s)
- Lukáš Kývala
- Faculty of Physics, University of Vienna, Kolingasse 14-16, 1090 Vienna, Austria
- Vienna Doctoral School in Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
- Christoph Dellago
- Faculty of Physics, University of Vienna, Kolingasse 14-16, 1090 Vienna, Austria
|
25
|
Zeng J, Zhang D, Lu D, Mo P, Li Z, Chen Y, Rynik M, Huang L, Li Z, Shi S, Wang Y, Ye H, Tuo P, Yang J, Ding Y, Li Y, Tisi D, Zeng Q, Bao H, Xia Y, Huang J, Muraoka K, Wang Y, Chang J, Yuan F, Bore SL, Cai C, Lin Y, Wang B, Xu J, Zhu JX, Luo C, Zhang Y, Goodall REA, Liang W, Singh AK, Yao S, Zhang J, Wentzcovitch R, Han J, Liu J, Jia W, York DM, E W, Car R, Zhang L, Wang H. DeePMD-kit v2: A software package for deep potential models. J Chem Phys 2023; 159:054801. [PMID: 37526163 PMCID: PMC10445636 DOI: 10.1063/5.0155600] [Citation(s) in RCA: 41] [Impact Index Per Article: 41.0] [Received: 04/21/2023] [Accepted: 07/03/2023] [Indexed: 08/02/2023] Open
Abstract
DeePMD-kit is a powerful open-source software package that facilitates molecular dynamics simulations using machine learning potentials known as Deep Potential (DP) models. This package, which was released in 2017, has been widely used in the fields of physics, chemistry, biology, and materials science for studying atomistic systems. The current version of DeePMD-kit offers numerous advanced features, such as DeepPot-SE, attention-based and hybrid descriptors, the ability to fit tensorial properties, type embedding, model deviation, DP-range correction, DP long range, graphics processing unit support for customized operators, model compression, non-von Neumann molecular dynamics, and improved usability, including documentation, compiled binary packages, graphical user interfaces, and application programming interfaces. This article presents an overview of the current major version of the DeePMD-kit package, highlighting its features and technical details. Additionally, this article presents a comprehensive procedure for conducting molecular dynamics as a representative application, benchmarks the accuracy and efficiency of different models, and discusses ongoing developments.
Affiliation(s)
- Jinzhe Zeng
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA
- Denghui Lu
- HEDPS, CAPT, College of Engineering, Peking University, Beijing 100871, People’s Republic of China
- Pinghui Mo
- College of Electrical and Information Engineering, Hunan University, Changsha, People’s Republic of China
- Zeyu Li
- Yuanpei College, Peking University, Beijing 100871, People’s Republic of China
- Yixiao Chen
- Program in Applied and Computational Mathematics, Princeton University, Princeton, New Jersey 08540, USA
- Marián Rynik
- Department of Experimental Physics, Comenius University, Mlynská Dolina F2, 842 48 Bratislava, Slovakia
- Li’ang Huang
- Center for Quantum Information, Institute for Interdisciplinary Information Sciences, Tsinghua University, Beijing 100084, People’s Republic of China
- Shaochen Shi
- ByteDance Research, Zhonghang Plaza, No. 43, North 3rd Ring West Road, Haidian District, Beijing, People’s Republic of China
- Haotian Ye
- Yuanpei College, Peking University, Beijing 100871, People’s Republic of China
- Ping Tuo
- AI for Science Institute, Beijing 100080, People’s Republic of China
- Jiabin Yang
- Baidu, Inc., Beijing, People’s Republic of China
- Yifan Li
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, USA
- Qiyu Zeng
- Department of Physics, National University of Defense Technology, Changsha, Hunan 410073, People’s Republic of China
- Yu Xia
- ByteDance Research, Zhonghang Plaza, No. 43, North 3rd Ring West Road, Haidian District, Beijing, People’s Republic of China
- Koki Muraoka
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
- Yibo Wang
- DP Technology, Beijing 100080, People’s Republic of China
- Fengbo Yuan
- DP Technology, Beijing 100080, People’s Republic of China
- Sigbjørn Løland Bore
- Hylleraas Centre for Quantum Molecular Sciences and Department of Chemistry, University of Oslo, P.O. Box 1033 Blindern, 0315 Oslo, Norway
- Yinnian Lin
- Wangxuan Institute of Computer Technology, Peking University, Beijing 100871, People’s Republic of China
- Bo Wang
- Shanghai Engineering Research Center of Molecular Therapeutics and New Drug Development, Shanghai Key Laboratory of Green Chemistry and Chemical Process, School of Chemistry and Molecular Engineering, East China Normal University, Shanghai 200062, People’s Republic of China
- Jiayan Xu
- School of Chemistry and Chemical Engineering, Queen’s University Belfast, Belfast BT9 5AG, United Kingdom
- Jia-Xin Zhu
- State Key Laboratory of Physical Chemistry of Solid Surfaces, iChEM, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, People’s Republic of China
- Chenxing Luo
- Department of Applied Physics and Applied Mathematics, Columbia University, New York, New York 10027, USA
| | - Yuzhi Zhang
- DP Technology, Beijing 100080, People’s Republic of China
| | | | - Wenshuo Liang
- DP Technology, Beijing 100080, People’s Republic of China
| | - Anurag Kumar Singh
- Department of Data Science, Indian Institute of Technology, Palakkad, Kerala, India
| | - Sikai Yao
- DP Technology, Beijing 100080, People’s Republic of China
| | - Jingchao Zhang
- NVIDIA AI Technology Center (NVAITC), Santa Clara, California 95051, USA
| | | | - Jiequn Han
- Center for Computational Mathematics, Flatiron Institute, New York, New York 10010, USA
| | - Jie Liu
- College of Electrical and Information Engineering, Hunan University, Changsha, People’s Republic of China
| | | | - Darrin M. York
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, New Jersey 08854, USA
| | | | - Roberto Car
- Department of Chemistry, Princeton University, Princeton, New Jersey 08544, USA
| | | | - Han Wang
- Author to whom correspondence should be addressed:
|
26
|
Darby JP, Kovács DP, Batatia I, Caro MA, Hart GLW, Ortner C, Csányi G. Tensor-Reduced Atomic Density Representations. Phys Rev Lett 2023; 131:028001. [PMID: 37505943 DOI: 10.1103/physrevlett.131.028001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2022] [Accepted: 04/18/2023] [Indexed: 07/30/2023]
Abstract
Density-based representations of atomic environments that are invariant under Euclidean symmetries have become a widely used tool in the machine learning of interatomic potentials, broader data-driven atomistic modeling, and the visualization and analysis of material datasets. The standard mechanism used to incorporate chemical element information is to create separate densities for each element and form tensor products between them. This leads to a steep scaling in the size of the representation as the number of elements increases. Graph neural networks, which do not explicitly use density representations, escape this scaling by mapping the chemical element information into a fixed dimensional space in a learnable way. By exploiting symmetry, we recast this approach as tensor factorization of the standard neighbour-density-based descriptors and, using a new notation, identify connections to existing compression algorithms. In doing so, we form compact tensor-reduced representation of the local atomic environment whose size does not depend on the number of chemical elements, is systematically convergable, and therefore remains applicable to a wide range of data analysis and regression tasks.
Affiliation(s)
- James P Darby
- Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick, Coventry, CV4 7AL, United Kingdom
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom
| | - Dávid P Kovács
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom
| | - Ilyes Batatia
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom
- ENS Paris-Saclay, Université Paris-Saclay, 91190 Gif-sur-Yvette, France
| | - Miguel A Caro
- Department of Electrical Engineering and Automation, Aalto University, FIN-02150 Espoo, Finland
| | - Gus L W Hart
- Department of Physics and Astronomy, Brigham Young University, Provo, Utah, 84602, USA
| | - Christoph Ortner
- Department of Mathematics, University of British Columbia, 1984 Mathematics Road, Vancouver, British Columbia, Canada V6T 1Z2
| | - Gábor Csányi
- Engineering Laboratory, University of Cambridge, Cambridge, CB2 1PZ, United Kingdom
|
27
|
Eckhoff M, Reiher M. Lifelong Machine Learning Potentials. J Chem Theory Comput 2023; 19:3509-3525. [PMID: 37288932 PMCID: PMC10308836 DOI: 10.1021/acs.jctc.3c00279] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 03/10/2023] [Indexed: 06/09/2023]
Abstract
Machine learning potentials (MLPs) trained on accurate quantum chemical data can retain the high accuracy, while inflicting little computational demands. On the downside, they need to be trained for each individual system. In recent years, a vast number of MLPs have been trained from scratch because learning additional data typically requires retraining on all data to not forget previously acquired knowledge. Additionally, most common structural descriptors of MLPs cannot represent efficiently a large number of different chemical elements. In this work, we tackle these problems by introducing element-embracing atom-centered symmetry functions (eeACSFs), which combine structural properties and element information from the periodic table. These eeACSFs are key for our development of a lifelong machine learning potential (lMLP). Uncertainty quantification can be exploited to transgress a fixed, pretrained MLP to arrive at a continuously adapting lMLP, because a predefined level of accuracy can be ensured. To extend the applicability of an lMLP to new systems, we apply continual learning strategies to enable autonomous and on-the-fly training on a continuous stream of new data. For the training of deep neural networks, we propose the continual resilient (CoRe) optimizer and incremental learning strategies relying on rehearsal of data, regularization of parameters, and the architecture of the model.
Affiliation(s)
- Marco Eckhoff
- ETH Zürich, Departement Chemie und Angewandte Biowissenschaften, 8093 Zürich, Switzerland
| | - Markus Reiher
- ETH Zürich, Departement Chemie und Angewandte Biowissenschaften, 8093 Zürich, Switzerland
|
28
|
Jaffrelot Inizan T, Plé T, Adjoua O, Ren P, Gökcan H, Isayev O, Lagardère L, Piquemal JP. Scalable hybrid deep neural networks/polarizable potentials biomolecular simulations including long-range effects. Chem Sci 2023; 14:5438-5452. [PMID: 37234902 PMCID: PMC10208042 DOI: 10.1039/d2sc04815a] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 08/30/2022] [Accepted: 04/03/2023] [Indexed: 07/28/2023] Open
Abstract
Deep-HP is a scalable extension of the Tinker-HP multi-GPU molecular dynamics (MD) package enabling the use of Pytorch/TensorFlow Deep Neural Network (DNN) models. Deep-HP increases DNNs' MD capabilities by orders of magnitude offering access to ns simulations for 100k-atom biosystems while offering the possibility of coupling DNNs to any classical (FFs) and many-body polarizable (PFFs) force fields. It allows therefore the introduction of the ANI-2X/AMOEBA hybrid polarizable potential designed for ligand binding studies where solvent-solvent and solvent-solute interactions are computed with the AMOEBA PFF while solute-solute ones are computed by the ANI-2X DNN. ANI-2X/AMOEBA explicitly includes AMOEBA's physical long-range interactions via an efficient Particle Mesh Ewald implementation while preserving ANI-2X's solute short-range quantum mechanical accuracy. The DNN/PFF partition can be user-defined allowing for hybrid simulations to include key ingredients of biosimulation such as polarizable solvents, polarizable counter ions, etc. ANI-2X/AMOEBA is accelerated using a multiple-timestep strategy focusing on the model's contributions to low-frequency modes of nuclear forces. It primarily evaluates AMOEBA forces while including ANI-2X ones only via correction-steps resulting in an order of magnitude acceleration over standard Velocity Verlet integration. Simulating more than 10 μs, we compute charged/uncharged ligand solvation free energies in 4 solvents, and absolute binding free energies of host-guest complexes from SAMPL challenges. ANI-2X/AMOEBA average errors are discussed in terms of statistical uncertainty and appear in the range of chemical accuracy compared to experiment. The availability of the Deep-HP computational platform opens the path towards large-scale hybrid DNN simulations, at force-field cost, in biophysics and drug discovery.
Affiliation(s)
- Théo Jaffrelot Inizan
- Sorbonne Université, Laboratoire de Chimie Théorique UMR 7616 CNRS Paris 75005 France
| | - Thomas Plé
- Sorbonne Université, Laboratoire de Chimie Théorique UMR 7616 CNRS Paris 75005 France
| | - Olivier Adjoua
- Sorbonne Université, Laboratoire de Chimie Théorique UMR 7616 CNRS Paris 75005 France
| | - Pengyu Ren
- Department of Biomedical Engineering, University of Texas at Austin Austin Texas USA
| | - Hatice Gökcan
- Department of Chemistry, Carnegie Mellon University Pittsburgh Pennsylvania USA
| | - Olexandr Isayev
- Department of Chemistry, Carnegie Mellon University Pittsburgh Pennsylvania USA
| | - Louis Lagardère
- Sorbonne Université, Laboratoire de Chimie Théorique UMR 7616 CNRS Paris 75005 France
- Sorbonne Université, Institut Parisien de Chimie Physique et Théorique FR 2622 CNRS Paris France
| | - Jean-Philip Piquemal
- Sorbonne Université, Laboratoire de Chimie Théorique UMR 7616 CNRS Paris 75005 France
- Department of Biomedical Engineering, University of Texas at Austin Austin Texas USA
|
29
|
Middleton C, Rankine CD, Penfold TJ. An on-the-fly deep neural network for simulating time-resolved spectroscopy: predicting the ultrafast ring opening dynamics of 1,2-dithiane. Phys Chem Chem Phys 2023; 25:13325-13334. [PMID: 37139551 DOI: 10.1039/d3cp00510k] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 05/05/2023]
Abstract
Revolutionary developments in ultrafast light source technology are enabling experimental spectroscopists to probe the structural dynamics of molecules and materials on the femtosecond timescale. The capacity to investigate ultrafast processes afforded by these resources accordingly inspires theoreticians to carry out high-level simulations which facilitate the interpretation of the underlying dynamics probed during these ultrafast experiments. In this Article, we implement a deep neural network (DNN) to convert excited-state molecular dynamics simulations into time-resolved spectroscopic signals. Our DNN is trained on-the-fly from first-principles theoretical data obtained from a set of time-evolving molecular dynamics. The train-test process iterates for each time-step of the dynamics data until the network can predict spectra with sufficient accuracy to replace the computationally intensive quantum chemistry calculations required to produce them, at which point it simulates the time-resolved spectra for longer timescales. The potential of this approach is demonstrated by probing dynamics of the ring opening of 1,2-dithiane using sulphur K-edge X-ray absorption spectroscopy. The benefits of this strategy will be more markedly apparent for simulations of larger systems which will exhibit a more notable computational burden, making this approach applicable to the study of a diverse range of complex chemical dynamics.
Affiliation(s)
- Clelia Middleton
- Chemistry - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK.
| | - Conor D Rankine
- Chemistry - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK.
- Department of Chemistry, University of York, York, YO10 5DD, UK
| | - Thomas J Penfold
- Chemistry - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, NE1 7RU, UK.
|
30
|
Saldinger JC, Raymond M, Elvati P, Violi A. Domain-agnostic predictions of nanoscale interactions in proteins and nanoparticles. Nat Comput Sci 2023; 3:393-402. [PMID: 38177838 DOI: 10.1038/s43588-023-00438-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/02/2022] [Accepted: 03/24/2023] [Indexed: 01/06/2024]
Abstract
Although challenging, the accurate and rapid prediction of nanoscale interactions has broad applications for numerous biological processes and material properties. While several models have been developed to predict the interaction of specific biological components, they use system-specific information that hinders their application to more general materials. Here we present NeCLAS, a general and efficient machine learning pipeline that predicts the location of nanoscale interactions, providing human-intelligible predictions. NeCLAS outperforms current nanoscale prediction models for generic nanoparticles up to 10-20 nm, reproducing interactions for biological and non-biological systems. Two aspects contribute to these results: a low-dimensional representation of nanoparticles and molecules (to reduce the effect of data uncertainty), and environmental features (to encode the physicochemical neighborhood at multiple scales). This framework has several applications, from basic research to rapid prototyping and design in nanobiotechnology.
Affiliation(s)
| | - Matt Raymond
- Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA
| | - Paolo Elvati
- Mechanical Engineering, University of Michigan, Ann Arbor, MI, USA
| | - Angela Violi
- Chemical Engineering, University of Michigan, Ann Arbor, MI, USA.
- Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA.
- Mechanical Engineering, University of Michigan, Ann Arbor, MI, USA.
- Biophysics Program, University of Michigan, Ann Arbor, MI, USA.
|
31
|
Hamilton BW, Yoo P, Sakano MN, Islam MM, Strachan A. High-pressure and temperature neural network reactive force field for energetic materials. J Chem Phys 2023; 158:144117. [PMID: 37061473 DOI: 10.1063/5.0146055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2023] Open
Abstract
Reactive force fields for molecular dynamics have enabled a wide range of studies in numerous material classes. These force fields are computationally inexpensive compared with electronic structure calculations and allow for simulations of millions of atoms. However, the accuracy of traditional force fields is limited by their functional forms, preventing continual refinement and improvement. Therefore, we develop a neural network-based reactive interatomic potential for the prediction of the mechanical, thermal, and chemical responses of energetic materials at extreme conditions. The training set is expanded in an automatic iterative approach and consists of various CHNO materials and their reactions under ambient and shock-loading conditions. This new potential shows improved accuracy over the current state-of-the-art force fields for a wide range of properties such as detonation performance, decomposition product formation, and vibrational spectra under ambient and shock-loading conditions.
Affiliation(s)
- Brenden W Hamilton
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
| | - Pilsun Yoo
- Computational Science and Engineering Division, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, Tennessee 37830, USA
| | - Michael N Sakano
- Sandia National Laboratories, Albuquerque, New Mexico 87123, USA
| | - Md Mahbubul Islam
- Department of Mechanical Engineering, Wayne State University, Detroit, Michigan 48202, USA
| | - Alejandro Strachan
- School of Materials Engineering and Birck Nanotechnology Center, Purdue University, West Lafayette, Indiana 47907, USA
|
32
|
Ricci E, Vergadou N. Integrating Machine Learning in the Coarse-Grained Molecular Simulation of Polymers. J Phys Chem B 2023; 127:2302-2322. [PMID: 36888553 DOI: 10.1021/acs.jpcb.2c06354] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 03/09/2023]
Abstract
Machine learning (ML) is having an increasing impact on the physical sciences, engineering, and technology and its integration into molecular simulation frameworks holds great potential to expand their scope of applicability to complex materials and facilitate fundamental knowledge and reliable property predictions, contributing to the development of efficient materials design routes. The application of ML in materials informatics in general, and polymer informatics in particular, has led to interesting results, however great untapped potential lies in the integration of ML techniques into the multiscale molecular simulation methods for the study of macromolecular systems, specifically in the context of Coarse Grained (CG) simulations. In this Perspective, we aim at presenting the pioneering recent research efforts in this direction and discussing how these new ML-based techniques can contribute to critical aspects of the development of multiscale molecular simulation methods for bulk complex chemical systems, especially polymers. Prerequisites for the implementation of such ML-integrated methods and open challenges that need to be met toward the development of general systematic ML-based coarse graining schemes for polymers are discussed.
Affiliation(s)
- Eleonora Ricci
- Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
- Institute of Informatics and Telecommunications, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
| | - Niki Vergadou
- Institute of Nanoscience and Nanotechnology, National Center for Scientific Research "Demokritos", GR-15341 Agia Paraskevi, Athens, Greece
|
33
|
Gedam SP, Chiriki S, Padmavathi D. Advanced machine learning based global optimizations for Pt nanoclusters. J Indian Chem Soc 2023. [DOI: 10.1016/j.jics.2023.100978] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/31/2023]
|
34
|
Zeng J, Tao Y, Giese TJ, York DM. QDπ: A Quantum Deep Potential Interaction Model for Drug Discovery. J Chem Theory Comput 2023; 19:1261-1275. [PMID: 36696673 PMCID: PMC9992268 DOI: 10.1021/acs.jctc.2c01172] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Indexed: 01/27/2023]
Abstract
We report QDπ-v1.0 for modeling the internal energy of drug molecules containing H, C, N, and O atoms. The QDπ model is in the form of a quantum mechanical/machine learning potential correction (QM/Δ-MLP) that uses a fast third-order self-consistent density-functional tight-binding (DFTB3/3OB) model that is corrected to a quantitatively high-level of accuracy through a deep-learning potential (DeepPot-SE). The model has the advantage that it is able to properly treat electrostatic interactions and handle changes in charge/protonation states. The model is trained against reference data computed at the ωB97X/6-31G* level (as in the ANI-1x data set) and compared to several other approximate semiempirical and machine learning potentials (ANI-1x, ANI-2x, DFTB3, MNDO/d, AM1, PM6, GFN1-xTB, and GFN2-xTB). The QDπ model is demonstrated to be accurate for a wide range of intra- and intermolecular interactions (despite its intended use as an internal energy model) and has shown to perform exceptionally well for relative protonation/deprotonation energies and tautomers. An example application to model reactions involved in RNA strand cleavage catalyzed by protein and nucleic acid enzymes illustrates QDπ has average errors less than 0.5 kcal/mol, whereas the other models compared have errors over an order of magnitude greater. Taken together, this makes QDπ highly attractive as a potential force field model for drug discovery.
Affiliation(s)
- Jinzhe Zeng
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA
| | - Yujun Tao
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA
| | - Timothy J. Giese
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA
| | - Darrin M. York
- Laboratory for Biomolecular Simulation Research, Institute for Quantitative Biomedicine and Department of Chemistry and Chemical Biology, Rutgers University, Piscataway, NJ 08854, USA
|
35
|
Käser S, Vazquez-Salazar LI, Meuwly M, Töpfer K. Neural network potentials for chemistry: concepts, applications and prospects. Digit Discov 2023; 2:28-58. [PMID: 36798879 PMCID: PMC9923808 DOI: 10.1039/d2dd00102k] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/23/2022] [Accepted: 12/20/2022] [Indexed: 12/24/2022]
Abstract
Artificial Neural Networks (NN) are already heavily involved in methods and applications for frequent tasks in the field of computational chemistry such as representation of potential energy surfaces (PES) and spectroscopic predictions. This perspective provides an overview of the foundations of neural network-based full-dimensional potential energy surfaces, their architectures, underlying concepts, their representation and applications to chemical systems. Methods for data generation and training procedures for PES construction are discussed and means for error assessment and refinement through transfer learning are presented. A selection of recent results illustrates the latest improvements regarding accuracy of PES representations and system size limitations in dynamics simulations, but also NN application enabling direct prediction of physical results without dynamics simulations. The aim is to provide an overview for the current state-of-the-art NN approaches in computational chemistry and also to point out the current challenges in enhancing reliability and applicability of NN methods on a larger scale.
Affiliation(s)
- Silvan Käser
- Department of Chemistry, University of Basel Klingelbergstrasse 80 CH-4056 Basel Switzerland
| | | | - Markus Meuwly
- Department of Chemistry, University of Basel Klingelbergstrasse 80 CH-4056 Basel Switzerland
| | - Kai Töpfer
- Department of Chemistry, University of Basel Klingelbergstrasse 80 CH-4056 Basel Switzerland
|
36
|
Yao S, Van R, Pan X, Park JH, Mao Y, Pu J, Mei Y, Shao Y. Machine learning based implicit solvent model for aqueous-solution alanine dipeptide molecular dynamics simulations. RSC Adv 2023; 13:4565-4577. [PMID: 36760282 PMCID: PMC9900604 DOI: 10.1039/d2ra08180f] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 12/23/2022] [Accepted: 01/20/2023] [Indexed: 02/05/2023] Open
Abstract
Inspired by the recent work from Noé and coworkers on the development of machine learning based implicit solvent model for the simulation of solvated peptides [Chen et al., J. Chem. Phys., 2021, 155, 084101], here we report another investigation of the possibility of using machine learning (ML) techniques to "derive" an implicit solvent model directly from explicit solvent molecular dynamics (MD) simulations. For alanine dipeptide, a machine learning potential (MLP) based on the DeepPot-SE representation of the molecule was trained to capture its interactions with its average solvent environment configuration (ASEC). The predicted forces on the solute deviated only by an RMSD of 0.4 kcal mol⁻¹ Å⁻¹ from the reference values, and the MLP-based free energy surface differed from that obtained from explicit solvent MD simulations by an RMSD of less than 0.9 kcal mol⁻¹. Our MLP training protocol could also accurately reproduce combined quantum mechanical molecular mechanical (QM/MM) forces on the quantum mechanical (QM) solute in ASEC environment, thus enabling the development of accurate ML-based implicit solvent models for ab initio-QM MD simulations. Such ML-based implicit solvent models for QM calculations are cost-effective in both the training stage, where the use of ASEC reduces the number of data points to be labelled, and the inference stage, where the MLP can be evaluated at a relatively small additional cost on top of the QM calculation of the solute.
Affiliation(s)
- Songyuan Yao
- Department of Chemistry and Biochemistry, University of Oklahoma Norman OK 73019 USA
| | - Richard Van
- Department of Chemistry and Biochemistry, University of Oklahoma Norman OK 73019 USA
| | - Xiaoliang Pan
- Department of Chemistry and Biochemistry, University of Oklahoma Norman OK 73019 USA
| | - Ji Hwan Park
- School of Computer Science, University of Oklahoma Norman OK 73019 USA
| | - Yuezhi Mao
- Department of Chemistry and Biochemistry, San Diego State University San Diego CA 92182 USA
| | - Jingzhi Pu
- Department of Chemistry and Chemical Biology, Indiana University-Purdue University Indianapolis Indianapolis IN 46202 USA
| | - Ye Mei
- State Key Laboratory of Precision Spectroscopy, School of Physics and Electronic Science, East China Normal University Shanghai 200062 China
- NYU-ECNU Center for Computational Chemistry at NYU Shanghai Shanghai 200062 China
- Collaborative Innovation Center of Extreme Optics, Shanxi University Taiyuan Shanxi 030006 China
| | - Yihan Shao
- Department of Chemistry and Biochemistry, University of Oklahoma Norman OK 73019 USA
|
37
|
Kříž K, Schmidt L, Andersson AT, Walz MM, van der Spoel D. An Imbalance in the Force: The Need for Standardized Benchmarks for Molecular Simulation. J Chem Inf Model 2023; 63:412-431. [PMID: 36630710 PMCID: PMC9875315 DOI: 10.1021/acs.jcim.2c01127] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/06/2022] [Indexed: 01/12/2023]
Abstract
Force fields (FFs) for molecular simulation have been under development for more than half a century. As with any predictive model, rigorous testing and comparisons of models critically depends on the availability of standardized data sets and benchmarks. While such benchmarks are rather common in the fields of quantum chemistry, this is not the case for empirical FFs. That is, few benchmarks are reused to evaluate FFs, and development teams rather use their own training and test sets. Here we present an overview of currently available tests and benchmarks for computational chemistry, focusing on organic compounds, including halogens and common ions, as FFs for these are the most common ones. We argue that many of the benchmark data sets from quantum chemistry can in fact be reused for evaluating FFs, but new gas phase data is still needed for compounds containing phosphorus and sulfur in different valence states. In addition, more nonequilibrium interaction energies and forces, as well as molecular properties such as electrostatic potentials around compounds, would be beneficial. For the condensed phases there is a large body of experimental data available, and tools to utilize these data in an automated fashion are under development. If FF developers, as well as researchers in artificial intelligence, would adopt a number of these data sets, it would become easier to compare the relative strengths and weaknesses of different models and to, eventually, restore the balance in the force.
Affiliation(s)
- Kristian Kříž
- Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
| | - Lisa Schmidt
- Faculty of Biosciences, University of Heidelberg, Heidelberg 69117, Germany
| | - Alfred T. Andersson
- Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
| | - Marie-Madeleine Walz
- Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
| | - David van der Spoel
- Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
|
38
|
Sumaria V, Nguyen L, Tao FF, Sautet P. Atomic-Scale Mechanism of Platinum Catalyst Restructuring under a Pressure of Reactant Gas. J Am Chem Soc 2023; 145:392-401. [PMID: 36548635 DOI: 10.1021/jacs.2c10179] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 12/24/2022]
Abstract
Heterogeneous catalysis is key for chemical transformations. Understanding how catalysts' active sites dynamically evolve at the atomic scale under reaction conditions is a prerequisite for accurately determining catalytic mechanisms and predictably developing catalysts. We combine in situ time-dependent scanning tunneling microscopy observations and machine-learning-accelerated first-principles atomistic simulations to uncover the mechanism of restructuring of Pt catalysts under a pressure of carbon monoxide (CO). We show that a high CO coverage at a Pt step edge triggers the formation of atomic protrusions of low-coordination Pt atoms, which then detach from the step edge to create sub-nano-islands on the terraces, where under-coordinated sites are stabilized by the CO adsorbates. The fast and accurate machine-learning potential is key to enabling the exploration of tens of thousands of configurations for the CO-covered restructuring catalyst. These studies open an avenue to achieve an atomic-scale understanding of the structural dynamics of more complex metal nanoparticle catalysts under reaction conditions.
Affiliation(s)
- Vaidish Sumaria
- Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, California 90094, United States
| | - Luan Nguyen
- Department of Chemical and Petroleum Engineering, University of Kansas, Lawrence, Kansas 66045, United States
| | - Franklin Feng Tao
- Department of Chemical and Petroleum Engineering, University of Kansas, Lawrence, Kansas 66045, United States
| | - Philippe Sautet
- Department of Chemical and Biomolecular Engineering, University of California, Los Angeles, California 90094, United States
- Department of Chemistry and Biochemistry, University of California, Los Angeles, California 90094, United States
| |
|
39
|
Zha S, Sharapa DI, Liu S, Zhao ZJ, Studt F. Modeling CoCu Nanoparticles Using Neural Network-Accelerated Monte Carlo Simulations. J Phys Chem A 2022; 126:9440-9446. [PMID: 36512375 DOI: 10.1021/acs.jpca.2c07888] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022]
Abstract
The correct description of catalytic reactions happening on bimetallic particles is not feasible without proper accounting of the segregation process. In this study, we tried to shed light on the structure of large CoCu particles, for which quite controversial results were published before. However, density functional theory (DFT) is challenging to be directly used for the systematic study of nanometer-sized particles. Therefore, we constructed a neural network-based potential and further applied it to the Monte Carlo simulations for the description of the segregation phenomenon. The resulting approach shows high efficiency and can be used in systems with thousands of atoms. The accuracy and transferability of the model to other sizes and compositions make this methodology useful for solving segregation problems.
Affiliation(s)
- Shenjun Zha
- Institute of Catalysis Research and Technology, Karlsruhe Institute of Technology, Hermann-von-Helmholtz Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
| | - Dmitry I Sharapa
- Institute of Catalysis Research and Technology, Karlsruhe Institute of Technology, Hermann-von-Helmholtz Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
| | - Sihang Liu
- Key Laboratory for Green Chemical Technology of Ministry of Education, School of Chemical Engineering and Technology, Collaborative Innovation Center of Chemical Science and Engineering, Tianjin University, Tianjin 300072, China
| | - Zhi-Jian Zhao
- Key Laboratory for Green Chemical Technology of Ministry of Education, School of Chemical Engineering and Technology, Collaborative Innovation Center of Chemical Science and Engineering, Tianjin University, Tianjin 300072, China
| | - Felix Studt
- Institute of Catalysis Research and Technology, Karlsruhe Institute of Technology, Hermann-von-Helmholtz Platz 1, 76344 Eggenstein-Leopoldshafen, Germany
- Institute for Chemical Technology and Polymer Chemistry, Karlsruhe Institute of Technology, Engesserstrasse 18, 76131 Karlsruhe, Germany
| |
|
40
|
Browning NJ, Faber FA, von Lilienfeld OA. GPU-accelerated approximate kernel method for quantum machine learning. J Chem Phys 2022; 157:214801. [PMID: 36511559 DOI: 10.1063/5.0108967] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 11/15/2022] Open
Abstract
We introduce Quantum Machine Learning (QML)-Lightning, a PyTorch package containing graphics processing unit (GPU)-accelerated approximate kernel models, which can yield trained models within seconds. QML-Lightning includes a cost-efficient GPU implementation of FCHL19, which together can provide energy and force predictions with competitive accuracy on a microsecond per atom timescale. Using modern GPU hardware, we report learning curves of energies and forces as well as timings as numerical evidence for select legacy benchmarks from atomistic simulation including QM9, MD-17, and 3BPA.
Affiliation(s)
- Nicholas J Browning
- Institute of Physical Chemistry and National Center for Computational Design and Discovery of Novel Materials, Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
| | - Felix A Faber
- Department of Physics, University of Cambridge, Cambridge, United Kingdom
| | | |
|
41
|
Zhang Y, Lin Q, Jiang B. Atomistic neural network representations for chemical dynamics simulations of molecular, condensed phase, and interfacial systems: Efficiency, representability, and generalization. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Affiliation(s)
- Yaolong Zhang
- Department of Chemical Physics, School of Chemistry and Materials Science, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, China
| | - Qidong Lin
- Department of Chemical Physics, School of Chemistry and Materials Science, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, China
| | - Bin Jiang
- Department of Chemical Physics, School of Chemistry and Materials Science, Key Laboratory of Surface and Interface Chemistry and Energy Catalysis of Anhui Higher Education Institutes, University of Science and Technology of China, Hefei, Anhui, China
| |
|
42
|
Mudassir MW, Goverapet Srinivasan S, Mynam M, Rai B. Systematic Identification of Atom-Centered Symmetry Functions for the Development of Neural Network Potentials. J Phys Chem A 2022; 126:8337-8347. [DOI: 10.1021/acs.jpca.2c04508] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Affiliation(s)
| | | | - Mahesh Mynam
- TCS Research, Tata Consultancy Services Ltd., Pune 411013, India
| | - Beena Rai
- TCS Research, Tata Consultancy Services Ltd., Pune 411013, India
| |
|
43
|
Density-of-states similarity descriptor for unsupervised learning from materials data. Sci Data 2022; 9:646. [PMID: 36273207 DOI: 10.1038/s41597-022-01754-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2022] [Accepted: 10/10/2022] [Indexed: 11/08/2022] Open
Abstract
We develop a materials descriptor based on the electronic density-of-states (DOS) and investigate the similarity of materials based on it. As an application example, we study the Computational 2D Materials Database (C2DB) that hosts thousands of two-dimensional materials with their properties calculated by density-functional theory. Combining our descriptor with a clustering algorithm, we identify groups of materials with similar electronic structure. We introduce additional descriptors to characterize these clusters in terms of crystal structures, atomic compositions, and electronic configurations of their members. This allows us to rationalize the found (dis)similarities and to perform an automated exploratory and confirmatory analysis of the C2DB data. From this analysis, we find that the majority of clusters consist of isoelectronic materials sharing crystal symmetry, but we also identify outliers, i.e., materials whose similarity cannot be explained in this way.
|
44
|
Chen LY, Hsu TW, Hsiung TC, Li YP. Deep Learning-Based Increment Theory for Formation Enthalpy Predictions. J Phys Chem A 2022; 126:7548-7556. [PMID: 36217924 DOI: 10.1021/acs.jpca.2c04848] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Abstract
Machine learning predictions of molecular thermochemistry, such as formation enthalpy, have been limited for large and complicated species because of the lack of available training data. Such predictions would be important in the prediction of reaction thermodynamics and the construction of kinetic models. Herein, we introduce a graph-based deep learning approach that can separately learn the enthalpy contribution of each atom in its local environment with the effect of the overall molecular structure taken into account. Because this approach follows the additivity scheme of increment theory, it can be generalized to larger and more complicated species not present in the training data. By training the model on molecules with up to 11 heavy atoms, it can predict the formation enthalpy of testing molecules with up to 42 heavy atoms with a mean absolute error of 2 kcal/mol, which is less than half of the error of the conventional increment theory. We expect that this approach will also enable rapid prediction of other extensive properties of large molecules that are difficult to derive from experiments or ab initio calculation.
Affiliation(s)
- Lung-Yi Chen
- Department of Chemical Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei 10617, Taiwan
| | - Ting-Wei Hsu
- Department of Chemical Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei 10617, Taiwan
| | - Tsai-Chen Hsiung
- Department of Chemical Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei 10617, Taiwan
| | - Yi-Pei Li
- Department of Chemical Engineering, National Taiwan University, No. 1, Sec. 4, Roosevelt Road, Taipei 10617, Taiwan
- Taiwan International Graduate Program on Sustainable Chemical Science and Technology (TIGP-SCST), Academia Sinica, No. 128, Sec. 2, Academia Road, Taipei 11529, Taiwan
| |
|
45
|
Penfold TJ, Rankine CD. A deep neural network for valence-to-core X-ray emission spectroscopy. Mol Phys 2022. [DOI: 10.1080/00268976.2022.2123406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
Affiliation(s)
- T. J. Penfold
- Chemistry–School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, UK
| | - C. D. Rankine
- Chemistry–School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne, UK
| |
|
46
|
Avery C, Patterson J, Grear T, Frater T, Jacobs DJ. Protein Function Analysis through Machine Learning. Biomolecules 2022; 12:1246. [PMID: 36139085 PMCID: PMC9496392 DOI: 10.3390/biom12091246] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/16/2022] [Revised: 08/22/2022] [Accepted: 08/31/2022] [Indexed: 11/16/2022] Open
Abstract
Machine learning (ML) has been an important arsenal in computational biology used to elucidate protein function for decades. With the recent burgeoning of novel ML methods and applications, new ML approaches have been incorporated into many areas of computational biology dealing with protein function. We examine how ML has been integrated into a wide range of computational models to improve prediction accuracy and gain a better understanding of protein function. The applications discussed are protein structure prediction, protein engineering using sequence modifications to achieve stability and druggability characteristics, molecular docking in terms of protein-ligand binding, including allosteric effects, protein-protein interactions and protein-centric drug discovery. To quantify the mechanisms underlying protein function, a holistic approach that takes structure, flexibility, stability, and dynamics into account is required, as these aspects become inseparable through their interdependence. Another key component of protein function is conformational dynamics, which often manifest as protein kinetics. Computational methods that use ML to generate representative conformational ensembles and quantify differences in conformational ensembles important for function are included in this review. Future opportunities are highlighted for each of these topics.
Affiliation(s)
- Chris Avery
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
| | - John Patterson
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
| | - Tyler Grear
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
- Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
| | - Theodore Frater
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
| | - Donald J. Jacobs
- Department of Physics and Optical Science, University of North Carolina at Charlotte, Charlotte, NC 28223, USA
| |
|
47
|
Kiss O, Tacchino F, Vallecorsa S, Tavernelli I. Quantum neural networks force fields generation. Machine Learning: Science and Technology 2022. [DOI: 10.1088/2632-2153/ac7d3c] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/11/2022] Open
Abstract
Accurate molecular force fields are of paramount importance for the efficient implementation of molecular dynamics techniques at large scales. In the last decade, machine learning (ML) methods have demonstrated impressive performances in predicting accurate values for energy and forces when trained on finite size ensembles generated with ab initio techniques. At the same time, quantum computers have recently started to offer new viable computational paradigms to tackle such problems. On the one hand, quantum algorithms may notably be used to extend the reach of electronic structure calculations. On the other hand, quantum ML is also emerging as an alternative and promising path to quantum advantage. Here we follow this second route and establish a direct connection between classical and quantum solutions for learning neural network (NN) potentials. To this end, we design a quantum NN architecture and apply it successfully to different molecules of growing complexity. The quantum models exhibit larger effective dimension with respect to classical counterparts and can reach competitive performances, thus pointing towards potential quantum advantages in natural science applications via quantum ML.
|
48
|
Zhang X, Tian Y, Chen L, Hu X, Zhou Z. Machine Learning: A New Paradigm in Computational Electrocatalysis. J Phys Chem Lett 2022; 13:7920-7930. [PMID: 35980765 DOI: 10.1021/acs.jpclett.2c01710] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 06/15/2023]
Abstract
Designing and screening novel electrocatalysts, understanding electrocatalytic mechanisms at an atomic level, and uncovering scientific insights lie at the center of the development of electrocatalysis. Despite certain success in experiments and computations, it is still difficult to achieve the above objectives due to the complexity of electrocatalytic systems and the vastness of the chemical space for candidate electrocatalysts. With the advantage of machine learning (ML) and increasing interest in electrocatalysis for energy conversion and storage, data-driven scientific research motivated by artificial intelligence (AI) has provided new opportunities to discover promising electrocatalysts, investigate dynamic reaction processes, and extract knowledge from huge data. In this Perspective, we summarize the recent applications of ML in electrocatalysis, including the screening of electrocatalysts and simulation of electrocatalytic processes. Furthermore, interpretable machine learning methods for electrocatalysis are discussed to accelerate knowledge generation. Finally, the blueprint of machine learning is envisaged for future development of electrocatalysis.
Affiliation(s)
- Xu Zhang
- School of Chemical Engineering, Zhengzhou University, Zhengzhou 450001, P. R. China
| | - Yun Tian
- School of Chemical Engineering, Zhengzhou University, Zhengzhou 450001, P. R. China
| | - Letian Chen
- School of Materials Science and Engineering, Institute of New Energy Material Chemistry, Renewable Energy Conversion and Storage Center (ReCast), Key Laboratory of Advanced Energy Chemistry (Ministry of Education), Nankai University, Tianjin 300350, P. R. China
| | - Xu Hu
- School of Materials Science and Engineering, Institute of New Energy Material Chemistry, Renewable Energy Conversion and Storage Center (ReCast), Key Laboratory of Advanced Energy Chemistry (Ministry of Education), Nankai University, Tianjin 300350, P. R. China
| | - Zhen Zhou
- School of Chemical Engineering, Zhengzhou University, Zhengzhou 450001, P. R. China
- School of Materials Science and Engineering, Institute of New Energy Material Chemistry, Renewable Energy Conversion and Storage Center (ReCast), Key Laboratory of Advanced Energy Chemistry (Ministry of Education), Nankai University, Tianjin 300350, P. R. China
| |
|
49
|
Fan Z, Wang Y, Ying P, Song K, Wang J, Wang Y, Zeng Z, Xu K, Lindgren E, Rahm JM, Gabourie AJ, Liu J, Dong H, Wu J, Chen Y, Zhong Z, Sun J, Erhart P, Su Y, Ala-Nissila T. GPUMD: A package for constructing accurate machine-learned potentials and performing highly efficient atomistic simulations. J Chem Phys 2022; 157:114801. [DOI: 10.1063/5.0106617] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/14/2022] Open
Abstract
We present our latest advancements of machine-learned potentials (MLPs) based on the neuroevolution potential (NEP) framework introduced in [Fan et al., Phys. Rev. B 104, 104309 (2021)] and their implementation in the open-source package GPUMD. We increase the accuracy of NEP models both by improving the radial functions in the atomic-environment descriptor using a linear combination of Chebyshev basis functions and by extending the angular descriptor with some four-body and five-body contributions as in the atomic cluster expansion approach. We also detail our efficient implementation of the NEP approach in graphics processing units as well as our workflow for the construction of NEP models, and we demonstrate their application in large-scale atomistic simulations. By comparing to state-of-the-art MLPs, we show that the NEP approach not only achieves above-average accuracy but also is far more computationally efficient. These results demonstrate that the GPUMD package is a promising tool for solving challenging problems requiring highly accurate, large-scale atomistic simulations. To enable the construction of MLPs using a minimal training set, we propose an active-learning scheme based on the latent space of a pre-trained NEP model. Finally, we introduce three separate Python packages, GPYUMD, CALORINE, and PYNEP, which enable the integration of GPUMD into Python workflows.
Affiliation(s)
- Zheyong Fan
- School of Mathematics and Physics, Bohai University, China
| | | | - Penghua Ying
- School of Science, Harbin Institute of Technology Shenzhen, China
| | - Keke Song
- University of Science and Technology Beijing, China
| | | | | | | | - Ke Xu
- Xiamen University, China
| | | | | | | | - Jiahui Liu
- University of Science and Technology Beijing, China
| | | | - Jianyang Wu
- Department of Physics, Xiamen University, China
| | - Yue Chen
- Department of Mechanical Engineering, University of Hong Kong, Hong Kong
| | - Zheng Zhong
- Harbin Institute of Technology, Shenzhen, China
| | - Jian Sun
- Department of Physics and National Laboratory of Solid State Microstructures, Nanjing University, China
| | | | - Yanjing Su
- Corrosion and Protection Center, Key Laboratory for Environmental Fracture (MOE), University of Science and Technology Beijing, China
| | - Tapio Ala-Nissila
- Department of Applied Physics, Aalto University, Finland
| |
|
50
|
Kim K, Dive A, Grieder A, Adelstein N, Kang S, Wan LF, Wood BC. Flexible machine-learning interatomic potential for simulating structural disordering behavior of Li7La3Zr2O12 solid electrolytes. J Chem Phys 2022; 156:221101. [PMID: 35705400 DOI: 10.1063/5.0090341] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022] Open
Abstract
Batteries based on solid-state electrolytes, including Li7La3Zr2O12 (LLZO), promise improved safety and increased energy density; however, atomic disorder at grain boundaries and phase boundaries can severely deteriorate their performance. Machine-learning (ML) interatomic potentials offer a uniquely compelling solution for simulating chemical processes, rare events, and phase transitions associated with these complex interfaces by mixing high scalability with quantum-level accuracy, provided that they can be trained to properly address atomic disorder. To this end, we report the construction and validation of an ML potential that is specifically designed to simulate crystalline, disordered, and amorphous LLZO systems across a wide range of conditions. The ML model is based on a neural network algorithm and is trained using ab initio data. Performance tests prove that the developed ML potential can predict accurate structural and vibrational characteristics, elastic properties, and Li diffusivity of LLZO comparable to ab initio simulations. As a demonstration of its applicability to larger systems, we show that the potential can correctly capture grain boundary effects on diffusivity, as well as the thermal transition behavior of LLZO. These examples show that the ML potential enables simulations of transitions between well-defined and disordered structures with quantum-level accuracy at speeds thousands of times faster than ab initio methods.
Affiliation(s)
- Kwangnam Kim
- Laboratory for Energy Applications for the Future (LEAF), Lawrence Livermore National Laboratory, Livermore, California 94550-9234, USA
| | - Aniruddha Dive
- Laboratory for Energy Applications for the Future (LEAF), Lawrence Livermore National Laboratory, Livermore, California 94550-9234, USA
| | - Andrew Grieder
- Laboratory for Energy Applications for the Future (LEAF), Lawrence Livermore National Laboratory, Livermore, California 94550-9234, USA
| | - Nicole Adelstein
- Department of Chemistry and Biochemistry, San Francisco State University, San Francisco, California 94132-1740, USA
| | - ShinYoung Kang
- Laboratory for Energy Applications for the Future (LEAF), Lawrence Livermore National Laboratory, Livermore, California 94550-9234, USA
| | - Liwen F Wan
- Laboratory for Energy Applications for the Future (LEAF), Lawrence Livermore National Laboratory, Livermore, California 94550-9234, USA
| | - Brandon C Wood
- Laboratory for Energy Applications for the Future (LEAF), Lawrence Livermore National Laboratory, Livermore, California 94550-9234, USA
| |
|