1
Cheng Z, Bi H, Liu S, Chen J, Misquitta AJ, Yu K. Developing a Differentiable Long-Range Force Field for Proteins with E(3) Neural Network-Predicted Asymptotic Parameters. J Chem Theory Comput 2024; 20:5598-5608. [PMID: 38888427 DOI: 10.1021/acs.jctc.4c00337] [Indexed: 06/20/2024]
Abstract
Accurately describing long-range interactions is a significant challenge in molecular dynamics (MD) simulations of proteins. A high-quality long-range potential is also an important component of range-separated machine learning force fields. This study introduces a comprehensive asymptotic parameter database encompassing atomic multipole moments, polarizabilities, and dispersion coefficients. Leveraging active learning, our database comprehensively represents protein fragments with up to 8 heavy atoms, capturing their conformational diversity with merely 78,000 data points. Additionally, the E(3) neural network (E3NN) is employed to predict the asymptotic parameters directly from the local geometry. The E3NN models demonstrate exceptional accuracy and transferability across all asymptotic parameters, achieving an R² of 0.999 for both protein fragments and 20 amino acid dipeptide test sets. The long-range electrostatic and dispersion energies can be obtained using the E3NN-predicted parameters, with errors of 0.07 and 0.02 kcal/mol, respectively, when compared to symmetry-adapted perturbation theory (SAPT). Therefore, our force fields demonstrate the capability to accurately describe long-range interactions in proteins, paving the way for next-generation protein force fields.
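As a rough illustration of how per-atom asymptotic parameters translate into a long-range energy, the sketch below sums the leading -C6/r^6 dispersion term over intermolecular atom pairs. It is not the authors' implementation: the function name, the geometric-mean combination rule for pair coefficients, and the omission of damping and higher-order (C8, C10) terms are all assumptions made for illustration.

```python
import math

def dispersion_energy(mol_a, mol_b):
    """Leading-order dispersion energy between two molecules.

    mol_a, mol_b: lists of (position, c6) tuples, where position is an
    (x, y, z) tuple and c6 is a per-atom C6 coefficient.
    Inputs are assumed to use one consistent unit system.
    """
    energy = 0.0
    for pos_i, c6_i in mol_a:
        for pos_j, c6_j in mol_b:
            r = math.dist(pos_i, pos_j)
            # Assumed combination rule: geometric mean of per-atom C6 values.
            c6_ij = math.sqrt(c6_i * c6_j)
            # Undamped attractive term; real force fields usually damp this at short range.
            energy -= c6_ij / r**6
    return energy
```

In a range-separated force field, a term of this kind would only cover the long-range part; the short-range region would be handled by a separate model.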
Affiliation(s)
- Zheng Cheng
- School of Mathematical Sciences, Peking University, Beijing 100871, China
- AI for Science Institute, Beijing 100084, P. R. China
- Hangrui Bi
- School of Mathematical Sciences, Peking University, Beijing 100871, China
- DP Technology, Beijing 100080, P. R. China
- Siyuan Liu
- DP Technology, Beijing 100080, P. R. China
- Junmin Chen
- Tsinghua-Berkeley Shenzhen Institute, Shenzhen 518055, Guangdong, P. R. China
- Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China
- Alston J Misquitta
- School of Physics and Astronomy, Queen Mary, University of London, London E1 4NS, U.K.
- Kuang Yu
- Tsinghua-Berkeley Shenzhen Institute, Shenzhen 518055, Guangdong, P. R. China
- Tsinghua Shenzhen International Graduate School, Shenzhen 518055, Guangdong, P. R. China

2
Yang L, Guo Q, Zhang L. AI-assisted chemistry research: a comprehensive analysis of evolutionary paths and hotspots through knowledge graphs. Chem Commun (Camb) 2024; 60:6977-6987. [PMID: 38910536 DOI: 10.1039/d4cc01892c] [Indexed: 06/25/2024]
Abstract
Artificial intelligence (AI) offers transformative potential for chemical research through its ability to optimize reactions and processes, enhance energy efficiency, and reduce waste. AI-assisted chemical research (AI + chem) has become a global hotspot. To better understand the current research status of "AI + chem", this study conducted a bibliometric investigation using CiteSpace. The Web of Science Core Collection was used to retrieve original articles related to "AI + chem" published from 2000 to 2024. The obtained data allowed for the visualization of the knowledge background, current research status, and latest knowledge structure of "AI + chem". "AI + chem" has entered a stage of explosive growth, and the number of papers will maintain long-term high-speed growth. This article systematically analyzes the latest progress in "AI + chem" and objectively predicts future trends, including molecular design, reaction prediction, materials design, drug design, and quantum chemistry. The outcomes of this study will provide readers with a comprehensive understanding of the overall landscape of "AI + chem".
Affiliation(s)
- Lin Yang
- School of Intellectual Property, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China
- Qingle Guo
- School of Intellectual Property, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China
- Lijing Zhang
- School of Chemistry, Dalian University of Technology, Dalian 116024, Liaoning, P. R. China

3
Wang R, Zhang L, Li X, Zhu L, Xiang Z, Xu J, Xue D, Deng Z, Su X, Zou M. High-Performance Aluminum Fuels Induced by Monolayer Self-Assembly of Nano-Sized Energetic Fluoride Vesicles on the Surface. Adv Sci (Weinh) 2024; 11:e2401564. [PMID: 38704734 PMCID: PMC11234408 DOI: 10.1002/advs.202401564] [Received: 02/14/2024] [Revised: 04/17/2024] [Indexed: 05/07/2024]
Abstract
Surface modification is frequently used to solve the problems of poor combustion properties and agglomeration in aluminum-based fuels. However, due to the intrinsic incompatibility between the aluminum powder and the organic modifiers, the surface coating is usually uneven and disordered, which significantly deteriorates the uniformity and performance of the Al-based fuels. Herein, a new approach of monolayer nano-vesicular self-assembly is proposed to prepare high-performance Al fuels. The triblock copolymer G-F-G is produced by a ring-opening addition reaction between glycidyl azide polymer (GAP) and 2,2'-(2,2,3,3,4,4,5,5-octafluorohexane-1,6-diyl)bis(oxirane) (fluoride). By utilizing G-F-G vesicular self-assembly in a special solvent, the nano-sized vesicles firmly adhere to the surface of Al powder through the long-range attraction between the fluorine segments and Al. Meanwhile, the electrostatic repulsion between vesicles ensures an extremely thin coating (≈15 nm), maintaining the monolayer coating structure. Al@G-F-G(DMF) achieves good ignition, combustion, anti-agglomeration, and waterproof properties, which are superior among existing Al-based fuels. The derived Al-based fuel has excellent comprehensive properties, which can not only inspire the development of new-generation energetic materials but also provide a facile strategy for constructing fine surface nanostructures via ordered self-assembly in many other applications.
Affiliation(s)
- Ruibin Wang
- School of Materials Science and Engineering, Beijing Institute of Technology, No. 5 South Zhongguancun Street, Haidian, Beijing 100081, China
- Lichen Zhang
- School of Materials Science and Engineering, Beijing Institute of Technology, No. 5 South Zhongguancun Street, Haidian, Beijing 100081, China
- Xiaodong Li
- School of Materials Science and Engineering, Beijing Institute of Technology, No. 5 South Zhongguancun Street, Haidian, Beijing 100081, China
- Lixiang Zhu
- School of Materials Science and Engineering, Beijing Institute of Technology, No. 5 South Zhongguancun Street, Haidian, Beijing 100081, China
- Zilong Xiang
- School of Materials Science and Engineering, Beijing Institute of Technology, No. 5 South Zhongguancun Street, Haidian, Beijing 100081, China
- Jin Xu
- School of Materials Science and Engineering, Beijing Institute of Technology, No. 5 South Zhongguancun Street, Haidian, Beijing 100081, China
- Dichang Xue
- School of Materials Science and Engineering, Beijing Institute of Technology, No. 5 South Zhongguancun Street, Haidian, Beijing 100081, China
- Zitong Deng
- School of Materials Science and Engineering, Beijing Institute of Technology, No. 5 South Zhongguancun Street, Haidian, Beijing 100081, China
- Xing Su
- School of Materials Science and Engineering, Beijing Institute of Technology, No. 5 South Zhongguancun Street, Haidian, Beijing 100081, China
- Meishuai Zou
- School of Materials Science and Engineering, Beijing Institute of Technology, No. 5 South Zhongguancun Street, Haidian, Beijing 100081, China

4
Chen Y, Pios SV, Gelin MF, Chen L. Accelerating Molecular Vibrational Spectra Simulations with a Physically Informed Deep Learning Model. J Chem Theory Comput 2024; 20:4703-4710. [PMID: 38825857 DOI: 10.1021/acs.jctc.4c00173] [Indexed: 06/04/2024]
Abstract
In recent years, machine learning (ML) surrogate models have emerged as an indispensable tool to accelerate simulations of physical and chemical processes. However, there is still a lack of ML models that can accurately predict molecular vibrational spectra. Here, we present a highly efficient multitask ML surrogate model, termed Vibrational Spectra Neural Network (VSpecNN), to accurately calculate infrared (IR) and Raman spectra based on dipole moments and polarizabilities obtained on-the-fly via ML-enhanced molecular dynamics simulations. The methodology is applied to pyrazine, a prototypical polyatomic chromophore. The VSpecNN-predicted energies are well within chemical accuracy (1 kcal/mol), and the errors for VSpecNN-predicted forces are only half of those obtained from a popular high-performance ML model. Compared to the ab initio reference, the VSpecNN-predicted frequencies of IR and Raman spectra differ by less than 5.87 cm⁻¹, and the intensities of IR spectra and the depolarization ratios of Raman spectra are well reproduced. The VSpecNN model developed in this work highlights the importance of constructing highly accurate neural network potentials for predicting molecular vibrational spectra.
Affiliation(s)
- Maxim F Gelin
- School of Science, Hangzhou Dianzi University, Hangzhou 310018, China

5
Yang Y, Zhang S, Ranasinghe KD, Isayev O, Roitberg AE. Machine Learning of Reactive Potentials. Annu Rev Phys Chem 2024; 75:371-395. [PMID: 38941524 DOI: 10.1146/annurev-physchem-062123-024417] [Indexed: 06/30/2024]
Abstract
In the past two decades, machine learning potentials (MLPs) have driven significant developments in chemical, biological, and material sciences. The construction and training of MLPs enable fast and accurate simulations and analysis of thermodynamic and kinetic properties. This review focuses on the application of MLPs to reaction systems with consideration of bond breaking and formation. We review the development of MLP models, primarily with neural network and kernel-based algorithms, and recent applications of reactive MLPs (RMLPs) to systems at different scales. We show how RMLPs are constructed, how they speed up the calculation of reactive dynamics, and how they facilitate the study of reaction trajectories, reaction rates, free energy calculations, and many other calculations. Different data sampling strategies applied in building RMLPs are also discussed with a focus on how to collect structures for rare events and how to further improve their performance with active learning.
Affiliation(s)
- Yinuo Yang
- Department of Chemistry, University of Florida, Gainesville, Florida
- Shuhao Zhang
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Olexandr Isayev
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania
- Adrian E Roitberg
- Department of Chemistry, University of Florida, Gainesville, Florida

6
Grassano JS, Pickering I, Roitberg AE, González Lebrero MC, Estrin DA, Semelak JA. Assessment of Embedding Schemes in a Hybrid Machine Learning/Classical Potentials (ML/MM) Approach. J Chem Inf Model 2024; 64:4047-4058. [PMID: 38710065 DOI: 10.1021/acs.jcim.4c00478] [Indexed: 05/08/2024]
Abstract
Machine learning (ML) methods have reached high accuracy levels for the prediction of in vacuo molecular properties. However, the simulation of large systems solely through ML methods (such as those based on neural network potentials) is still a challenge. In this context, one of the most promising frameworks for integrating ML schemes in the simulation of complex molecular systems is the family of so-called ML/MM methods. These multiscale approaches combine ML methods with classical force fields (MM), in the same spirit as the successful hybrid quantum mechanics-molecular mechanics methods (QM/MM). The key issue for such ML/MM methods is an adequate description of the coupling between the region of the system described by ML and the region described at the MM level. In the context of QM/MM schemes, the main ingredient of the interaction is electrostatic, and the state of the art is the so-called electrostatic embedding. In this study, we analyze the quality of simpler mechanical embedding-based approaches, specifically focusing on their application within an ML/MM framework utilizing atomic partial charges derived in vacuo. Taking as reference electrostatic embedding calculations performed at a QM(DFT)/MM level, we explore different atomic charge schemes, as well as a polarization correction computed using atomic polarizabilities. Our benchmark data set comprises about 80k small organic structures from the ANI-1x and ANI-2x databases, solvated in water. The results suggest that the minimal basis iterative stockholder (MBIS) atomic charges yield the best agreement with the reference coupling energy. Remarkable enhancements are achieved by including a simple polarization correction.
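To make the mechanical-embedding idea concrete, the sketch below computes a point-charge Coulomb coupling between an ML region and MM charges, plus a simple non-iterative polarization correction of the form -1/2 α|E|², where E is the MM field at each ML atom. This is only an illustrative sketch, not the scheme evaluated in the paper: the function name, the data layout, the use of atomic units, and the specific form of the polarization term are assumptions.

```python
import math

def coupling_energy(ml_atoms, mm_charges):
    """Mechanical-embedding coupling energy (atomic units assumed).

    ml_atoms:   list of (position, charge, polarizability) for the ML region.
    mm_charges: list of (position, charge) for the MM environment.
    Returns (coulomb_energy, polarization_correction).
    """
    e_coul = 0.0
    e_pol = 0.0
    for pos_i, q_i, alpha_i in ml_atoms:
        field = [0.0, 0.0, 0.0]  # electric field of the MM charges at ML atom i
        for pos_j, q_j in mm_charges:
            d = [a - b for a, b in zip(pos_i, pos_j)]  # vector from MM site to ML atom
            r = math.sqrt(sum(c * c for c in d))
            e_coul += q_i * q_j / r
            for k in range(3):
                field[k] += q_j * d[k] / r**3
        # Assumed correction: energy of a linearly induced dipole, -1/2 * alpha * |E|^2.
        e_pol += -0.5 * alpha_i * sum(f * f for f in field)
    return e_coul, e_pol
```

The fixed in vacuo charges enter only the Coulomb sum, while the polarization term approximates the response of the ML region to the MM environment without any self-consistent iteration.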
Affiliation(s)
- Juan S Grassano
- Facultad de Ciencias Exactas y Naturales, Departamento de Química Inorgánica, Analítica y Química Física, Universidad de Buenos Aires, Intendente Güiraldes 2160, Buenos Aires C1428EHA, Argentina
- CONICET─Universidad de Buenos Aires, Instituto de Química-Física de los Materiales, Medio Ambiente y Energía (INQUIMAE), Ciudad Universitaria, Pabellón 2, Buenos Aires C1428EHA, Argentina
- Ignacio Pickering
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
- Adrian E Roitberg
- CONICET─Universidad de Buenos Aires, Instituto de Química-Física de los Materiales, Medio Ambiente y Energía (INQUIMAE), Ciudad Universitaria, Pabellón 2, Buenos Aires C1428EHA, Argentina
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
- Mariano C González Lebrero
- Facultad de Ciencias Exactas y Naturales, Departamento de Química Inorgánica, Analítica y Química Física, Universidad de Buenos Aires, Intendente Güiraldes 2160, Buenos Aires C1428EHA, Argentina
- CONICET─Universidad de Buenos Aires, Instituto de Química-Física de los Materiales, Medio Ambiente y Energía (INQUIMAE), Ciudad Universitaria, Pabellón 2, Buenos Aires C1428EHA, Argentina
- Dario A Estrin
- Facultad de Ciencias Exactas y Naturales, Departamento de Química Inorgánica, Analítica y Química Física, Universidad de Buenos Aires, Intendente Güiraldes 2160, Buenos Aires C1428EHA, Argentina
- CONICET─Universidad de Buenos Aires, Instituto de Química-Física de los Materiales, Medio Ambiente y Energía (INQUIMAE), Ciudad Universitaria, Pabellón 2, Buenos Aires C1428EHA, Argentina
- Jonathan A Semelak
- Facultad de Ciencias Exactas y Naturales, Departamento de Química Inorgánica, Analítica y Química Física, Universidad de Buenos Aires, Intendente Güiraldes 2160, Buenos Aires C1428EHA, Argentina
- CONICET─Universidad de Buenos Aires, Instituto de Química-Física de los Materiales, Medio Ambiente y Energía (INQUIMAE), Ciudad Universitaria, Pabellón 2, Buenos Aires C1428EHA, Argentina

7
Aldossary A, Campos-Gonzalez-Angulo JA, Pablo-García S, Leong SX, Rajaonson EM, Thiede L, Tom G, Wang A, Avagliano D, Aspuru-Guzik A. In Silico Chemical Experiments in the Age of AI: From Quantum Chemistry to Machine Learning and Back. Adv Mater 2024:e2402369. [PMID: 38794859 DOI: 10.1002/adma.202402369] [Received: 02/15/2024] [Revised: 04/28/2024] [Indexed: 05/26/2024]
Abstract
Computational chemistry is an indispensable tool for understanding molecules and predicting chemical properties. However, traditional computational methods face significant challenges due to the difficulty of solving the Schrödinger equation and the computational cost that increases with the size of the molecular system. In response, there has been a surge of interest in leveraging artificial intelligence (AI) and machine learning (ML) techniques for in silico experiments. Integrating AI and ML into computational chemistry increases the scalability and speed of the exploration of chemical space. However, challenges remain, particularly regarding the reproducibility and transferability of ML models. This review highlights the evolution of ML in learning from, complementing, or replacing traditional computational chemistry for energy and property predictions. Starting from models trained entirely on numerical data, a journey is set forth toward the ideal model incorporating or learning the physical laws of quantum mechanics. This paper also reviews existing computational methods and ML models and their intertwining, outlines a roadmap for future research, and identifies areas for improvement and innovation. Ultimately, the goal is to develop AI architectures capable of predicting accurate and transferable solutions to the Schrödinger equation, thereby revolutionizing in silico experiments within chemistry and materials science.
Affiliation(s)
- Abdulrahman Aldossary
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Sergio Pablo-García
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Shi Xuan Leong
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Ella Miray Rajaonson
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Luca Thiede
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Gary Tom
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Andrew Wang
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Davide Avagliano
- Chimie ParisTech, PSL University, CNRS, Institute of Chemistry for Life and Health Sciences (iCLeHS UMR 8060), Paris, F-75005, France
- Alán Aspuru-Guzik
- Department of Chemistry, University of Toronto, 80 St. George Street, Toronto, ON, M5S 3H6, Canada
- Department of Computer Science, University of Toronto, 40 St. George Street, Toronto, ON, M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, 661 University Ave. Suite 710, Toronto, ON, M5G 1M1, Canada
- Department of Materials Science & Engineering, University of Toronto, 184 College St., Toronto, ON, M5S 3E4, Canada
- Department of Chemical Engineering & Applied Chemistry, University of Toronto, 200 College St., Toronto, ON, M5S 3E5, Canada
- Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), 66118 University Ave., Toronto, M5G 1M1, Canada
- Acceleration Consortium, 80 St George St, Toronto, M5S 3H6, Canada

8
Duignan TT. The Potential of Neural Network Potentials. ACS Phys Chem Au 2024; 4:232-241. [PMID: 38800721 PMCID: PMC11117678 DOI: 10.1021/acsphyschemau.4c00004] [Received: 01/18/2024] [Revised: 03/04/2024] [Accepted: 03/05/2024] [Indexed: 05/29/2024]
Abstract
In the next half-century, physical chemistry will likely undergo a profound transformation, driven predominantly by the combination of recent advances in quantum chemistry and machine learning (ML). Specifically, equivariant neural network potentials (NNPs) are a breakthrough new tool that is already enabling us to simulate systems at the molecular scale with unprecedented accuracy and speed, relying on nothing but fundamental physical laws. The continued development of this approach will realize Paul Dirac's 80-year-old vision of using quantum mechanics to unify physics with chemistry and provide invaluable tools for understanding materials science, biology, earth sciences, and beyond. The era of highly accurate and efficient first-principles molecular simulations will provide a wealth of training data that can be used to build automated computational methodologies, using tools such as diffusion models, for the design and optimization of systems at the molecular scale. Large language models (LLMs) will also evolve into increasingly indispensable tools for literature review, coding, idea generation, and scientific writing.

9
Wang G, Wang C, Zhang X, Li Z, Zhou J, Sun Z. Machine learning interatomic potential: Bridge the gap between small-scale models and realistic device-scale simulations. iScience 2024; 27:109673. [PMID: 38646181 PMCID: PMC11033164 DOI: 10.1016/j.isci.2024.109673] [Indexed: 04/23/2024]
Abstract
Machine learning interatomic potentials (MLIPs) overcome the high computational cost of density-functional theory and the relatively low accuracy of classical large-scale molecular dynamics, facilitating more efficient and precise simulations in materials research and design. In this review, the current state of the four essential stages of MLIP development is discussed, including data generation methods, material structure descriptors, six unique machine learning algorithms, and available software. Furthermore, the applications of MLIPs in various fields are investigated, notably in phase-change memory materials, structure searching, material property prediction, and pre-trained universal models. Finally, future perspectives, consisting of standard datasets, transferability, generalization, and the trade-off between accuracy and complexity in MLIPs, are discussed.
Affiliation(s)
- Guanjie Wang
- School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- School of Integrated Circuit Science and Engineering, Beihang University, Beijing 100191, China
- Changrui Wang
- School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- Xuanguang Zhang
- School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- Zefeng Li
- School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- Jian Zhou
- School of Materials Science and Engineering, Beihang University, Beijing 100191, China
- Zhimei Sun
- School of Materials Science and Engineering, Beihang University, Beijing 100191, China

10
Zhang S, Makoś MZ, Jadrich RB, Kraka E, Barros K, Nebgen BT, Tretiak S, Isayev O, Lubbers N, Messerly RA, Smith JS. Exploring the frontiers of condensed-phase chemistry with a general reactive machine learning potential. Nat Chem 2024; 16:727-734. [PMID: 38454071 PMCID: PMC11087274 DOI: 10.1038/s41557-023-01427-3] [Received: 04/13/2023] [Accepted: 12/12/2023] [Indexed: 03/09/2024]
Abstract
Atomistic simulation has a broad range of applications from drug design to materials discovery. Machine learning interatomic potentials (MLIPs) have become an efficient alternative to computationally expensive ab initio simulations. For this reason, chemistry and materials science would greatly benefit from a general reactive MLIP, that is, an MLIP that is applicable to a broad range of reactive chemistry without the need for refitting. Here we develop a general reactive MLIP (ANI-1xnr) through automated sampling of condensed-phase reactions. ANI-1xnr is then applied to study five distinct systems: carbon solid-phase nucleation, graphene ring formation from acetylene, biofuel additives, combustion of methane and the spontaneous formation of glycine from early earth small molecules. In all studies, ANI-1xnr closely matches experiment (when available) and/or previous studies using traditional model chemistry methods. As such, ANI-1xnr proves to be a highly general reactive MLIP for C, H, N and O elements in the condensed phase, enabling high-throughput in silico reactive chemistry experimentation.
Affiliation(s)
- Shuhao Zhang
- Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Małgorzata Z Makoś
- Computational and Theoretical Chemistry Group, Department of Chemistry, Southern Methodist University, Dallas, TX, USA
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Ryan B Jadrich
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM, USA
- Elfi Kraka
- Computational and Theoretical Chemistry Group, Department of Chemistry, Southern Methodist University, Dallas, TX, USA
- Kipton Barros
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM, USA
- Benjamin T Nebgen
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Sergei Tretiak
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, NM, USA
- Olexandr Isayev
- Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Nicholas Lubbers
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Richard A Messerly
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- Justin S Smith
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM, USA
- NVIDIA Corp., Santa Clara, CA, USA

11
Chen M, Jiang X, Zhang L, Chen X, Wen Y, Gu Z, Li X, Zheng M. The emergence of machine learning force fields in drug design. Med Res Rev 2024; 44:1147-1182. [PMID: 38173298 DOI: 10.1002/med.22008] [Received: 08/19/2023] [Revised: 11/29/2023] [Accepted: 12/05/2023] [Indexed: 01/05/2024]
Abstract
In the field of molecular simulation for drug design, traditional molecular mechanics force fields and quantum chemical theories have been instrumental but limited in terms of scalability and computational efficiency. To overcome these limitations, machine learning force fields (MLFFs) have emerged as a powerful tool capable of balancing accuracy with efficiency. MLFFs rely on the relationship between molecular structures and potential energy, bypassing the need for a preconceived notion of interaction representations. Their accuracy depends on the machine learning models used and on the quality and volume of the training datasets. With recent advances in equivariant neural networks and high-quality datasets, MLFFs have significantly improved their performance. This review explores MLFFs, emphasizing their potential in drug design. It elucidates MLFF principles, provides development and validation guidelines, and highlights successful MLFF implementations. It also addresses potential challenges in developing and applying MLFFs. The review concludes by illuminating the path ahead for MLFFs, outlining the challenges to be overcome and the opportunities to be harnessed. This inspires researchers to embrace MLFFs in their investigations as a new tool to perform molecular simulations in drug design.
Affiliation(s)
- Mingan Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Physical Science and Technology, ShanghaiTech University, Shanghai, China
- Lingang Laboratory, Shanghai, China
- Xinyu Jiang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- Lehan Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- Xiaoxu Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, China
- Yiming Wen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, China
- Zhiyong Gu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, China
- Xutong Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- Mingyue Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- School of Pharmacy, University of Chinese Academy of Sciences, Beijing, China
- School of Pharmaceutical Science and Technology, Hangzhou Institute for Advanced Study, UCAS, Hangzhou, China
| |
|
12
|
Zhai Y, Rashmi R, Palos E, Paesani F. Many-body interactions and deep neural network potentials for water. J Chem Phys 2024; 160:144501. [PMID: 38587225 DOI: 10.1063/5.0203682] [Received: 02/15/2024] [Accepted: 03/23/2024] [Indexed: 04/09/2024] Open
Abstract
We present a detailed assessment of deep neural network potentials developed within the Deep Potential Molecular Dynamics (DeePMD) framework and trained on the MB-pol data-driven many-body potential energy function. Specific focus is directed at the ability of DeePMD-based potentials to correctly reproduce the accuracy of MB-pol across various water systems. Analyses of bulk and interfacial properties as well as many-body interactions characteristic of water elucidate inherent limitations in the transferability and predictive accuracy of DeePMD-based potentials. These limitations can be traced back to an incomplete implementation of the "nearsightedness of electronic matter" principle, which may be common throughout machine learning potentials that do not include a proper representation of self-consistently determined long-range electric fields. These findings provide further support for the "short-blanket dilemma" faced by DeePMD-based potentials, highlighting the challenges in achieving a balance between computational efficiency and a rigorous, physics-based representation of the properties of water. Finally, we believe that our study contributes to the ongoing discourse on the development and application of machine learning models in simulating water systems, offering insights that could guide future improvements in the field.
Affiliation(s)
- Yaoguang Zhai
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, USA
| | - Richa Rashmi
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
| | - Etienne Palos
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
| | - Francesco Paesani
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Materials Science and Engineering, University of California San Diego, La Jolla, California 92093, USA
- Halicioğlu Data Science Institute, University of California San Diego, La Jolla, California 92093, USA
- San Diego Supercomputer Center, University of California San Diego, La Jolla, California 92093, USA
| |
|
13
|
Metcalf DP, Glick ZL, Bortolato A, Jiang A, Cheney DL, Sherrill CD. Directional ΔG Neural Network (DrΔG-Net): A Modular Neural Network Approach to Binding Free Energy Prediction. J Chem Inf Model 2024; 64:1907-1918. [PMID: 38470995 DOI: 10.1021/acs.jcim.3c02054] [Indexed: 03/14/2024]
Abstract
The protein-ligand binding free energy is a central quantity in structure-based computational drug discovery efforts. Although popular alchemical methods provide sound statistical means of computing the binding free energy of a large breadth of systems, they are generally too costly to be applied at the same frequency as end point or ligand-based methods. By contrast, these data-driven approaches are typically fast enough to address thousands of systems but with reduced transferability to unseen systems. We introduce DrΔG-Net (or simply Dragnet), an equivariant graph neural network that can blend ligand-based and protein-ligand data-driven approaches. It is based on a 3D fingerprint representation of the ligand alone and in complex with the protein target. Dragnet is a global scoring function to predict the binding affinity of arbitrary protein-ligand complexes, but can be easily tuned via transfer learning to specific systems or end points, performing similarly to common 2D ligand-based approaches in these tasks. Dragnet is evaluated on a total of 28 validation proteins with a set of congeneric ligands derived from the Binding DB and one custom set extracted from the ChEMBL Database. In general, a handful of experimental binding affinities are sufficient to optimize the scoring function for a particular protein and ligand scaffold. When not available, predictions from physics-based methods such as absolute free energy perturbation can be used for the transfer learning tuning of Dragnet. Furthermore, we use our data to illustrate the present limitations of data-driven modeling of binding free energy predictions.
Affiliation(s)
- Derek P Metcalf
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, United States
| | - Zachary L Glick
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, United States
| | - Andrea Bortolato
- Molecular Structure and Design, Bristol-Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, United States
| | - Andy Jiang
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, United States
| | - Daniel L Cheney
- Molecular Structure and Design, Bristol-Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, United States
| | - C David Sherrill
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30332-0400, United States
| |
|
14
|
Song Z, Han J, Henkelman G, Li L. Charge-Optimized Electrostatic Interaction Atom-Centered Neural Network Algorithm. J Chem Theory Comput 2024; 20:2088-2097. [PMID: 38380601 DOI: 10.1021/acs.jctc.3c01254] [Indexed: 02/22/2024]
Abstract
Machine-learning algorithms have been proposed to capture electrostatic interactions by using effective partial charges. These algorithms often rely on a pretrained model for partial charge prediction using density functional theory-calculated partial charges as references, which introduces complexity to the force field model. The accuracy of the trained model also depends on the reliability of charge partition methods, which can be dependent on the specific system and methodology employed. In this study, we propose an atom-centered neural network (ANN) algorithm that eliminates the need for reference charges. Our algorithm requires only a single NN model for each element to obtain both atomic energy and charges. These atomic charges are then employed to compute electrostatic energies using the Ewald summation algorithm. Subsequently, the force field model is trained on total energy and forces, with the inclusion of electrostatic energy. To evaluate the performance of our algorithm, we conducted tests on three benchmark systems, including a Ge slab with an O adatom system, a TiO2 crystalline system, and a Pd-O nanoparticle system. Our results demonstrate reasonably accurate predictions of partial charges and electrostatic interactions. This algorithm provides a self-consistent charge prediction strategy and possibilities for robust and reliable modeling of electrostatic interactions in machine-learning potentials.
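The energy decomposition described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the paper evaluates the electrostatic term with Ewald summation for periodic systems, whereas the sketch swaps in a direct pairwise Coulomb sum for an isolated cluster. The function name `total_energy` and its inputs (per-atom energies and charges, as they might come from a per-element network) are hypothetical and not taken from the paper.

```python
import numpy as np

# e^2 / (4 * pi * eps0) expressed in eV * Angstrom
COULOMB_EV_ANGSTROM = 14.3996454


def total_energy(atomic_energies, charges, positions):
    """Return (total energy, electrostatic energy) in eV for an isolated cluster.

    atomic_energies: per-atom short-range energies in eV (hypothetical NN output)
    charges: per-atom partial charges in units of e (hypothetical NN output)
    positions: (n_atoms, 3) Cartesian coordinates in Angstrom

    Assumption: direct Coulomb sum over unique pairs, not the Ewald
    summation used in the paper for periodic systems.
    """
    e_atom = np.asarray(atomic_energies, dtype=float)
    q = np.asarray(charges, dtype=float)
    r = np.asarray(positions, dtype=float)

    # Pairwise distance matrix
    dist = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)

    # Sum q_i * q_j / r_ij over unique pairs i < j
    i, j = np.triu_indices(len(q), k=1)
    e_coul = COULOMB_EV_ANGSTROM * np.sum(q[i] * q[j] / dist[i, j])

    return e_atom.sum() + e_coul, e_coul
```

Because the charges enter the total energy directly, a model trained on total energies and forces constrains them without separate reference charges, which is the point the abstract makes.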
Affiliation(s)
- Zichen Song
- Shenzhen Key Laboratory of Micro/Nano-Porous Functional Materials (SKLPM), Department of Materials Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Department of Materials Science and Engineering, City University of Hong Kong, Kowloon, Hong Kong, China
| | - Jian Han
- Department of Materials Science and Engineering, City University of Hong Kong, Kowloon, Hong Kong, China
| | - Graeme Henkelman
- Department of Chemistry, the University of Texas at Austin, Austin, Texas 78712, United States
- Institute for Computational Engineering and Sciences, the University of Texas at Austin, Austin, Texas 78712, United States
| | - Lei Li
- Shenzhen Key Laboratory of Micro/Nano-Porous Functional Materials (SKLPM), Department of Materials Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
| |
|
15
|
Martí C, Devereux C, Najm HN, Zádor J. Evaluation of Rate Coefficients in the Gas Phase Using Machine-Learned Potentials. J Phys Chem A 2024. [PMID: 38427974 DOI: 10.1021/acs.jpca.3c07872] [Indexed: 03/03/2024]
Abstract
We assess the capability of machine-learned potentials to compute rate coefficients by training a neural network (NN) model and applying it to describe the chemical landscape on the C5H5 potential energy surface, which is relevant to molecular weight growth in combustion and interstellar media. We coupled the resulting NN with an automated kinetics workflow code, KinBot, to perform all necessary calculations to compute the rate coefficients. The NN is benchmarked exhaustively by evaluating its performance at the various stages of the kinetics calculations: from the electronic energy through the computation of zero point energy, barrier heights, entropic contributions, the portion of the PES explored, and finally the overall rate coefficients as formulated by transition state theory.
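As a rough illustration of the final stage named in this abstract, the sketch below evaluates a canonical transition state theory rate coefficient with the Eyring equation, k(T) = (k_B T / h) exp(-ΔG‡ / RT). The function name and any barrier values passed to it are hypothetical; nothing here is taken from the paper or from KinBot, and tunneling and symmetry corrections are ignored.

```python
import math

# Exact SI values
K_B = 1.380649e-23    # Boltzmann constant, J/K
H = 6.62607015e-34    # Planck constant, J*s
R = 8.314462618       # molar gas constant, J/(mol*K)


def eyring_rate(delta_g_kj_mol: float, temperature: float) -> float:
    """Unimolecular TST rate coefficient in 1/s.

    delta_g_kj_mol: free energy of activation in kJ/mol (hypothetical input,
        for example a barrier computed on a machine-learned potential)
    temperature: temperature in K
    """
    if temperature <= 0.0:
        raise ValueError("temperature must be positive")
    prefactor = K_B * temperature / H
    return prefactor * math.exp(-delta_g_kj_mol * 1.0e3 / (R * temperature))
```

The exponential dependence on the barrier is why the abstract benchmarks the network at each stage: small errors in barrier heights or zero point energies translate into large errors in the rate coefficient.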
Affiliation(s)
- Carles Martí
- Combustion Research Facility, Sandia National Laboratories, Livermore, California 94551, United States
| | - Christian Devereux
- Combustion Research Facility, Sandia National Laboratories, Livermore, California 94551, United States
| | - Habib N Najm
- Combustion Research Facility, Sandia National Laboratories, Livermore, California 94551, United States
| | - Judit Zádor
- Combustion Research Facility, Sandia National Laboratories, Livermore, California 94551, United States
| |
|
16
|
Kaufman B, Williams EC, Underkoffler C, Pederson R, Mardirossian N, Watson I, Parkhill J. COATI: Multimodal Contrastive Pretraining for Representing and Traversing Chemical Space. J Chem Inf Model 2024; 64:1145-1157. [PMID: 38316665 DOI: 10.1021/acs.jcim.3c01753] [Indexed: 02/07/2024]
Abstract
Creating a successful small molecule drug is a challenging multiparameter optimization problem in an effectively infinite space of possible molecules. Generative models have emerged as powerful tools for traversing data manifolds composed of images, sounds, and text and offer an opportunity to dramatically improve the drug discovery and design process. To create generative optimization methods that are more useful than brute-force molecular generation and filtering via virtual screening, we propose that four integrated features are necessary: large, quantitative data sets of molecular structure and activity, an invertible vector representation of realistic accessible molecules, smooth and differentiable regressors that quantify uncertainty, and algorithms to simultaneously optimize properties of interest. Over the course of 12 months, Terray Therapeutics has collected a data set of 2 billion quantitative binding measurements of small molecules to therapeutic targets, which directly motivates multiparameter generative optimization of molecules conditioned on these data. To this end, we present contrastive optimization for accelerated therapeutic inference (COATI), a pretrained, multimodal encoder-decoder model of druglike chemical space. COATI is constructed without any human biasing of features, using contrastive learning from text and 3D representations of molecules to allow for downstream use with structural models. We demonstrate that COATI possesses many of the desired properties of universal molecular embedding: fixed-dimension, invertibility, autoencoding, accurate regression, and low computation cost. Finally, we present a novel metadynamics algorithm for generative optimization using a small subset of our proprietary data collected for a model protein, carbonic anhydrase, designing molecules that satisfy the multiparameter optimization task of potency, solubility, and drug likeness. This work sets the stage for fully integrated generative molecular design and optimization for small molecules.
Affiliation(s)
- Benjamin Kaufman
- Terray Therapeutics, Inc., 800 Royal Oaks Dr, Monrovia, California 91016, United States
| | - Edward C Williams
- Terray Therapeutics, Inc., 800 Royal Oaks Dr, Monrovia, California 91016, United States
| | - Carl Underkoffler
- Terray Therapeutics, Inc., 800 Royal Oaks Dr, Monrovia, California 91016, United States
| | - Ryan Pederson
- Terray Therapeutics, Inc., 800 Royal Oaks Dr, Monrovia, California 91016, United States
| | - Narbe Mardirossian
- Terray Therapeutics, Inc., 800 Royal Oaks Dr, Monrovia, California 91016, United States
| | - Ian Watson
- Terray Therapeutics, Inc., 800 Royal Oaks Dr, Monrovia, California 91016, United States
| | - John Parkhill
- Terray Therapeutics, Inc., 800 Royal Oaks Dr, Monrovia, California 91016, United States
| |
|
17
|
Li R, Zhou C, Singh A, Pei Y, Henkelman G, Li L. Local-environment-guided selection of atomic structures for the development of machine-learning potentials. J Chem Phys 2024; 160:074109. [PMID: 38380745 DOI: 10.1063/5.0187892] [Received: 11/17/2023] [Accepted: 01/26/2024] [Indexed: 02/22/2024] Open
Abstract
Machine learning potentials (MLPs) have attracted significant attention in computational chemistry and materials science due to their high accuracy and computational efficiency. The proper selection of atomic structures is crucial for developing reliable MLPs. Insufficient or redundant atomic structures can impede the training process and potentially result in a poor quality MLP. Here, we propose a local-environment-guided screening algorithm for efficient dataset selection in MLP development. The algorithm utilizes a local environment bank to store unique local environments of atoms. The dissimilarity between a particular local environment and those stored in the bank is evaluated using the Euclidean distance. A new structure is selected only if its local environment is significantly different from those already present in the bank. Consequently, the bank is then updated with all the new local environments found in the selected structure. To demonstrate the effectiveness of our algorithm, we applied it to select structures for a Ge system and a Pd13H2 particle system. The algorithm reduced the training data size by around 80% for both without compromising the performance of the MLP models. We verified that the results were independent of the selection and ordering of the initial structures. We also compared the performance of our method with the farthest point sampling algorithm, and the results show that our algorithm is superior in both robustness and computational efficiency. Furthermore, the generated local environment bank can be continuously updated and can potentially serve as a growing database of feature local environments, aiding in efficient dataset maintenance for constructing accurate MLPs.
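The selection loop described in this abstract can be sketched as below. It is a minimal illustration, not the authors' implementation: the per-atom descriptor vectors, the function name `select_structures`, and the single distance `threshold` are all assumptions made for the example. A structure is kept when at least one of its local environments lies farther than the threshold (in Euclidean distance) from every environment already in the bank, and the bank then absorbs that structure's new environments.

```python
import numpy as np


def select_structures(structures, threshold):
    """Greedy local-environment-guided selection (illustrative sketch).

    structures: list of (n_atoms, n_features) arrays, one row per atom,
        holding a hypothetical local-environment descriptor
    threshold: Euclidean distance above which an environment counts as new

    Returns (indices of selected structures, final bank array).
    """
    if not structures:
        return [], np.empty((0, 0))

    n_features = np.asarray(structures[0]).shape[1]
    bank = np.empty((0, n_features))
    selected = []

    for idx, envs in enumerate(structures):
        envs = np.asarray(envs, dtype=float)
        if bank.shape[0] == 0:
            # Empty bank: every environment is new
            novel = np.ones(len(envs), dtype=bool)
        else:
            # Distance from each environment to its nearest bank entry
            dists = np.linalg.norm(envs[:, None, :] - bank[None, :, :], axis=-1)
            novel = dists.min(axis=1) > threshold

        if novel.any():
            selected.append(idx)
            # Store only the environments not already represented
            bank = np.vstack([bank, envs[novel]])

    return selected, bank
```

Unlike farthest point sampling, this loop never needs the full pairwise distance matrix over all candidates; each structure is compared only against the current bank, which also makes it easy to keep extending the bank as new data arrive.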
Affiliation(s)
- Renzhe Li
- Shenzhen Key Laboratory of Micro/Nano-Porous Functional Materials (SKLPM), Department of Materials Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
- College of Chemistry, Xiangtan University, Xiangtan 411105, Hunan Province, People's Republic of China
| | - Chuan Zhou
- Shenzhen Key Laboratory of Micro/Nano-Porous Functional Materials (SKLPM), Department of Materials Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
| | - Akksay Singh
- Shenzhen Key Laboratory of Micro/Nano-Porous Functional Materials (SKLPM), Department of Materials Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
- Department of Chemistry, The University of Texas at Austin, Austin, Texas 78712, USA
- Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, Texas 78712, USA
| | - Yong Pei
- College of Chemistry, Xiangtan University, Xiangtan 411105, Hunan Province, People's Republic of China
| | - Graeme Henkelman
- Department of Chemistry, The University of Texas at Austin, Austin, Texas 78712, USA
- Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, Texas 78712, USA
| | - Lei Li
- Shenzhen Key Laboratory of Micro/Nano-Porous Functional Materials (SKLPM), Department of Materials Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, People's Republic of China
| |
|
18
|
Xi B, Chan MK, Bao K, Zhao W, Chan HM, Chen H, Zhu J. Parameter-Free and Electron Counting Satisfied Material Representation for Machine Learning Potential Energy and Force Fields. J Phys Chem Lett 2024; 15:1636-1643. [PMID: 38306617 PMCID: PMC10875669 DOI: 10.1021/acs.jpclett.3c03250] [Received: 11/18/2023] [Revised: 01/28/2024] [Accepted: 01/29/2024] [Indexed: 02/04/2024]
Abstract
We proposed a parameter-free volume element representation that satisfies the electron counting model and obtains accurate machine learning potential energy and direct force fitting of randomly perturbed hexagonal BN. Our method preserves permutational, translational, and rotational invariance and can be extended to three-dimensional systems, verified by a system of bulk Si. As a result, we obtained 0.57 meV/atom potential energy root mean squared error (RMSE) and 59 meV/Å force RMSE for perturbed bulk BN systems and 0.43 meV/atom potential energy RMSE and 36 meV/Å force RMSE for perturbed Si systems. In addition, an unbiased perturbation-based data set construction scheme is introduced and a continuous population distribution is obtained with a training data set of 4500, which is about 1 order of magnitude smaller than standard methods based on first-principles molecular dynamics simulations and saves a large amount of computing resources. General validity of our model is verified by structure optimization, molecular dynamics simulations, and extrapolations.
Affiliation(s)
- Bin Xi
- Department of Physics, The Chinese University of Hong Kong, Shatin, New Territory, Hong Kong SAR 999077, P.R. China
| | - Man Kit Chan
- Department of Physics, The Chinese University of Hong Kong, Shatin, New Territory, Hong Kong SAR 999077, P.R. China
| | - Kejie Bao
- Department of Physics, The Chinese University of Hong Kong, Shatin, New Territory, Hong Kong SAR 999077, P.R. China
| | - Wenjing Zhao
- Department of Physics, The Chinese University of Hong Kong, Shatin, New Territory, Hong Kong SAR 999077, P.R. China
| | - Ho Ming Chan
- Department of Physics, The Chinese University of Hong Kong, Shatin, New Territory, Hong Kong SAR 999077, P.R. China
| | - Hang Chen
- Department of Physics, The Chinese University of Hong Kong, Shatin, New Territory, Hong Kong SAR 999077, P.R. China
| | - Junyi Zhu
- Department of Physics, The Chinese University of Hong Kong, Shatin, New Territory, Hong Kong SAR 999077, P.R. China
| |
|
19
|
Matin S, Allen AEA, Smith J, Lubbers N, Jadrich RB, Messerly R, Nebgen B, Li YW, Tretiak S, Barros K. Machine Learning Potentials with the Iterative Boltzmann Inversion: Training to Experiment. J Chem Theory Comput 2024. [PMID: 38307009 DOI: 10.1021/acs.jctc.3c01051] [Indexed: 02/04/2024]
Abstract
Methodologies for training machine learning potentials (MLPs) with quantum-mechanical simulation data have recently seen tremendous progress. Experimental data have a very different character than simulated data, and most MLP training procedures cannot be easily adapted to incorporate both types of data into the training process. We investigate a training procedure based on iterative Boltzmann inversion that produces a pair potential correction to an existing MLP using equilibrium radial distribution function data. By applying these corrections to an MLP for pure aluminum based on density functional theory, we observe that the resulting model largely addresses previous overstructuring in the melt phase. Interestingly, the corrected MLP also exhibits improved performance in predicting experimental diffusion constants, which are not included in the training procedure. The presented method does not require autodifferentiating through a molecular dynamics solver and does not make assumptions about the MLP architecture. Our results suggest a practical framework for incorporating experimental data into machine learning models to improve the accuracy of molecular dynamics simulations.
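One step of the textbook iterative Boltzmann inversion update can be sketched as follows: the pair correction is shifted by k_B T ln(g_sim(r) / g_target(r)) on each radial bin. This is a generic form of the technique, written under assumptions; the function name, the `mixing` damping factor, and the `eps` cutoff for empty bins are hypothetical choices and are not taken from the paper.

```python
import numpy as np

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K


def ibi_update(correction, g_sim, g_target, temperature, mixing=1.0, eps=1e-8):
    """Return the updated pair-potential correction (eV) on a radial grid.

    correction: current pair correction on the grid, in eV
    g_sim: radial distribution function from the current model
    g_target: reference (for example experimental) radial distribution function
    temperature: temperature in K
    mixing: damping factor for the update (assumed, commonly <= 1)
    eps: bins where either RDF is below this value are left unchanged
    """
    correction = np.asarray(correction, dtype=float)
    g_sim = np.asarray(g_sim, dtype=float)
    g_target = np.asarray(g_target, dtype=float)

    # Skip bins where the logarithm is undefined (for example inside the core)
    valid = (g_sim > eps) & (g_target > eps)
    delta = np.zeros_like(correction)
    delta[valid] = K_B_EV * temperature * np.log(g_sim[valid] / g_target[valid])

    return correction + mixing * delta
```

Where the simulated RDF exceeds the target (overstructuring), the update is positive and makes that separation less favorable. Each iteration needs only a new simulation and its RDF, so no differentiation through the molecular dynamics solver is involved, consistent with the abstract.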
Affiliation(s)
- Sakib Matin
- Department of Physics, Boston University, Boston, Massachusetts 02215, United States
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
| | - Alice E A Allen
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
| | - Justin Smith
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
- NVIDIA Corp., Santa Clara, California 95051, United States
| | - Nicholas Lubbers
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Ryan B Jadrich
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
| | - Richard Messerly
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
| | - Benjamin Nebgen
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
| | - Ying Wai Li
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, United States
| | - Sergei Tretiak
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
- Center for Integrated Nanotechnologies, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
| | - Kipton Barros
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87546, United States
| |
|
20
|
Li K, Tran NV, Pan Y, Wang S, Jin Z, Chen G, Li S, Zheng J, Loh XJ, Li Z. Next-Generation Vitrimers Design through Theoretical Understanding and Computational Simulations. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2024; 11:e2302816. [PMID: 38058273 PMCID: PMC10837359 DOI: 10.1002/advs.202302816] [Received: 05/04/2023] [Revised: 09/03/2023] [Indexed: 12/08/2023]
Abstract
Vitrimers are an innovative class of polymers that boast a remarkable fusion of mechanical and dynamic features, complemented by the added benefit of end-of-life recyclability. This extraordinary blend of properties makes them highly attractive for a variety of applications, such as the automotive sector, soft robotics, and the aerospace industry. At their core, vitrimer materials consist of crosslinked covalent networks that have the ability to dynamically reorganize in response to external factors, including temperature changes, pressure variations, or shifts in pH levels. In this review, the aim is to delve into the latest advancements in the theoretical understanding and computational design of vitrimers. The review begins by offering an overview of the fundamental principles that underlie the behavior of these materials, encompassing their structures, dynamic behavior, and reaction mechanisms. Subsequently, recent progress in the computational design of vitrimers is explored, with a focus on the employment of molecular dynamics (MD)/Monte Carlo (MC) simulations and density functional theory (DFT) calculations. Last, the existing challenges and prospective directions for this field are critically analyzed, emphasizing the necessity for additional theoretical and computational advancements, coupled with experimental validation.
Affiliation(s)
- Ke Li
- Institute of Materials Research and Engineering (IMRE), Agency for Science, Technology and Research (A*STAR), 2 Fusionopolis Way, Innovis #08-03, Singapore, 138634, Republic of Singapore
| | - Nam Van Tran
- School of Materials Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore, 639798, Singapore
| | - Yuqing Pan
- School of Materials Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore, 639798, Singapore
| | - Sheng Wang
- Institute of Sustainability for Chemicals, Energy and Environment (ISCE2), Agency for Science, Technology and Research (A*STAR), Singapore, 138634, Singapore
| | - Zhicheng Jin
- Laboratory for Biomaterials and Drug Delivery, The Department of Anesthesiology, Critical Care and Pain Medicine, Boston Children's Hospital, Harvard Medical School, Boston, MA, 02115, USA
| | - Guoliang Chen
- School of Materials Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore, 639798, Singapore
| | - Shuzhou Li
- School of Materials Science and Engineering, Nanyang Technological University, 50 Nanyang Avenue, Singapore, 639798, Singapore
| | - Jianwei Zheng
- Institute of High Performance Computing (IHPC), Agency for Science, Technology and Research (A*STAR), 1 Fusionopolis Way, #16-16 Connexis, Singapore, 138632, Republic of Singapore
| | - Xian Jun Loh
- Institute of Materials Research and Engineering (IMRE), Agency for Science, Technology and Research (A*STAR), 2 Fusionopolis Way, Innovis #08-03, Singapore, 138634, Republic of Singapore
- Institute of Sustainability for Chemicals, Energy and Environment (ISCE2), Agency for Science, Technology and Research (A*STAR), Singapore, 138634, Singapore
| | - Zibiao Li
- Institute of Materials Research and Engineering (IMRE), Agency for Science, Technology and Research (A*STAR), 2 Fusionopolis Way, Innovis #08-03, Singapore, 138634, Republic of Singapore
- Institute of Sustainability for Chemicals, Energy and Environment (ISCE2), Agency for Science, Technology and Research (A*STAR), Singapore, 138634, Singapore
- Department of Materials Science and Engineering, National University of Singapore, Singapore, 117576, Singapore
| |
|
21
|
Ding Y, Huang J. Implementation and Validation of an OpenMM Plugin for the Deep Potential Representation of Potential Energy. Int J Mol Sci 2024; 25:1448. [PMID: 38338727 PMCID: PMC10855459 DOI: 10.3390/ijms25031448] [Received: 12/08/2023] [Revised: 01/08/2024] [Accepted: 01/11/2024] [Indexed: 02/12/2024] Open
Abstract
Machine learning potentials, particularly the deep potential (DP) model, have revolutionized molecular dynamics (MD) simulations, striking a balance between accuracy and computational efficiency. To facilitate the DP model's integration with the popular MD engine OpenMM, we have developed a versatile OpenMM plugin. This plugin supports a range of applications, from conventional MD simulations to alchemical free energy calculations and hybrid DP/MM simulations. Our extensive validation tests encompassed energy conservation in microcanonical ensemble simulations, fidelity in canonical ensemble generation, and the evaluation of the structural, transport, and thermodynamic properties of bulk water. The introduction of this plugin is expected to significantly expand the application scope of DP models within the MD simulation community, representing a major advancement in the field.
Affiliation(s)
- Ye Ding
- College of Life Sciences, Zhejiang University, Hangzhou 310027, China
- School of Life Sciences, Westlake University, Hangzhou 310024, China
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China
| | - Jing Huang
- School of Life Sciences, Westlake University, Hangzhou 310024, China
- Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China
| |
|
22
|
Chen J, Yu K. PhyNEO: A Neural-Network-Enhanced Physics-Driven Force Field Development Workflow for Bulk Organic Molecule and Polymer Simulations. J Chem Theory Comput 2024; 20:253-265. [PMID: 38118076 DOI: 10.1021/acs.jctc.3c01045] [Indexed: 12/22/2023]
Abstract
An accurate, generalizable, and transferable force field plays a crucial role in the molecular dynamics simulations of organic polymers and biomolecules. Conventional empirical force fields often fail to capture precise intermolecular interactions because they neglect important physics, such as polarization, charge penetration, and many-body dispersion. Moreover, the parameterization of these force fields relies heavily on top-down fittings, limiting their transferability to new systems where the experimental data are often unavailable. To address these challenges, we introduce a general and fully ab initio force field construction strategy, named PhyNEO. It features a hybrid approach that combines both the physics-driven and the data-driven methods and is able to generate a bulk potential with chemical accuracy using only quantum chemistry data of very small clusters. Careful separations of long-/short-range interactions and nonbonding/bonding interactions are the key to the success of PhyNEO. By such a strategy, we mitigate the limitations of pure data-driven methods in long-range interactions, thus largely increasing the data efficiency and the scalability of machine learning models. The new approach is thoroughly tested on poly(ethylene oxide) and polyethylene glycol systems, giving superior accuracies in both microscopic and bulk properties compared to conventional force fields. This work thus offers a promising framework for the development of advanced force fields in a wide range of organic molecular systems.
Affiliation(s)
- Junmin Chen
- Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, Shenzhen, Guangdong 518055, P. R. China
| | - Kuang Yu
- Tsinghua-Berkeley Shenzhen Institute, Tsinghua University, Shenzhen, Guangdong 518055, P. R. China
- Institute of Materials Research (iMR), Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, Guangdong 518055, P. R. China
23
Shayestehpour O, Zahn S. Efficient Molecular Dynamics Simulations of Deep Eutectic Solvents with First-Principles Accuracy Using Machine Learning Interatomic Potentials. J Chem Theory Comput 2023; 19:8732-8742. [PMID: 37972596 PMCID: PMC10720642 DOI: 10.1021/acs.jctc.3c00944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/27/2023] [Revised: 11/03/2023] [Accepted: 11/03/2023] [Indexed: 11/19/2023]
Abstract
In recent years, deep eutectic solvents emerged as highly tunable and ecofriendly alternatives to common organic solvents and liquid electrolytes. In the present work, the ability of machine learning (ML) interatomic potentials for molecular dynamics (MD) simulations of these liquids is explored, showcasing a trained neural network potential for a 1:2 ratio mixture of choline chloride and urea (reline). Using the ML potentials trained on density functional theory data, MD simulations for large systems of thousands of atoms and nanosecond-long time scales are feasible at a fraction of the computational cost of the target first-principles simulations. The obtained structural and dynamical properties of reline from MD simulations using our machine learning models are in good agreement with the first-principles MD simulations and experimental results. Running a single MD simulation is highlighted as a general shortcoming of typical first-principles studies if the dynamic properties are investigated. Furthermore, velocity cross-correlation functions are employed to study the collective dynamics of the molecular components in reline.
Affiliation(s)
| | - Stefan Zahn
- Leibniz Institute of Surface Engineering, 04318 Leipzig, Germany
24
Fonseca G, Poltavsky I, Tkatchenko A. Force Field Analysis Software and Tools (FFAST): Assessing Machine Learning Force Fields under the Microscope. J Chem Theory Comput 2023; 19:8706-8717. [PMID: 38011895 PMCID: PMC10720330 DOI: 10.1021/acs.jctc.3c00985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2023] [Revised: 11/06/2023] [Accepted: 11/07/2023] [Indexed: 11/29/2023]
Abstract
As the sophistication of machine learning force fields (MLFF) increases to match the complexity of extended molecules and materials, so does the need for tools to properly analyze and assess the practical performance of MLFFs. To go beyond average error metrics and into a complete picture of a model's applicability and limitations, we developed FFAST (force field analysis software and tools): a cross-platform software package designed to gain detailed insights into a model's performance and limitations, complete with an easy-to-use graphical user interface. The software allows the user to gauge the performance of any molecular force field, such as popular state-of-the-art MLFF models, on various popular data set types, providing general prediction error overviews, outlier detection mechanisms, atom-projected errors, and more. It has a 3D visualizer to find and picture problematic configurations, atoms, or clusters in a large data set. In this paper, the example of the MACE and NequIP models is used on two data sets of interest [stachyose and docosahexaenoic acid (DHA)] to illustrate the use cases of the software. With this, it was found that carbons and oxygens involved in or near glycosidic bonds inside the stachyose molecule present increased prediction errors. In addition, prediction errors on DHA rise as the molecule folds, especially for the carboxylic group at the edge of the molecule. We emphasize the need for a systematic assessment of MLFF models for ensuring their successful application to the study of dynamics of molecules and materials.
Affiliation(s)
- Gregory Fonseca
- Department of Physics and Materials Science, University of Luxembourg, Luxembourg City L-1511, Luxembourg
| | - Igor Poltavsky
- Department of Physics and Materials Science, University of Luxembourg, Luxembourg City L-1511, Luxembourg
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, Luxembourg City L-1511, Luxembourg
25
Chen JA, Chao SD. Intermolecular Non-Bonded Interactions from Machine Learning Datasets. Molecules 2023; 28:7900. [PMID: 38067629 PMCID: PMC10707888 DOI: 10.3390/molecules28237900] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/31/2023] [Revised: 11/22/2023] [Accepted: 11/29/2023] [Indexed: 04/04/2024] Open
Abstract
Accurate determination of intermolecular non-covalent-bonded or non-bonded interactions is the key to potentially useful molecular dynamics simulations of polymer systems. However, it is challenging to balance both the accuracy and computational cost in force field modelling. One of the main difficulties is properly representing the calculated energy data as a continuous force function. In this paper, we employ well-developed machine learning techniques to construct a general purpose intermolecular non-bonded interaction force field for organic polymers. The original ab initio dataset SOFG-31 was calculated by us and has been well documented, and here we use it as our training set. The CLIFF kernel type machine learning scheme is used for predicting the interaction energies of heterodimers selected from the SOFG-31 dataset. Our test results show that the overall errors are well below the chemical accuracy of about 1 kcal/mol, thus demonstrating the promising feasibility of machine learning techniques in force field modelling.
Affiliation(s)
- Jia-An Chen
- Institute of Applied Mechanics, National Taiwan University, Taipei 106, Taiwan
| | - Sheng D. Chao
- Institute of Applied Mechanics, National Taiwan University, Taipei 106, Taiwan
- Center for Quantum Science and Engineering, National Taiwan University, Taipei 106, Taiwan
26
Zhao Q, Anstine DM, Isayev O, Savoie BM. Δ2 machine learning for reaction property prediction. Chem Sci 2023; 14:13392-13401. [PMID: 38033903 PMCID: PMC10686042 DOI: 10.1039/d3sc02408c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/10/2023] [Accepted: 07/11/2023] [Indexed: 12/02/2023] Open
Abstract
The emergence of Δ-learning models, whereby machine learning (ML) is used to predict a correction to a low-level energy calculation, provides a versatile route to accelerate high-level energy evaluations at a given geometry. However, Δ-learning models are inapplicable to reaction properties like heats of reaction and activation energies that require both a high-level geometry and energy evaluation. Here, a Δ2-learning model is introduced that can predict high-level activation energies based on low-level critical-point geometries. The Δ2 model uses an atom-wise featurization typical of contemporary ML interatomic potentials (MLIPs) and is trained on a dataset of ∼167 000 reactions, using the GFN2-xTB energy and critical-point geometry as a low-level input and the B3LYP-D3/TZVP energy calculated at the B3LYP-D3/TZVP critical point as a high-level target. The excellent performance of the Δ2 model on unseen reactions demonstrates the surprising ease with which the model implicitly learns the geometric deviations between the low-level and high-level geometries that condition the activation energy prediction. The transferability of the Δ2 model is validated on several external testing sets where it shows near chemical accuracy, illustrating the benefits of combining ML models with readily available physical-based information from semi-empirical quantum chemistry calculations. Fine-tuning of the Δ2 model on a small number of Gaussian-4 calculations produced a 35% accuracy improvement over DFT activation energy predictions while retaining xTB-level cost. The Δ2 model approach proves to be an efficient strategy for accelerating chemical reaction characterization with minimal sacrifice in prediction accuracy.
Affiliation(s)
- Qiyuan Zhao
- Davidson School of Chemical Engineering, Purdue University West Lafayette IN 47906 USA
| | - Dylan M Anstine
- Department of Chemistry, Carnegie Mellon University Pittsburgh PA 15213 USA
| | - Olexandr Isayev
- Department of Chemistry, Carnegie Mellon University Pittsburgh PA 15213 USA
| | - Brett M Savoie
- Davidson School of Chemical Engineering, Purdue University West Lafayette IN 47906 USA
27
Xia J, Zhang Y, Jiang B. Accuracy Assessment of Atomistic Neural Network Potentials: The Impact of Cutoff Radius and Message Passing. J Phys Chem A 2023; 127:9874-9883. [PMID: 37943102 DOI: 10.1021/acs.jpca.3c06024] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 11/10/2023]
Abstract
Atomistic neural network potentials have achieved great success in accelerating atomistic simulations in complicated systems in recent years. They are typically based on the atomic decomposition of total properties, truncating the interatomic correlations to a local environment within a given cutoff radius. A more recently developed message passing (MP) neural network framework can, in principle, incorporate nonlocal effects through iteratively correlating some atoms outside the cutoff sphere with atoms inside, a process referred to as MP. However, how the model accuracy depends on the cutoff radius and the MP process has rarely been discussed. In this work, we investigate this dependence using a recursively embedded atom neural network method that possesses both local and MP features, in two representative systems: liquid H2O and solid Al2O3. We focus on how these settings influence predictions for structural and vibrational properties, namely, radial distribution functions (RDFs) and vibrational density of states (VDOSs). We find that while MP lowers test errors of energy and forces in general, it may not improve the prediction for RDFs and/or VDOSs if direct interatomic correlations in the local environment are insufficiently described. A cutoff radius exceeding the first neighbor shell is necessary, beyond which involving MP quickly enhances the model accuracy until convergence. This is a potentially more efficient way to increase the model accuracy than directly increasing the cutoff radius, especially with more memory savings in the GPU implementation. Our findings also suggest that using the mean test error as the measure of the model accuracy alone is inadequate.
Affiliation(s)
- Junfan Xia
- Key Laboratory of Precision and Intelligent Chemistry, Department of Chemical Physics, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Yaolong Zhang
- École Polytechnique Fédérale de Lausanne, 1015 Lausanne, Switzerland
| | - Bin Jiang
- Key Laboratory of Precision and Intelligent Chemistry, Department of Chemical Physics, University of Science and Technology of China, Hefei, Anhui 230026, China
28
Plé T, Lagardère L, Piquemal JP. Force-field-enhanced neural network interactions: from local equivariant embedding to atom-in-molecule properties and long-range effects. Chem Sci 2023; 14:12554-12569. [PMID: 38020379 PMCID: PMC10646944 DOI: 10.1039/d3sc02581k] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 05/22/2023] [Accepted: 10/03/2023] [Indexed: 12/01/2023] Open
Abstract
We introduce FENNIX (Force-Field-Enhanced Neural Network InteraXions), a hybrid approach between machine-learning and force-fields. We leverage state-of-the-art equivariant neural networks to predict local energy contributions and multiple atom-in-molecule properties that are then used as geometry-dependent parameters for physically-motivated energy terms which account for long-range electrostatics and dispersion. Using high-accuracy ab initio data (small organic molecules/dimers), we trained a first version of the model. Exhibiting accurate gas-phase energy predictions, FENNIX is transferable to the condensed phase. It is able to produce stable Molecular Dynamics simulations, including nuclear quantum effects, for water predicting accurate liquid properties. The extrapolating power of the hybrid physically-driven machine learning FENNIX approach is exemplified by computing: (i) the solvated alanine dipeptide free energy landscape; (ii) the reactive dissociation of small molecules.
Affiliation(s)
- Thomas Plé
- Sorbonne Université, LCT, UMR 7616 CNRS F-75005 Paris France thomas.ple@sorbonne-université louis.lagardere@sorbonne-université jean-philip.piquemal@sorbonne-université
| | - Louis Lagardère
- Sorbonne Université, LCT, UMR 7616 CNRS F-75005 Paris France thomas.ple@sorbonne-université louis.lagardere@sorbonne-université jean-philip.piquemal@sorbonne-université
| | - Jean-Philip Piquemal
- Sorbonne Université, LCT, UMR 7616 CNRS F-75005 Paris France thomas.ple@sorbonne-université louis.lagardere@sorbonne-université jean-philip.piquemal@sorbonne-université
29
Illarionov A, Sakipov S, Pereyaslavets L, Kurnikov IV, Kamath G, Butin O, Voronina E, Ivahnenko I, Leontyev I, Nawrocki G, Darkhovskiy M, Olevanov M, Cherniavskyi YK, Lock C, Greenslade S, Sankaranarayanan SKRS, Kurnikova MG, Potoff J, Kornberg RD, Levitt M, Fain B. Combining Force Fields and Neural Networks for an Accurate Representation of Chemically Diverse Molecular Interactions. J Am Chem Soc 2023; 145:23620-23629. [PMID: 37856313 PMCID: PMC10623557 DOI: 10.1021/jacs.3c07628] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 07/17/2023] [Indexed: 10/21/2023]
Abstract
A key goal of molecular modeling is the accurate reproduction of the true quantum mechanical potential energy of arbitrary molecular ensembles with a tractable classical approximation. The challenges are that analytical expressions found in general purpose force fields struggle to faithfully represent the intermolecular quantum potential energy surface at close distances and in strong interaction regimes; that the more accurate neural network approximations do not capture crucial physics concepts, e.g., nonadditive inductive contributions and application of electric fields; and that the ultra-accurate narrowly targeted models have difficulty generalizing to the entire chemical space. We therefore designed a hybrid wide-coverage intermolecular interaction model consisting of an analytically polarizable force field combined with a short-range neural network correction for the total intermolecular interaction energy. Here, we describe the methodology and apply the model to accurately determine the properties of water, the free energy of solvation of neutral and charged molecules, and the binding free energy of ligands to proteins. The correction is subtyped for distinct chemical species to match the underlying force field, to segment and reduce the amount of quantum training data, and to increase accuracy and computational speed. For the systems considered, the hybrid ab initio parametrized Hamiltonian reproduces the two-body dimer quantum mechanics (QM) energies to within 0.03 kcal/mol and the nonadditive many-molecule contributions to within 2%. Simulations of molecular systems using this interaction model run at speeds of several nanoseconds per day.
Affiliation(s)
- Alexey Illarionov
- InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Serzhan Sakipov
- InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Leonid Pereyaslavets
- InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Igor V. Kurnikov
- InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Ganesh Kamath
- InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Oleg Butin
- InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Ekaterina Voronina
- InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Lomonosov MSU, Skobeltsyn Institute of Nuclear Physics, Moscow, 119991, Russia
| | - Ilya Ivahnenko
- InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Igor Leontyev
- InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Grzegorz Nawrocki
- InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Mikhail Darkhovskiy
- InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Michael Olevanov
- InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Lomonosov MSU, Dept. of Physics, Moscow, 119991, Russia
| | - Yevhen K. Cherniavskyi
- InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Christopher Lock
- InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
- Department of Neurology and Neurological Sciences, Stanford University School of Medicine, Palo Alto, California 94304, United States
| | - Sean Greenslade
- InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
| | - Subramanian KRS Sankaranarayanan
- Center for Nanoscale Materials, Argonne National Lab, Argonne, Illinois 60439, United States
- Department of Mechanical and Industrial Engineering, University of Illinois, Chicago, Illinois 60607, United States
| | - Maria G. Kurnikova
- Department of Chemistry, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Jeffrey Potoff
- Department of Chemical Engineering and Materials Science, Wayne State University, Detroit, Michigan 48202, United States
| | - Roger D. Kornberg
- Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94304, United States
| | - Michael Levitt
- Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94304, United States
| | - Boris Fain
- InterX Inc. (a Subsidiary of NeoTX Therapeutics Ltd.), 805 Allston Way, Berkeley, California 94710, United States
30
Tokita AM, Behler J. How to train a neural network potential. J Chem Phys 2023; 159:121501. [PMID: 38127396 DOI: 10.1063/5.0160326] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/31/2023] [Accepted: 07/24/2023] [Indexed: 12/23/2023] Open
Abstract
The introduction of modern Machine Learning Potentials (MLPs) has led to a paradigm change in the development of potential energy surfaces for atomistic simulations. By providing efficient access to energies and forces, they allow us to perform large-scale simulations of extended systems, which are not directly accessible by demanding first-principles methods. In these simulations, MLPs can reach the accuracy of electronic structure calculations, provided that they have been properly trained and validated using a suitable set of reference data. Due to their highly flexible functional form, the construction of MLPs has to be done with great care. In this Tutorial, we describe the necessary key steps for training reliable MLPs, from data generation via training to final validation. The procedure, which is illustrated for the example of a high-dimensional neural network potential, is general and applicable to many types of MLPs.
Affiliation(s)
- Alea Miako Tokita
- Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany and Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
| | - Jörg Behler
- Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany and Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
31
Yang J, Cong Y, Li Y, Li H. Machine Learning Approach Based on a Range-Corrected Deep Potential Model for Efficient Vibrational Frequency Computation. J Chem Theory Comput 2023; 19:6366-6374. [PMID: 37652890 DOI: 10.1021/acs.jctc.3c00386] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/02/2023]
Abstract
As an ensemble average result, vibrational spectrum simulation can be time-consuming with high accuracy methods. We present a machine learning approach based on the range-corrected deep potential (DPRc) model to improve the computing efficiency. The DPRc method divides the system into "probe region" and "solvent region"; "solvent-solvent" interactions are not counted in the neural network. We applied the approach to two systems: formic acid C═O stretching and MeCN C≡N stretching vibrational frequency shifts in water. All data sets were prepared using the quantum vibration perturbation approach. Effects of different region divisions, one-body correction, cut range, and training data size were tested. The model with a single-molecule "probe region" showed stable accuracy; it ran roughly 10 times faster than regular deep potential and reduced the training time by a factor of about four. The approach is efficient, easy to apply, and extendable to calculating various spectra.
Affiliation(s)
- Jitai Yang
- Institute of Theoretical Chemistry, College of Chemistry, Jilin University, 2519 Jiefang Road, Changchun 130023, P. R. China
| | - Yang Cong
- Institute of Theoretical Chemistry, College of Chemistry, Jilin University, 2519 Jiefang Road, Changchun 130023, P. R. China
| | - You Li
- Institute of Theoretical Chemistry, College of Chemistry, Jilin University, 2519 Jiefang Road, Changchun 130023, P. R. China
| | - Hui Li
- Institute of Theoretical Chemistry, College of Chemistry, Jilin University, 2519 Jiefang Road, Changchun 130023, P. R. China
32
Galvelis R, Varela-Rial A, Doerr S, Fino R, Eastman P, Markland TE, Chodera JD, De Fabritiis G. NNP/MM: Accelerating Molecular Dynamics Simulations with Machine Learning Potentials and Molecular Mechanics. J Chem Inf Model 2023; 63:5701-5708. [PMID: 37694852 PMCID: PMC10577237 DOI: 10.1021/acs.jcim.3c00773] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 09/12/2023]
Abstract
Machine learning potentials have emerged as a means to enhance the accuracy of biomolecular simulations. However, their application is constrained by the significant computational cost arising from the vast number of parameters compared with traditional molecular mechanics. To tackle this issue, we introduce an optimized implementation of the hybrid method (NNP/MM), which combines a neural network potential (NNP) and molecular mechanics (MM). This approach models a portion of the system, such as a small molecule, using NNP while employing MM for the remaining system to boost efficiency. By conducting molecular dynamics (MD) simulations on various protein-ligand complexes and metadynamics (MTD) simulations on a ligand, we showcase the capabilities of our implementation of NNP/MM. It has enabled us to increase the simulation speed by ∼5 times and achieve a combined sampling of 1 μs for each complex, marking the longest simulations ever reported for this class of simulations.
Affiliation(s)
- Raimondas Galvelis
- Acellera Labs, C/Doctor Trueta 183, Barcelona 08005, Spain
- Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Doctor Aiguader 88, Barcelona 08003, Spain
| | - Alejandro Varela-Rial
- Acellera Ltd, Devonshire House 582 Honeypot Lane, Stanmore Middlesex, HA7 1JS, United Kingdom
| | - Stefan Doerr
- Acellera Ltd, Devonshire House 582 Honeypot Lane, Stanmore Middlesex, HA7 1JS, United Kingdom
| | - Roberto Fino
- Acellera Labs, C/Doctor Trueta 183, Barcelona 08005, Spain
| | - Peter Eastman
- Department of Chemistry, Stanford University, 337 Campus Drive, Stanford, California 94305, United States
| | - Thomas E Markland
- Department of Chemistry, Stanford University, 337 Campus Drive, Stanford, California 94305, United States
| | - John D Chodera
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, New York 10065, United States
| | - Gianni De Fabritiis
- Computational Science Laboratory, Universitat Pompeu Fabra, PRBB, C/Doctor Aiguader 88, Barcelona 08003, Spain
- Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, Barcelona 08010, Spain
- Acellera Ltd, Devonshire House 582 Honeypot Lane, Stanmore Middlesex, HA7 1JS, United Kingdom
33
Fedik N, Nebgen B, Lubbers N, Barros K, Kulichenko M, Li YW, Zubatyuk R, Messerly R, Isayev O, Tretiak S. Synergy of semiempirical models and machine learning in computational chemistry. J Chem Phys 2023; 159:110901. [PMID: 37712780 DOI: 10.1063/5.0151833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Accepted: 07/11/2023] [Indexed: 09/16/2023] Open
Abstract
Catalyzed by enormous success in the industrial sector, many research programs have been exploring data-driven, machine learning approaches. Performance can be poor when the model is extrapolated to new regions of chemical space, e.g., new bonding types, new many-body interactions. Another important limitation is the spatial locality assumption in model architecture, and this limitation cannot be overcome with larger or more diverse datasets. The outlined challenges are primarily associated with the lack of electronic structure information in surrogate models such as interatomic potentials. Given the fast development of machine learning and computational chemistry methods, we expect some limitations of surrogate models to be addressed in the near future; nevertheless, the spatial locality assumption will likely remain a limiting factor for their transferability. Here, we suggest focusing on an equally important effort: the design of physics-informed models that leverage the domain knowledge and employ machine learning only as a corrective tool. In the context of material science, we will focus on semi-empirical quantum mechanics, using machine learning to predict corrections to the reduced-order Hamiltonian model parameters. The resulting models are broadly applicable, retain the speed of semiempirical chemistry, and frequently achieve accuracy on par with much more expensive ab initio calculations. These early results indicate that future work, in which machine learning and quantum chemistry methods are developed jointly, may provide the best of all worlds for chemistry applications that demand both high accuracy and high numerical efficiency.
Affiliation(s)
- Nikita Fedik
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
| | - Benjamin Nebgen
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
| | - Nicholas Lubbers
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
| | - Kipton Barros
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
| | - Maksim Kulichenko
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
| | - Ying Wai Li
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
| | - Roman Zubatyuk
- Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
| | - Richard Messerly
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
| | - Olexandr Isayev
- Department of Chemistry, Mellon College of Science, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
| | - Sergei Tretiak
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
- Center for Integrated Nanotechnologies Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
34
Kývala L, Dellago C. Optimizing the architecture of Behler-Parrinello neural network potentials. J Chem Phys 2023; 159:094105. [PMID: 37655764 DOI: 10.1063/5.0167260] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 08/10/2023] [Indexed: 09/02/2023] Open
Abstract
The architecture of neural network potentials is typically optimized at the beginning of the training process and remains unchanged throughout. Here, we investigate the accuracy of Behler-Parrinello neural network potentials for varying training set sizes. Using the QM9 and 3BPA datasets, we show that adjusting the network architecture according to the training set size improves the accuracy significantly. We demonstrate that both an insufficient and an excessive number of fitting parameters can have a detrimental impact on the accuracy of the neural network potential. Furthermore, we investigate the influences of descriptor complexity, neural network depth, and activation function on the model's performance. We find that for the neural network potentials studied here, two hidden layers yield the best accuracy and that unbounded activation functions outperform bounded ones.
Affiliation(s)
- Lukáš Kývala
- Faculty of Physics, University of Vienna, Kolingasse 14-16, 1090 Vienna, Austria
- Vienna Doctoral School in Physics, University of Vienna, Boltzmanngasse 5, 1090 Vienna, Austria
| | - Christoph Dellago
- Faculty of Physics, University of Vienna, Kolingasse 14-16, 1090 Vienna, Austria
35
Wang T, He X, Li M, Shao B, Liu TY. AIMD-Chig: Exploring the conformational space of a 166-atom protein Chignolin with ab initio molecular dynamics. Sci Data 2023; 10:549. [PMID: 37607915 PMCID: PMC10444755 DOI: 10.1038/s41597-023-02465-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 02/13/2023] [Accepted: 08/11/2023] [Indexed: 08/24/2023] Open
Abstract
Molecular dynamics (MD) simulations have revolutionized the modeling of biomolecular conformations and provided unprecedented insight into molecular interactions. Due to the prohibitive computational overheads of ab initio simulation for large biomolecules, dynamic modeling for proteins is generally constrained on force field with molecular mechanics, which suffers from low accuracy as well as ignores the electronic effects. Here, we report AIMD-Chig, an MD dataset including 2 million conformations of 166-atom protein Chignolin sampled at the density functional theory (DFT) level with 7,763,146 CPU hours. 10,000 conformations were initialized covering the whole conformational space of Chignolin, including folded, unfolded, and metastable states. Ab initio simulations were driven by M06-2X/6-31G* with a Berendsen thermostat at 340 K. We reported coordinates, energies, and forces for each conformation. AIMD-Chig brings the DFT level conformational space exploration from small organic molecules to real-world proteins. It can serve as the benchmark for developing machine learning potentials for proteins and facilitate the exploration of protein dynamics with ab initio accuracy.
Affiliation(s)
- Tong Wang
- Microsoft Research AI4Science, Beijing, China.
| | - Xinheng He
- Microsoft Research AI4Science, Beijing, China
- Work done during an internship at Microsoft Research AI4Science, Beijing, China
- State Key Laboratory of Drug Research and CAS Key Laboratory of Receptor Research and, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Mingyu Li
- Microsoft Research AI4Science, Beijing, China
- Work done during an internship at Microsoft Research AI4Science, Beijing, China
- Department of Pathophysiology, Key Laboratory of Cell Differentiation and Apoptosis of Chinese Ministry of Education, Shanghai Jiao Tong University, School of Medicine, Shanghai, China
| | - Bin Shao
- Microsoft Research AI4Science, Beijing, China.
| | - Tie-Yan Liu
- Microsoft Research AI4Science, Beijing, China
36
Tkachenko NV, Tkachenko AA, Nebgen B, Tretiak S, Boldyrev AI. Neural network atomistic potentials for global energy minima search in carbon clusters. Phys Chem Chem Phys 2023; 25:21173-21182. [PMID: 37490276 DOI: 10.1039/d3cp02317f] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/26/2023]
Abstract
The global energy optimization problem is an acute and important problem in chemistry. It is crucial to know the geometry of the lowest energy isomer (global minimum, GM) of a given compound for the evaluation of its chemical and physical properties. This problem is especially relevant for atomic clusters. Due to the exponential growth of the number of local minima geometries with the increase of the number of atoms in the cluster, it is important to find a computationally efficient and reliable method to navigate the energy landscape and locate a true global minima structure. Newly developed neural network (NN) atomistic potentials offer a numerically efficient and relatively accurate approach for molecular structure optimization. An important question that needs to be answered is "Can NN potentials, trained on a given set, represent the potential energy surface (PES) of a neighboring domain?". In this work, we tested the applicability of ANI-1ccx and ANI-nr NN atomistic potentials for the global minima optimization of carbon clusters Cn (n = 3-10). We showed that with the introduction of the cluster connectivity restriction and consequent DFT or ab initio calculations, ANI-1ccx and ANI-nr can be considered as robust PES pre-samplers that can capture the GM structure even for large clusters such as C20.
Affiliation(s)
- Nikolay V Tkachenko
- Department of Chemistry and Biochemistry, Utah State University, Logan, Utah 84322-0300, USA.
| | | | - Benjamin Nebgen
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
| | - Sergei Tretiak
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA
| | - Alexander I Boldyrev
- Department of Chemistry and Biochemistry, Utah State University, Logan, Utah 84322-0300, USA.
37
Wang Y, Xu C, Li Z, Barati Farimani A. Denoise Pretraining on Nonequilibrium Molecules for Accurate and Transferable Neural Potentials. J Chem Theory Comput 2023; 19:5077-5087. [PMID: 37390120 PMCID: PMC10413865 DOI: 10.1021/acs.jctc.3c00289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2023] [Indexed: 07/02/2023]
Abstract
Recent advances in equivariant graph neural networks (GNNs) have made deep learning amenable to developing fast surrogate models to expensive ab initio quantum mechanics (QM) approaches for molecular potential predictions. However, building accurate and transferable potential models using GNNs remains challenging, as the data are greatly limited by the expensive computational costs and level of theory of QM methods, especially for large and complex molecular systems. In this work, we propose denoise pretraining on nonequilibrium molecular conformations to achieve more accurate and transferable GNN potential predictions. Specifically, atomic coordinates of sampled nonequilibrium conformations are perturbed by random noises, and GNNs are pretrained to denoise the perturbed molecular conformations which recovers the original coordinates. Rigorous experiments on multiple benchmarks reveal that pretraining significantly improves the accuracy of neural potentials. Furthermore, we show that the proposed pretraining approach is model-agnostic, as it improves the performance of different invariant and equivariant GNNs. Notably, our models pretrained on small molecules demonstrate remarkable transferability, improving performance when fine-tuned on diverse molecular systems, including different elements, charged molecules, biomolecules, and larger systems. These results highlight the potential for leveraging denoise pretraining approaches to build more generalizable neural potentials for complex molecular systems.
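The denoising objective described in this abstract can be illustrated with a minimal sketch. It assumes Gaussian noise and a noise-prediction target (equivalent to recovering the original coordinates by subtraction); the function names, the noise scale `sigma`, and the toy three-atom geometry are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def make_denoising_pair(coords, sigma, rng):
    """Perturb one conformation with Gaussian noise (assumed noise model).

    coords: (n_atoms, 3) array of atomic coordinates.
    sigma:  hypothetical noise standard deviation, same units as coords.
    Returns the noisy coordinates and the noise that was added.
    """
    noise = rng.normal(0.0, sigma, size=coords.shape)
    return coords + noise, noise

def denoising_loss(predicted_noise, noise):
    """Mean squared error between the predicted and the true noise."""
    return float(np.mean((predicted_noise - noise) ** 2))

rng = np.random.default_rng(0)
# Toy three-atom geometry (made-up numbers, for illustration only).
coords = np.array([[0.0, 0.0, 0.0], [0.96, 0.0, 0.0], [-0.24, 0.93, 0.0]])
noisy, noise = make_denoising_pair(coords, sigma=0.1, rng=rng)

# A model that predicts the noise exactly recovers the original coordinates.
recovered = noisy - noise
```

In an actual pretraining setup a GNN would take `noisy` as input and be trained to minimize `denoising_loss`; that model is omitted here.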
Affiliation(s)
- Yuyang Wang
- Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Changwen Xu
- Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Zijie Li
- Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
| | - Amir Barati Farimani
- Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Department of Materials Science and Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
- Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, United States
38
Chen X, Xu S, Shabani S, Zhao Y, Fu M, Millis AJ, Fogler MM, Pasupathy AN, Liu M, Basov DN. Machine Learning for Optical Scanning Probe Nanoscopy. Adv Mater 2023; 35:e2109171. [PMID: 36333118 DOI: 10.1002/adma.202109171] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 11/12/2021] [Revised: 07/09/2022] [Indexed: 06/16/2023]
Abstract
The ability to perform nanometer-scale optical imaging and spectroscopy is key to deciphering the low-energy effects in quantum materials, as well as vibrational fingerprints in planetary and extraterrestrial particles, catalytic substances, and aqueous biological samples. These tasks can be accomplished by the scattering-type scanning near-field optical microscopy (s-SNOM) technique that has recently spread to many research fields and enabled notable discoveries. Herein, it is shown that the s-SNOM, together with scanning probe research in general, can benefit in many ways from artificial-intelligence (AI) and machine-learning (ML) algorithms. Augmented with AI- and ML-enhanced data acquisition and analysis, scanning probe optical nanoscopy is poised to become more efficient, accurate, and intelligent.
Affiliation(s)
- Xinzhong Chen
- Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY, 11794, USA
| | - Suheng Xu
- Department of Physics, Columbia University, New York, NY, 10027, USA
| | - Sara Shabani
- Department of Physics, Columbia University, New York, NY, 10027, USA
| | - Yueqi Zhao
- Department of Physics, University of California at San Diego, La Jolla, CA, 92093-0319, USA
| | - Matthew Fu
- Department of Physics, Columbia University, New York, NY, 10027, USA
| | - Andrew J Millis
- Department of Physics, Columbia University, New York, NY, 10027, USA
| | - Michael M Fogler
- Department of Physics, University of California at San Diego, La Jolla, CA, 92093-0319, USA
| | - Abhay N Pasupathy
- Department of Physics, Columbia University, New York, NY, 10027, USA
| | - Mengkun Liu
- Department of Physics and Astronomy, Stony Brook University, Stony Brook, NY, 11794, USA
- National Synchrotron Light Source II, Brookhaven National Laboratory, Upton, NY, 11973, USA
| | - D N Basov
- Department of Physics, Columbia University, New York, NY, 10027, USA
39
Zhang P, Yang W. Toward a general neural network force field for protein simulations: Refining the intramolecular interaction in protein. J Chem Phys 2023; 159:024118. [PMID: 37431910 PMCID: PMC10481389 DOI: 10.1063/5.0142280] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/12/2023] [Accepted: 06/22/2023] [Indexed: 07/12/2023] Open
Abstract
Molecular dynamics (MD) is an extremely powerful, highly effective, and widely used approach to understanding the nature of chemical processes in atomic details for proteins. The accuracy of results from MD simulations is highly dependent on force fields. Currently, molecular mechanical (MM) force fields are mainly utilized in MD simulations because of their low computational cost. Quantum mechanical (QM) calculation has high accuracy, but it is exceedingly time consuming for protein simulations. Machine learning (ML) provides the capability for generating accurate potential at the QM level without increasing much computational effort for specific systems that can be studied at the QM level. However, the construction of general machine learned force fields, needed for broad applications and large and complex systems, is still challenging. Here, general and transferable neural network (NN) force fields based on CHARMM force fields, named CHARMM-NN, are constructed for proteins by training NN models on 27 fragments partitioned from the residue-based systematic molecular fragmentation (rSMF) method. The NN for each fragment is based on atom types and uses new input features that are similar to MM inputs, including bonds, angles, dihedrals, and non-bonded terms, which enhance the compatibility of CHARMM-NN to MM MD and enable the implementation of CHARMM-NN force fields in different MD programs. While the main part of the energy of the protein is based on rSMF and NN, the nonbonded interactions between the fragments and with water are taken from the CHARMM force field through mechanical embedding. The validations of the method for dipeptides on geometric data, relative potential energies, and structural reorganization energies demonstrate that the CHARMM-NN local minima on the potential energy surface are very accurate approximations to QM, showing the success of CHARMM-NN for bonded interactions. 
However, the MD simulations on peptides and proteins indicate that more accurate methods to represent protein-water interactions in fragments and non-bonded interactions between fragments should be considered in the future improvement of CHARMM-NN, which can increase the accuracy of approximation beyond the current mechanical embedding QM/MM level.
Affiliation(s)
- Pan Zhang
- Department of Chemistry, Duke University, Durham, North Carolina 27708, USA
| | - Weitao Yang
- Department of Chemistry, Duke University, Durham, North Carolina 27708, USA
40
Kabylda A, Vassilev-Galindo V, Chmiela S, Poltavsky I, Tkatchenko A. Efficient interatomic descriptors for accurate machine learning force fields of extended molecules. Nat Commun 2023; 14:3562. [PMID: 37322039 DOI: 10.1038/s41467-023-39214-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Accepted: 05/17/2023] [Indexed: 06/17/2023] Open
Abstract
Machine learning force fields (MLFFs) are gradually evolving towards enabling molecular dynamics simulations of molecules and materials with ab initio accuracy but at a small fraction of the computational cost. However, several challenges remain to be addressed to enable predictive MLFF simulations of realistic molecules, including: (1) developing efficient descriptors for non-local interatomic interactions, which are essential to capture long-range molecular fluctuations, and (2) reducing the dimensionality of the descriptors to enhance the applicability and interpretability of MLFFs. Here we propose an automatized approach to substantially reduce the number of interatomic descriptor features while preserving the accuracy and increasing the efficiency of MLFFs. To simultaneously address the two stated challenges, we illustrate our approach on the example of the global GDML MLFF. We found that non-local features (atoms separated by as far as 15 Å in studied systems) are crucial to retain the overall accuracy of the MLFF for peptides, DNA base pairs, fatty acids, and supramolecular complexes. Interestingly, the number of required non-local features in the reduced descriptors becomes comparable to the number of local interatomic features (those below 5 Å). These results pave the way to constructing global molecular MLFFs whose cost increases linearly, instead of quadratically, with system size.
Affiliation(s)
- Adil Kabylda
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
| | - Valentin Vassilev-Galindo
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
| | - Stefan Chmiela
- Machine Learning Group, Technische Universität Berlin, 10587, Berlin, Germany
- BIFOLD - Berlin Institute for the Foundations of Learning and Data, 10587, Berlin, Germany
| | - Igor Poltavsky
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511, Luxembourg City, Luxembourg.
41
Yan X, Yue T, Winkler DA, Yin Y, Zhu H, Jiang G, Yan B. Converting Nanotoxicity Data to Information Using Artificial Intelligence and Simulation. Chem Rev 2023. [PMID: 37262026 DOI: 10.1021/acs.chemrev.3c00070] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 06/03/2023]
Abstract
Decades of nanotoxicology research have generated extensive and diverse data sets. However, data is not equal to information. The question is how to extract critical information buried in vast data streams. Here we show that artificial intelligence (AI) and molecular simulation play key roles in transforming nanotoxicity data into critical information, i.e., constructing the quantitative nanostructure (physicochemical properties)-toxicity relationships, and elucidating the toxicity-related molecular mechanisms. For AI and molecular simulation to realize their full impacts in this mission, several obstacles must be overcome. These include the paucity of high-quality nanomaterials (NMs) and standardized nanotoxicity data, the lack of model-friendly databases, the scarcity of specific and universal nanodescriptors, and the inability to simulate NMs at realistic spatial and temporal scales. This review provides a comprehensive and representative, but not exhaustive, summary of the current capability gaps and tools required to fill these formidable gaps. Specifically, we discuss the applications of AI and molecular simulation, which can address the large-scale data challenge for nanotoxicology research. The need for model-friendly nanotoxicity databases, powerful nanodescriptors, new modeling approaches, molecular mechanism analysis, and design of the next-generation NMs are also critically discussed. Finally, we provide a perspective on future trends and challenges.
Affiliation(s)
- Xiliang Yan
- Institute of Environmental Research at the Greater Bay Area, Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Guangzhou University, Guangzhou 510006, China
| | - Tongtao Yue
- Key Laboratory of Marine Environment and Ecology, Ministry of Education, Institute of Coastal Environmental Pollution Control, Ocean University of China, Qingdao 266100, China
| | - David A Winkler
- Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, Victoria 3052, Australia
- School of Pharmacy, University of Nottingham, Nottingham NG7 2QL, U.K
- Department of Biochemistry and Chemistry, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Victoria 3086, Australia
| | - Yongguang Yin
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
| | - Hao Zhu
- Department of Chemistry and Biochemistry, Rowan University, Glassboro, New Jersey 08028, United States
| | - Guibin Jiang
- State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
| | - Bing Yan
- Institute of Environmental Research at the Greater Bay Area, Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Guangzhou University, Guangzhou 510006, China
42
Shepherd S, Tribello GA, Wilkins DM. A fully quantum-mechanical treatment for kaolinite. J Chem Phys 2023; 158:2892274. [PMID: 37220200 DOI: 10.1063/5.0152361] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 05/03/2023] [Indexed: 05/25/2023] Open
Abstract
Neural network potentials for kaolinite minerals have been fitted to data extracted from density functional theory calculations that were performed using the revPBE + D3 and revPBE + vdW functionals. These potentials have then been used to calculate the static and dynamic properties of the mineral. We show that revPBE + vdW is better at reproducing the static properties. However, revPBE + D3 does a better job of reproducing the experimental IR spectrum. We also consider what happens to these properties when a fully quantum treatment of the nuclei is employed. We find that nuclear quantum effects (NQEs) do not make a substantial difference to the static properties. However, when NQEs are included, the dynamic properties of the material change substantially.
Affiliation(s)
- Sam Shepherd
- Centre for Quantum Materials and Technologies, School of Mathematics and Physics, Queen's University Belfast, Belfast BT7 1NN, Northern Ireland, United Kingdom
| | - Gareth A Tribello
- Centre for Quantum Materials and Technologies, School of Mathematics and Physics, Queen's University Belfast, Belfast BT7 1NN, Northern Ireland, United Kingdom
| | - David M Wilkins
- Centre for Quantum Materials and Technologies, School of Mathematics and Physics, Queen's University Belfast, Belfast BT7 1NN, Northern Ireland, United Kingdom
43
Chigaev M, Smith JS, Anaya S, Nebgen B, Bettencourt M, Barros K, Lubbers N. Lightweight and effective tensor sensitivity for atomistic neural networks. J Chem Phys 2023; 158:2889493. [PMID: 37158328 DOI: 10.1063/5.0142127] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/11/2023] [Accepted: 04/20/2023] [Indexed: 05/10/2023] Open
Abstract
Atomistic machine learning focuses on the creation of models that obey fundamental symmetries of atomistic configurations, such as permutation, translation, and rotation invariances. In many of these schemes, translation and rotation invariance are achieved by building on scalar invariants, e.g., distances between atom pairs. There is growing interest in molecular representations that work internally with higher rank rotational tensors, e.g., vector displacements between atoms, and tensor products thereof. Here, we present a framework for extending the Hierarchically Interacting Particle Neural Network (HIP-NN) with Tensor Sensitivity information (HIP-NN-TS) from each local atomic environment. Crucially, the method employs a weight tying strategy that allows direct incorporation of many-body information while adding very few model parameters. We show that HIP-NN-TS is more accurate than HIP-NN, with negligible increase in parameter count, for several datasets and network sizes. As the dataset becomes more complex, tensor sensitivities provide greater improvements to model accuracy. In particular, HIP-NN-TS achieves a record mean absolute error of 0.927 kcal/mol for conformational energy variation on the challenging COMP6 benchmark, which includes a broad set of organic molecules. We also compare the computational performance of HIP-NN-TS to HIP-NN and other models in the literature.
Affiliation(s)
- Michael Chigaev
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
| | - Justin S Smith
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
- NVIDIA, 2788 San Tomas Expy, Santa Clara, California 95051, USA
| | - Steven Anaya
- High Performance Computing Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
| | - Benjamin Nebgen
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
| | | | - Kipton Barros
- Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
- Center for Nonlinear Studies, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
| | - Nicholas Lubbers
- Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
44
Feng Y, Wang C. Surface Confinement of Finite-Size Water Droplets for SO3 Hydrolysis Reaction Revealed by Molecular Dynamics Simulations Based on a Machine Learning Force Field. J Am Chem Soc 2023; 145:10631-10640. [PMID: 37130210 DOI: 10.1021/jacs.3c00698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]
Abstract
As an important source for sulfuric acid in the atmosphere, hydrolysis of sulfur trioxide (SO3) takes place with water clusters of sizes from several molecules to several nanometers, resulting in various final products, including neutral (H2SO4)-(H2O) clusters and ionic (HSO4)⁻-(H3O)⁺ clusters. The diverse products may be due to the ability of proton transfer and the formation of hydrated ions for water cluster of finite sizes, especially the sub-micrometer ones. However, the detailed molecular-level mechanism is still unclear due to the lack of available characterization and simulations tools. Here, we developed a quantum chemistry-level machine learning (ML) model to simulate the hydrolysis of SO3 with water clusters of sizes up to nanometers. The simulation results demonstrate diverse reaction paths taking place between SO3 and water clusters of different sizes. Generally, neutral (H2SO4)-(H2O) clusters are preferred by water clusters of ultra-small size, and a loop structure-mediated mechanism with SO3(H2O)n≤4 structures and a non-loop structure-mediated mechanism with structure relaxation are observed. As the water cluster size increases to (H2O)8, a (HSO4)⁻-(H3O)⁺ ion-pair product emerges; and the Eigen-Zundel ion conversion-like proton transfer mechanism takes place and stabilizes the ion pairs. As the water cluster sizes further increase beyond several nanometers ((H2O)n≥32), the (SO4)²⁻[(H3O)⁺]2 ion-pair product appears. The reason could be that the surface of these water clusters is large enough to screen Coulomb repulsion between two tri-coordinated ion-pair complexes. These findings would provide new perspectives for understanding SO3 hydrolysis in the real atmosphere and sulfuric acid chemistry in atmospheric aerosols.
Affiliation(s)
- Yajuan Feng
- School of Information Science and Technology, University of Science and Technology of China, Hefei, Anhui 230026, China
| | - Chao Wang
- National Synchrotron Radiation Laboratory, University of Science and Technology of China, Hefei, Anhui 230026, China
45
Han B, Isborn CM, Shi L. Incorporating Polarization and Charge Transfer into a Point-Charge Model for Water Using Machine Learning. J Phys Chem Lett 2023; 14:3869-3877. [PMID: 37067482 DOI: 10.1021/acs.jpclett.3c00036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Rigid nonpolarizable water models with fixed point charges have been widely employed in molecular dynamics simulations due to their efficiency and reasonable accuracy for the potential energy surface. However, the dipole moment surface of water is not necessarily well-described by the same fixed charges, leading to failure in reproducing dipole-related properties. Here, we developed a machine-learning model trained against electronic structure data to assign point charges for water, and the resulting dipole moment surface significantly improved the predictions of the dielectric constant and the low-frequency IR spectrum of liquid water. Our analysis reveals that within our atom-centered point-charge description of the dipole moment surface, the intermolecular charge transfer is the major source of the peak intensity at 200 cm⁻¹, whereas the intramolecular polarization controls the enhancement of the dielectric constant. The effects of exact Hartree-Fock exchange in the hybrid density functional on these properties are also discussed.
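For orientation, the dipole moment in an atom-centered point-charge description is the charge-weighted sum of atomic positions, μ = Σᵢ qᵢ rᵢ. The sketch below evaluates it for a single water-like molecule; the charges and geometry are hypothetical placeholder values, not the machine-learned charges from the paper.

```python
import numpy as np

def point_charge_dipole(charges, positions):
    """Dipole of atom-centered point charges: mu = sum_i q_i * r_i.

    charges:   (n_atoms,) partial charges in units of e.
    positions: (n_atoms, 3) coordinates in angstrom.
    Returns a length-3 vector in e*angstrom.
    """
    charges = np.asarray(charges, dtype=float)
    positions = np.asarray(positions, dtype=float)
    return charges @ positions

# Hypothetical water-like molecule: O at the origin, two H atoms.
q = np.array([-0.8, 0.4, 0.4])            # placeholder charges (sum to zero)
r = np.array([[0.0, 0.0, 0.0],
              [0.757, 0.586, 0.0],
              [-0.757, 0.586, 0.0]])
mu = point_charge_dipole(q, r)

# For a neutral molecule the dipole does not depend on the choice of origin.
mu_shifted = point_charge_dipole(q, r + np.array([1.0, 2.0, 3.0]))
```

If the charges are allowed to vary with geometry and environment, as in the model described above, `q` would be recomputed for every configuration before this sum is taken.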
Affiliation(s)
- Bowen Han
- Chemistry and Biochemistry, University of California, Merced, California 95343, United States
| | - Christine M Isborn
- Chemistry and Biochemistry, University of California, Merced, California 95343, United States
| | - Liang Shi
- Chemistry and Biochemistry, University of California, Merced, California 95343, United States
46
Sun F, Kadupitiya J, Jadhao V. Probing Accuracy-Speedup Tradeoff in Machine Learning Surrogates for Molecular Dynamics Simulations. J Chem Theory Comput 2023. [PMID: 37094180 DOI: 10.1021/acs.jctc.2c01282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2023]
Abstract
The performance promise of machine learning surrogates of molecular dynamics simulations of soft materials is significant but generally comes at the cost of acquiring large training datasets to learn the complex relationships between input soft material attributes and output properties. Under the constraint of limited high-performance computing resources, optimizing the size of the training datasets becomes paramount. Using an artificial neural network based surrogate for molecular dynamics simulations of confined electrolytes, we explore the tradeoff between surrogate accuracy and computational gains. Accuracy is assessed by computing the root-mean-square errors between the surrogate predictions and the ground truth results obtained via molecular dynamics simulations. The computational performance is judged by evaluating the speedup which incorporates the training dataset creation time. Improvement in accuracy occurs with a loss of speedup, which scales as the inverse of the training dataset size. The link between surrogate generalizability and the accuracy-speedup tradeoff is assessed by examining the errors incurred in surrogate predictions on unseen, interpolated input variables and developing a net speedup metric to capture the associated gains.
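The two quantities named in this abstract, the root-mean-square error and a speedup that charges the surrogate for creating its training data, can be sketched as follows. The cost model is an assumption made for illustration (every training sample costs one MD run, with optional fitting and inference costs); it is not the authors' exact definition of their speedup or net speedup metrics, and all parameter names are hypothetical.

```python
import math

def rmse(predictions, ground_truth):
    """Root-mean-square error between surrogate outputs and MD results."""
    squared = [(p - t) ** 2 for p, t in zip(predictions, ground_truth)]
    return math.sqrt(sum(squared) / len(squared))

def speedup_with_training_cost(n_queries, t_md, n_train, t_infer=0.0, t_fit=0.0):
    """Speedup of a surrogate over direct MD, including dataset creation time.

    n_queries: number of predictions requested from the surrogate.
    t_md:      wall time of one MD simulation (assumed constant).
    n_train:   number of MD simulations run to build the training set.
    t_infer:   surrogate inference time per query (assumed small).
    t_fit:     one-off time spent fitting the surrogate.
    """
    md_cost = n_queries * t_md
    surrogate_cost = n_train * t_md + t_fit + n_queries * t_infer
    return md_cost / surrogate_cost

# When dataset creation dominates, doubling n_train halves the speedup.
s_small = speedup_with_training_cost(n_queries=1000, t_md=1.0, n_train=100)
s_large = speedup_with_training_cost(n_queries=1000, t_md=1.0, n_train=200)
```

Under this assumed cost model the speedup reduces to roughly `n_queries / n_train` when fitting and inference are cheap, which is consistent with the inverse scaling in training dataset size stated in the abstract.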
Affiliation(s)
- Fanbo Sun
- Intelligent Systems Engineering, Indiana University, 700 N. Woodlawn Avenue, Bloomington, Indiana 47408, United States
| | - Jcs Kadupitiya
- Intelligent Systems Engineering, Indiana University, 700 N. Woodlawn Avenue, Bloomington, Indiana 47408, United States
| | - Vikram Jadhao
- Intelligent Systems Engineering, Indiana University, 700 N. Woodlawn Avenue, Bloomington, Indiana 47408, United States
47
Merritt ICD, Jacquemin D, Vacher M. Nonadiabatic Coupling in Trajectory Surface Hopping: How Approximations Impact Excited-State Reaction Dynamics. J Chem Theory Comput 2023; 19:1827-1842. [PMID: 36897995 DOI: 10.1021/acs.jctc.2c00968] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 03/12/2023]
Abstract
Photochemical reactions are widely modeled using the popular trajectory surface hopping (TSH) method, an affordable mixed quantum-classical approximation to the full quantum dynamics of the system. TSH is able to account for nonadiabatic effects using an ensemble of trajectories, which are propagated on a single potential energy surface at a time and which can hop from one electronic state to another. The occurrences and locations of these hops are typically determined using the nonadiabatic coupling between electronic states, which can be assessed in a number of ways. In this work, we benchmark the impact of some approximations to the coupling term on the TSH dynamics for several typical isomerization and ring-opening reactions. We have identified that two of the schemes tested, the popular local diabatization scheme and a scheme based on biorthonormal wave function overlap implemented in the OpenMOLCAS code as part of this work, reproduce at a much reduced cost the dynamics obtained using the explicitly calculated nonadiabatic coupling vectors. The other two schemes tested can give different results, and in some cases, even entirely incorrect dynamics. Of these two, the scheme based on configuration interaction vectors gives unpredictable failures, while the other scheme based on the Baeck-An approximation systematically overestimates hopping to the ground state as compared to the reference approaches.
Affiliation(s)
- Denis Jacquemin
- Nantes Université, CNRS, CEISAM UMR 6230, F-44000 Nantes, France
- Morgane Vacher
- Nantes Université, CNRS, CEISAM UMR 6230, F-44000 Nantes, France
48
Olatomiwa A, Adam T, Edet C, Adewale A, Chik A, Mohammed M, Gopinath SC, Hashim U. Recent advances in density functional theory approach for optoelectronics properties of graphene. Heliyon 2023; 9:e14279. [PMID: 36950613 PMCID: PMC10025043 DOI: 10.1016/j.heliyon.2023.e14279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2022] [Revised: 02/28/2023] [Accepted: 03/01/2023] [Indexed: 03/09/2023] Open
Abstract
Graphene has received tremendous attention among diverse 2D materials because of its remarkable properties. Its emergence over the last two decades gave a new and distinct dynamic to the study of materials, with several research projects focusing on exploiting its intrinsic properties for optoelectronic devices. This review provides a comprehensive overview of several published articles based on density functional theory and recently introduced machine learning approaches applied to study the electronic and optical properties of graphene. A comprehensive catalogue of the bond lengths, band gaps, and formation energies of various doped graphene systems that determine thermodynamic stability was reported in the literature. In these studies, the peculiarity of the obtained results reported is consequent on the nature and type of the dopants, the choice of the XC functionals, the basis set, and the wrong input parameters. The different density functional theory models, as well as the strengths and uncertainties of the ML potentials employed in the machine learning approach to enhance the prediction models for graphene, were elucidated. Lastly, the thermal properties, modelling of graphene heterostructures, the superconducting behaviour of graphene, and optimization of the DFT models are grey areas that future studies should explore in enhancing its unique potential. Therefore, the identified future trends and knowledge gaps have a prospect in both academia and industry to design future and reliable optoelectronic devices.
Affiliation(s)
- A.L. Olatomiwa
- Institute of Nano Electronic Engineering, Universiti Malaysia Perlis, 01000, Kangar, Perlis, Malaysia
- Faculty of Electronic Engineering and Technology, Universiti Malaysia Perlis, 02600, Arau, Perlis, Malaysia
- Tijjani Adam
- Institute of Nano Electronic Engineering, Universiti Malaysia Perlis, 01000, Kangar, Perlis, Malaysia
- Faculty of Electronic Engineering and Technology, Universiti Malaysia Perlis, 02600, Arau, Perlis, Malaysia
- Micro System Technology, Centre of Excellence (CoE), Universiti Malaysia Perlis (UniMAP), Pauh Campus, 02600, Arau, Perlis, Malaysia
- C.O. Edet
- Faculty of Electronic Engineering and Technology, Universiti Malaysia Perlis, 02600, Arau, Perlis, Malaysia
- Institute of Engineering Mathematics, Universiti Malaysia Perlis, 02600, Arau, Perlis, Malaysia
- Department of Physics, Cross River University of Technology, Calabar, Nigeria
- A.A. Adewale
- Department of Pure and Applied Physics, Ladoke Akintola University of Technology, Ogbomoso, Nigeria
- Abdullah Chik
- Centre for Frontier Materials Research, Universiti Malaysia Perlis, 01000, Kangar, Perlis, Malaysia
- Faculty of Chemical Engineering and Technology, Universiti Malaysia Perlis (UniMAP), Taman Muhibbah, Jejawi, 02600, Arau, Perlis, Malaysia
- Mohammed Mohammed
- Faculty of Chemical Engineering and Technology, Universiti Malaysia Perlis (UniMAP), Taman Muhibbah, Jejawi, 02600, Arau, Perlis, Malaysia
- Center of Excellence Geopolymer & Green Technology (CEGeoGTech), Universiti Malaysia Perlis, 02600, Arau, Perlis, Malaysia
- Subash C.B. Gopinath
- Institute of Nano Electronic Engineering, Universiti Malaysia Perlis, 01000, Kangar, Perlis, Malaysia
- Micro System Technology, Centre of Excellence (CoE), Universiti Malaysia Perlis (UniMAP), Pauh Campus, 02600, Arau, Perlis, Malaysia
- Faculty of Chemical Engineering and Technology, Universiti Malaysia Perlis (UniMAP), Taman Muhibbah, Jejawi, 02600, Arau, Perlis, Malaysia
- U. Hashim
- Institute of Nano Electronic Engineering, Universiti Malaysia Perlis, 01000, Kangar, Perlis, Malaysia
49
Tessmer MH, Stoll S. chiLife: An open-source Python package for in silico spin labeling and integrative protein modeling. PLoS Comput Biol 2023; 19:e1010834. [PMID: 37000838 PMCID: PMC10096462 DOI: 10.1371/journal.pcbi.1010834] [Citation(s) in RCA: 14] [Impact Index Per Article: 14.0] [Received: 12/22/2022] [Revised: 04/12/2023] [Accepted: 03/16/2023] [Indexed: 04/03/2023] Open
Abstract
Here we introduce chiLife, a Python package for site-directed spin label (SDSL) modeling for electron paramagnetic resonance (EPR) spectroscopy, in particular double electron-electron resonance (DEER). It is based on in silico attachment of rotamer ensemble representations of spin labels to protein structures. chiLife enables the development of custom protein analysis and modeling pipelines using SDSL EPR experimental data. It allows the user to add custom spin labels, scoring functions and spin label modeling methods. chiLife is designed with integration into third-party software in mind, to take advantage of the diverse and rapidly expanding set of molecular modeling tools available with a Python interface. This article describes the main design principles of chiLife and presents a series of examples.
Affiliation(s)
- Maxx H. Tessmer
- Department of Chemistry, University of Washington, Seattle, Washington, United States of America
- Stefan Stoll
- Department of Chemistry, University of Washington, Seattle, Washington, United States of America
50
Käser S, Vazquez-Salazar LI, Meuwly M, Töpfer K. Neural network potentials for chemistry: concepts, applications and prospects. Digital Discovery 2023; 2:28-58. [PMID: 36798879 PMCID: PMC9923808 DOI: 10.1039/d2dd00102k] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 09/23/2022] [Accepted: 12/20/2022] [Indexed: 12/24/2022]
Abstract
Artificial Neural Networks (NN) are already heavily involved in methods and applications for frequent tasks in the field of computational chemistry such as representation of potential energy surfaces (PES) and spectroscopic predictions. This perspective provides an overview of the foundations of neural network-based full-dimensional potential energy surfaces, their architectures, underlying concepts, their representation and applications to chemical systems. Methods for data generation and training procedures for PES construction are discussed and means for error assessment and refinement through transfer learning are presented. A selection of recent results illustrates the latest improvements regarding accuracy of PES representations and system size limitations in dynamics simulations, but also NN application enabling direct prediction of physical results without dynamics simulations. The aim is to provide an overview for the current state-of-the-art NN approaches in computational chemistry and also to point out the current challenges in enhancing reliability and applicability of NN methods on a larger scale.
Affiliation(s)
- Silvan Käser
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Markus Meuwly
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland
- Kai Töpfer
- Department of Chemistry, University of Basel, Klingelbergstrasse 80, CH-4056 Basel, Switzerland