1
He S, Segura Abarrategi J, Bediaga H, Arrasate S, González-Díaz H. On the additive artificial intelligence-based discovery of nanoparticle neurodegenerative disease drug delivery systems. Beilstein Journal of Nanotechnology 2024; 15:535-555. [PMID: 38774585 PMCID: PMC11106676 DOI: 10.3762/bjnano.15.47] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Accepted: 04/23/2024] [Indexed: 05/24/2024]
Abstract
Neurodegenerative diseases are characterized by slowly progressing neuronal cell death. Conventional drug treatment strategies often fail because of poor solubility, low bioavailability, and the inability of the drugs to effectively cross the blood-brain barrier. Therefore, the development of new neurodegenerative disease drugs (NDDs) requires immediate attention. Nanoparticle (NP) systems are of increasing interest for transporting NDDs to the central nervous system. However, discovering effective nanoparticle neuronal disease drug delivery systems (N2D3Ss) is challenging because of the vast number of combinations of NP and NDD compounds, as well as the various assays involved. Artificial intelligence/machine learning (AI/ML) algorithms have the potential to accelerate this process by predicting the most promising NDD and NP candidates for assaying. Nevertheless, the relatively limited amount of reported data on N2D3S activity compared to assayed NDDs makes AI/ML analysis challenging. In this work, the IFPTML technique, which combines information fusion (IF), perturbation theory (PT), and machine learning (ML), was employed to address this challenge. Initially, we conducted the fusion into a unified dataset comprising 4403 NDD assays from ChEMBL and 260 NP cytotoxicity assays from journal articles. Through a resampling process, three new working datasets were generated, each containing 500,000 cases. We utilized linear discriminant analysis (LDA) along with artificial neural network (ANN) algorithms, such as multilayer perceptron (MLP) and deep learning networks (DLN), to construct linear and non-linear IFPTML models. The IFPTML-LDA models exhibited sensitivity (Sn) and specificity (Sp) values in the range of 70% to 73% (>375,000 training cases) and 70% to 80% (>125,000 validation cases), respectively. In contrast, the IFPTML-MLP and IFPTML-DLN achieved Sn and Sp values in the range of 85% to 86% for both training and validation series. 
Additionally, IFPTML-ANN models showed an area under the receiver operating curve (AUROC) of approximately 0.93 to 0.95. These results indicate that the IFPTML models could serve as valuable tools in the design of drug delivery systems for neurosciences.
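The sensitivity (Sn) and specificity (Sp) values quoted in this abstract are the standard confusion-matrix rates. The following is a minimal sketch of how they are computed for a binary classifier; the function name and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (Sn, Sp) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    sn = tp / (tp + fn)  # fraction of actual positives that were recovered
    sp = tn / (tn + fp)  # fraction of actual negatives that were recovered
    return sn, sp


# Toy example (made-up labels): 4 positive cases and 4 negative cases.
sn, sp = sensitivity_specificity([1, 1, 1, 1, 0, 0, 0, 0],
                                 [1, 1, 1, 0, 0, 0, 1, 1])
print(sn, sp)  # 0.75 0.5
```

Reporting both rates separately, as the abstract does, shows whether a model favors one class over the other, which a single accuracy figure would hide.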
Affiliation(s)
- Shan He
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - IKERDATA S.L., ZITEK, UPV/EHU, Rectorate Building, nº6, 48940 Leioa, Greater Bilbao, Basque Country, Spain
- Julen Segura Abarrategi
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Harbil Bediaga
  - IKERDATA S.L., ZITEK, UPV/EHU, Rectorate Building, nº6, 48940 Leioa, Greater Bilbao, Basque Country, Spain
  - Painting Department, Fine Arts Faculty, University of the Basque Country UPV/EHU, 48940 Leioa, Biscay, Basque Country, Spain
- Sonia Arrasate
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - Instituto Biofisika (UPV/EHU-CSIC), 48940 Leioa, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
2
Fan ZX, Chao SD. A Machine Learning Force Field for Bio-Macromolecular Modeling Based on Quantum Chemistry-Calculated Interaction Energy Datasets. Bioengineering (Basel) 2024; 11:51. [PMID: 38247928 PMCID: PMC11154266 DOI: 10.3390/bioengineering11010051] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2023] [Revised: 12/23/2023] [Accepted: 12/25/2023] [Indexed: 01/23/2024]
Abstract
Accurate energy data from noncovalent interactions are essential for constructing force fields for molecular dynamics simulations of bio-macromolecular systems. There are two important practical issues in the construction of a reliable force field with the hope of balancing the desired chemical accuracy and working efficiency. One is to determine a suitable quantum chemistry level of theory for calculating interaction energies. The other is to use a suitable continuous energy function to model the quantum chemical energy data. For the first issue, we have recently calculated the intermolecular interaction energies using the SAPT0 level of theory, and we have systematically organized these energies into the ab initio SOFG-31 (homodimer) and SOFG-31-heterodimer datasets. In this work, we re-calculate these interaction energies by using the more advanced SAPT2 level of theory with a wider series of basis sets. Our purpose is to determine the SAPT level of theory proper for interaction energies with respect to the CCSD(T)/CBS benchmark chemical accuracy. Next, to utilize these energy datasets, we employ one of the well-developed machine learning techniques, called the CLIFF scheme, to construct a general-purpose force field for biomolecular dynamics simulations. Here we use the SOFG-31 dataset and the SOFG-31-heterodimer dataset as the training and test sets, respectively. Our results demonstrate that using the CLIFF scheme can reproduce a diverse range of dimeric interaction energy patterns with only a small training set. The overall errors for each SAPT energy component, as well as the SAPT total energy, are all well below the desired chemical accuracy of ~1 kcal/mol.
Affiliation(s)
- Zhen-Xuan Fan
  - Institute of Applied Mechanics, National Taiwan University, Taipei 106, Taiwan
- Sheng D. Chao
  - Institute of Applied Mechanics, National Taiwan University, Taipei 106, Taiwan
  - Center for Quantum Science and Engineering, National Taiwan University, Taipei 106, Taiwan
3
Demir Gİ, Tekin A. NICE-FF: A non-empirical, intermolecular, consistent, and extensible force field for nucleic acids and beyond. J Chem Phys 2023; 159:244117. [PMID: 38153156 DOI: 10.1063/5.0176641] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Accepted: 12/04/2023] [Indexed: 12/29/2023]
Abstract
A new non-empirical ab initio intermolecular force field (NICE-FF in buffered 14-7 potential form) has been developed for nucleic acids and beyond based on the dimer interaction energies (IEs) calculated at the spin component scaled-MI-second order Møller-Plesset perturbation theory. A fully automatic framework has been implemented for this purpose, capable of generating well-polished computational grids, performing the necessary ab initio calculations, conducting machine learning (ML) assisted force field (FF) parametrization, and extending existing FF parameters by incorporating new atom types. For the ML-assisted parametrization of NICE-FF, interaction energies of ∼18 000 dimer geometries (with IE < 0) were used, and the best fit gave a mean square deviation of about 0.46 kcal/mol. During this parametrization, atom types apparent in four deoxyribonucleic acid (DNA) bases have been first trained using the generated DNA base datasets. Both uracil and hypoxanthine, which contain the same atom types found in DNA bases, have been considered as test molecules. Three new atom types have been added to the DNA atom types by using IE datasets of both pyrazinamide and 9-methylhypoxanthine. Finally, the last test molecule, theophylline, has been selected, which contains already-fitted atom-type parameters. The performance of NICE-FF has been investigated on the S22 dataset, and it has been found that NICE-FF outperforms the well-known FFs by generating the most consistent IEs with the high-level ab initio ones. Moreover, NICE-FF has been integrated into our in-house developed crystal structure prediction (CSP) tool [called FFCASP (Fast and Flexible CrystAl Structure Predictor)], aiming to find the experimental crystal structures of all considered molecules. 
CSPs, which were performed up to 4 formula units (Z), resulted in NICE-FF being able to locate almost all the known experimental crystal structures with sufficiently low RMSD20 values to provide good starting points for density functional theory optimizations.
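The abstract states that NICE-FF uses the buffered 14-7 potential form. As a hedged illustration, the sketch below evaluates that functional form for a single atom pair using the buffering constants from Halgren's original formulation (0.07 and 0.12). Whether NICE-FF keeps these constants, and what its fitted well depths and minimum-energy distances are, is not stated in the abstract, so every numerical value here is an assumption.

```python
def buffered_14_7(r, epsilon, r_star, delta=0.07, gamma=0.12):
    """Buffered 14-7 pair energy.

    r       : interatomic distance
    epsilon : well depth (same energy unit as the result)
    r_star  : distance of the energy minimum
    delta, gamma : buffering constants (Halgren's defaults, assumed here)
    """
    rho = r / r_star
    # Buffered repulsive prefactor; stays finite as r approaches 0.
    prefactor = ((1.0 + delta) / (rho + delta)) ** 7
    # Buffered attractive/repulsive balance; equals -1 at rho = 1.
    bracket = (1.0 + gamma) / (rho ** 7 + gamma) - 2.0
    return epsilon * prefactor * bracket


# Illustrative parameters only: epsilon = 0.1 kcal/mol, r_star = 3.5 Angstrom.
e_min = buffered_14_7(3.5, epsilon=0.1, r_star=3.5)
print(round(e_min, 6))  # -0.1, the well depth at the minimum-energy distance
```

The buffering terms keep the energy finite at very short distances, unlike a plain 12-6 Lennard-Jones term, which helps when fitting to ab initio dimer scans that include compressed geometries.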
Affiliation(s)
- Gözde İniş Demir
  - Informatics Institute, Istanbul Technical University, 34469 Maslak, Istanbul, Türkiye
- Adem Tekin
  - Informatics Institute, Istanbul Technical University, 34469 Maslak, Istanbul, Türkiye
  - Research Institute for Fundamental Sciences (TÜBİTAK-TBAE), Kocaeli, Türkiye
4
Chen JA, Chao SD. Intermolecular Non-Bonded Interactions from Machine Learning Datasets. Molecules 2023; 28:7900. [PMID: 38067629 PMCID: PMC10707888 DOI: 10.3390/molecules28237900] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/31/2023] [Revised: 11/22/2023] [Accepted: 11/29/2023] [Indexed: 04/04/2024]
Abstract
Accurate determination of intermolecular non-covalent-bonded or non-bonded interactions is the key to potentially useful molecular dynamics simulations of polymer systems. However, it is challenging to balance both the accuracy and computational cost in force field modelling. One of the main difficulties is properly representing the calculated energy data as a continuous force function. In this paper, we employ well-developed machine learning techniques to construct a general purpose intermolecular non-bonded interaction force field for organic polymers. The original ab initio dataset SOFG-31 was calculated by us and has been well documented, and here we use it as our training set. The CLIFF kernel type machine learning scheme is used for predicting the interaction energies of heterodimers selected from the SOFG-31 dataset. Our test results show that the overall errors are well below the chemical accuracy of about 1 kcal/mol, thus demonstrating the promising feasibility of machine learning techniques in force field modelling.
Affiliation(s)
- Jia-An Chen
  - Institute of Applied Mechanics, National Taiwan University, Taipei 106, Taiwan
- Sheng D. Chao
  - Institute of Applied Mechanics, National Taiwan University, Taipei 106, Taiwan
  - Center for Quantum Science and Engineering, National Taiwan University, Taipei 106, Taiwan
5
Tian Z, Zhang S, Chern GW. Machine learning for structure-property mapping of Ising models: Scalability and limitations. Phys Rev E 2023; 108:065304. [PMID: 38243546 DOI: 10.1103/physreve.108.065304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Accepted: 11/27/2023] [Indexed: 01/21/2024]
Abstract
We present a scalable machine learning (ML) framework for predicting intensive properties and particularly classifying phases of Ising models. Scalability and transferability are central to the unprecedented computational efficiency of ML methods. In general, linear-scaling computation can be achieved through the divide-and-conquer approach, and the locality of physical properties is key to partitioning the system into subdomains that can be solved separately. Based on the locality assumption, an ML model is developed for the prediction of intensive properties of a finite-size block. Predictions of large-scale systems can then be obtained by averaging results of the ML model from randomly sampled blocks of the system. We show that the applicability of this approach depends on whether the block size of the ML model is greater than the characteristic length scale of the system. In particular, in the case of phase identification across a critical point, the accuracy of the ML prediction is limited by the diverging correlation length. We obtain an intriguing scaling relation between the prediction accuracy and the ratio of the ML block size to the spin-spin correlation length. Implications for practical applications are also discussed. While the two-dimensional Ising model is used to demonstrate the proposed approach, the ML framework can be generalized to other many-body or condensed-matter systems.
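The block-averaging idea described in this abstract can be sketched in a few lines: draw random finite-size blocks from a large spin configuration, apply a block-level predictor to each, and average the results. In the sketch below, `block_model` is a hypothetical stand-in (absolute magnetization per spin) for the trained ML model, which the abstract does not specify; the function names, the periodic boundary handling, and the parameter values are likewise assumptions.

```python
import numpy as np


def sample_blocks(spins, block_size, n_blocks, rng):
    """Draw n_blocks square blocks at random positions, with periodic wrap."""
    size = spins.shape[0]
    blocks = []
    for _ in range(n_blocks):
        i, j = rng.integers(0, size, size=2)       # random top-left corner
        rows = (i + np.arange(block_size)) % size  # wrap around the lattice edge
        cols = (j + np.arange(block_size)) % size
        blocks.append(spins[np.ix_(rows, cols)])
    return np.stack(blocks)


def block_model(block):
    """Hypothetical stand-in for a trained block-level ML model."""
    return abs(block.mean())  # absolute magnetization per spin


def predict_intensive(spins, block_size, n_blocks, seed=0):
    """Estimate an intensive property by averaging block-level predictions."""
    rng = np.random.default_rng(seed)
    blocks = sample_blocks(spins, block_size, n_blocks, rng)
    return float(np.mean([block_model(b) for b in blocks]))


# A fully ordered 32 x 32 lattice: every block reports magnetization 1.
ordered = np.ones((32, 32))
print(predict_intensive(ordered, block_size=8, n_blocks=50))  # 1.0
```

The cost of a prediction depends on the number and size of the sampled blocks rather than on the full lattice size. The abstract's caveat also shows up here: a block smaller than the correlation length cannot see the structure it is supposed to classify.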
Affiliation(s)
- Zhongzheng Tian
  - Department of Physics, University of Virginia, Charlottesville, Virginia 22904, USA
- Sheng Zhang
  - Department of Physics, University of Virginia, Charlottesville, Virginia 22904, USA
- Gia-Wei Chern
  - Department of Physics, University of Virginia, Charlottesville, Virginia 22904, USA
6
Hwang IH, Kelly SD, Chan MKY, Stavitski E, Heald SM, Han SW, Schwarz N, Sun CJ. The AXEAP2 program for Kβ X-ray emission spectra analysis using artificial intelligence. Journal of Synchrotron Radiation 2023; 30:923-933. [PMID: 37526993 PMCID: PMC10481262 DOI: 10.1107/s1600577523005684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2023] [Accepted: 06/26/2023] [Indexed: 08/03/2023]
Abstract
The processing and analysis of synchrotron data can be a complex task, requiring specialized expertise and knowledge. Our previous work addressed the challenge of X-ray emission spectrum (XES) data processing by developing a standalone application using unsupervised machine learning. However, the task of analyzing the processed spectra remains another challenge. Although the non-resonant Kβ XES of 3d transition metals are known to provide electronic structure information such as oxidation and spin state, finding appropriate parameters to match experimental data is a time-consuming and labor-intensive process. Here, a new XES data analysis method based on the genetic algorithm is demonstrated, applying it to Mn, Co and Ni oxides. This approach is also implemented as a standalone application, Argonne X-ray Emission Analysis 2 (AXEAP2), which finds a set of parameters that result in a high-quality fit of the experimental spectrum with minimal intervention. AXEAP2 is able to find a set of parameters that reproduce the experimental spectrum, and provide insights into the 3d electron spin state, 3d-3p electron exchange force and Kβ emission core-hole lifetime.
Affiliation(s)
- In-Hui Hwang
  - X-ray Science Division, Argonne National Laboratory, Lemont, IL 60439, USA
- Shelly D. Kelly
  - X-ray Science Division, Argonne National Laboratory, Lemont, IL 60439, USA
- Maria K. Y. Chan
  - Center for Nanoscale Materials, Argonne National Laboratory, Lemont, IL 60439, USA
- Eli Stavitski
  - National Synchrotron Light Source II, Brookhaven National Laboratory, NY 11973, USA
- Steve M. Heald
  - X-ray Science Division, Argonne National Laboratory, Lemont, IL 60439, USA
- Sang-Wook Han
  - Department of Physics Education and Institute of Fusion Science, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Nicholas Schwarz
  - X-ray Science Division, Argonne National Laboratory, Lemont, IL 60439, USA
- Cheng-Jun Sun
  - X-ray Science Division, Argonne National Laboratory, Lemont, IL 60439, USA
7
Hagg A, Kirschner KN. Open-Source Machine Learning in Computational Chemistry. J Chem Inf Model 2023; 63:4505-4532. [PMID: 37466636 PMCID: PMC10430767 DOI: 10.1021/acs.jcim.3c00643] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/04/2023] [Indexed: 07/20/2023]
Abstract
The field of computational chemistry has seen a significant increase in the integration of machine learning concepts and algorithms. In this Perspective, we surveyed 179 open-source software projects, with corresponding peer-reviewed papers published within the last 5 years, to better understand the topics within the field being investigated by machine learning approaches. For each project, we provide a short description, the link to the code, the accompanying license type, and whether the training data and resulting models are made publicly available. Based on those deposited in GitHub repositories, the most popular employed Python libraries are identified. We hope that this survey will serve as a resource to learn about machine learning or specific architectures thereof by identifying accessible codes with accompanying papers on a topic basis. To this end, we also include computational chemistry open-source software for generating training data and fundamental Python libraries for machine learning. Based on our observations and considering the three pillars of collaborative machine learning work, open data, open source (code), and open models, we provide some suggestions to the community.
Affiliation(s)
- Alexander Hagg
  - Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
  - Department of Electrical Engineering, Mechanical Engineering and Technical Journalism, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
- Karl N. Kirschner
  - Institute of Technology, Resource and Energy-Efficient Engineering (TREE), University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
  - Department of Computer Science, University of Applied Sciences Bonn-Rhein-Sieg, 53757 Sankt Augustin, Germany
8
Zhang P, Yang W. Toward a general neural network force field for protein simulations: Refining the intramolecular interaction in protein. J Chem Phys 2023; 159:024118. [PMID: 37431910 PMCID: PMC10481389 DOI: 10.1063/5.0142280] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 01/12/2023] [Accepted: 06/22/2023] [Indexed: 07/12/2023]
Abstract
Molecular dynamics (MD) is an extremely powerful, highly effective, and widely used approach to understanding the nature of chemical processes in atomic details for proteins. The accuracy of results from MD simulations is highly dependent on force fields. Currently, molecular mechanical (MM) force fields are mainly utilized in MD simulations because of their low computational cost. Quantum mechanical (QM) calculation has high accuracy, but it is exceedingly time consuming for protein simulations. Machine learning (ML) provides the capability for generating accurate potential at the QM level without increasing much computational effort for specific systems that can be studied at the QM level. However, the construction of general machine learned force fields, needed for broad applications and large and complex systems, is still challenging. Here, general and transferable neural network (NN) force fields based on CHARMM force fields, named CHARMM-NN, are constructed for proteins by training NN models on 27 fragments partitioned from the residue-based systematic molecular fragmentation (rSMF) method. The NN for each fragment is based on atom types and uses new input features that are similar to MM inputs, including bonds, angles, dihedrals, and non-bonded terms, which enhance the compatibility of CHARMM-NN to MM MD and enable the implementation of CHARMM-NN force fields in different MD programs. While the main part of the energy of the protein is based on rSMF and NN, the nonbonded interactions between the fragments and with water are taken from the CHARMM force field through mechanical embedding. The validations of the method for dipeptides on geometric data, relative potential energies, and structural reorganization energies demonstrate that the CHARMM-NN local minima on the potential energy surface are very accurate approximations to QM, showing the success of CHARMM-NN for bonded interactions. 
However, the MD simulations on peptides and proteins indicate that more accurate methods to represent protein-water interactions in fragments and non-bonded interactions between fragments should be considered in the future improvement of CHARMM-NN, which can increase the accuracy of approximation beyond the current mechanical embedding QM/MM level.
Affiliation(s)
- Pan Zhang
  - Department of Chemistry, Duke University, Durham, North Carolina 27708, USA
- Weitao Yang
  - Department of Chemistry, Duke University, Durham, North Carolina 27708, USA
9
Izvekov S, Rice BM. Hierarchical Machine Learning of Low-Resolution Coarse-Grained Free Energy Potentials. J Chem Theory Comput 2023. [PMID: 37256918 DOI: 10.1021/acs.jctc.3c00128] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 06/02/2023]
Abstract
A force-matching-based method for supervised machine learning (ML) of coarse-grained (CG) free energy (FE) potentials, known as multiscale coarse-graining via force-matching (MSCG/FM), is an efficient method to develop microscopically informed CG models that are thermodynamically and statistically equivalent to the reference microscopic models. For low-resolution models, when the coarse-graining is at supramolecular scales, objective-oriented clustering of nonbonded particles is required and the reduced description becomes a function of the clustering algorithm. In the present work, we explore the dependence of the ML of the CG Helmholtz FE potential on the clustering algorithm. We consider coarse-graining based on partitional (k-means, leading to Voronoi diagram) and hierarchical agglomerative (bottom-up) clustering algorithms common in unsupervised ML and develop theory connecting the MSCG/FM learned CG Helmholtz potential and the clustering statistics. By combining the agglomerative clustering and the MSCG/FM learning in a recursive manner, we propose an efficient ML methodology to develop the fine-to-low resolution hierarchies of the CG models. The methodology does not suffer from degrading accuracy or increased computational cost to construct larger hierarchies and as such does not impose an upper size limitation of the CG particles resulting from the extended hierarchies. The utility of the methodology is demonstrated by obtaining the bottom-up agglomerative hierarchy for liquid nitromethane from all-atom molecular dynamics (MD) simulations. For agglomerative hierarchies, we prove the existence of renormalization group transformations that indicate self-similarity and allow for learning the low-resolution MSCG/FM potentials at low computational cost by rescaling and renormalizing certain finer-resolution members of the hierarchy. The hierarchies of the CG models can be used to carry out simulations under constant-pressure conditions.
Affiliation(s)
- Sergei Izvekov
  - U.S. Army DEVCOM Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
- Betsy M Rice
  - U.S. Army DEVCOM Army Research Laboratory, Aberdeen Proving Ground, Maryland 21005, United States
10
Chen WK, Wang SR, Liu XY, Fang WH, Cui G. Nonadiabatic Derivative Couplings Calculated Using Information of Potential Energy Surfaces without Wavefunctions: Ab Initio and Machine Learning Implementations. Molecules 2023; 28:4222. [PMID: 37241962 DOI: 10.3390/molecules28104222] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 04/20/2023] [Revised: 05/16/2023] [Accepted: 05/18/2023] [Indexed: 05/28/2023]
Abstract
In this work, we implemented an approximate algorithm for calculating nonadiabatic coupling matrix elements (NACMEs) of a polyatomic system with ab initio methods and machine learning (ML) models. Utilizing this algorithm, one can calculate NACMEs using only the information of potential energy surfaces (PESs), i.e., energies, and gradients as well as Hessian matrix elements. We used a realistic system, namely CH2NH, to compare NACMEs calculated by this approximate PES-based algorithm and the accurate wavefunction-based algorithm. Our results show that this approximate PES-based algorithm can give very accurate results comparable to the wavefunction-based algorithm except at energetically degenerate points, i.e., conical intersections. We also tested a machine learning (ML)-trained model with this approximate PES-based algorithm, which also supplied similarly accurate NACMEs but more efficiently. The advantage of this PES-based algorithm is its significant potential to combine with electronic structure methods that do not implement wavefunction-based algorithms, low-scaling energy-based fragment methods, etc., and in particular efficient ML models, to compute NACMEs. The present work could encourage further research on nonadiabatic processes of large systems simulated by ab initio nonadiabatic dynamics simulation methods in which NACMEs are always required.
Affiliation(s)
- Wen-Kai Chen
  - Hebei Key Laboratory of Inorganic Nano-Materials, College of Chemistry and Materials Science, Hebei Normal University, Shijiazhuang 050024, China
  - Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China
- Sheng-Rui Wang
  - Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China
- Xiang-Yang Liu
  - College of Chemistry and Material Science, Sichuan Normal University, Chengdu 610068, China
- Wei-Hai Fang
  - Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China
  - Hefei National Laboratory, Hefei 230088, China
- Ganglong Cui
  - Key Laboratory of Theoretical and Computational Photochemistry, Ministry of Education, College of Chemistry, Beijing Normal University, Beijing 100875, China
  - Hefei National Laboratory, Hefei 230088, China
11
Miao Q, Yuan Q. Machine learning coarse-grained models of dissolutive wetting: a droplet on soluble surfaces. Phys Chem Chem Phys 2023; 25:7487-7495. [PMID: 36853270 DOI: 10.1039/d3cp00112a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/01/2023]
Abstract
Dissolutive wetting is not only a key problem in application fields such as energy, medicine, and micro-devices, but also a frontier issue of academic research. As an important tool for exploring the micro-mechanisms of dissolutive wetting, molecular dynamics simulations are limited by simulation scale and force field parameters. Thus, artificial intelligence is introduced into the multi-scale simulation framework to tackle such challenges. By combining density functional theory, molecular dynamics simulations and experiments, we obtain a coarse-grained model of the glucose-water dissolution pair. Furthermore, the structure of the solid molecules and the hydration shell near the solute particles are calculated by quantum mechanics/molecular mechanics to verify the accuracy of the model. Finally, the applicability of the coarse-grained model in dissolutive wetting is proven by experimental results. We believe our machine learning method not only lays a foundation for exploring the micro-mechanisms of dissolutive wetting, but also provides a general approach for obtaining the force field parameters of different systems.
Affiliation(s)
- Qing Miao
  - State Key Laboratory of Nonlinear Mechanics, Institute of Mechanics, Chinese Academy of Sciences, Beijing 100190, People's Republic of China
  - School of Engineering Science, University of Chinese Academy of Sciences, Beijing 100049, People's Republic of China
  - Hypervelocity Aerodynamics Institute of CARDC, Mianyang 621000, People's Republic of China
- Quanzi Yuan
  - State Key Laboratory of Nonlinear Mechanics, Institute of Mechanics, Chinese Academy of Sciences, Beijing 100190, People's Republic of China
  - School of Engineering Science, University of Chinese Academy of Sciences, Beijing 100049, People's Republic of China
12
Kříž K, Schmidt L, Andersson AT, Walz MM, van der Spoel D. An Imbalance in the Force: The Need for Standardized Benchmarks for Molecular Simulation. J Chem Inf Model 2023; 63:412-431. [PMID: 36630710 PMCID: PMC9875315 DOI: 10.1021/acs.jcim.2c01127] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 09/06/2022] [Indexed: 01/12/2023]
Abstract
Force fields (FFs) for molecular simulation have been under development for more than half a century. As with any predictive model, rigorous testing and comparison of models critically depend on the availability of standardized data sets and benchmarks. While such benchmarks are rather common in the field of quantum chemistry, this is not the case for empirical FFs. That is, few benchmarks are reused to evaluate FFs, and development teams rather use their own training and test sets. Here we present an overview of currently available tests and benchmarks for computational chemistry, focusing on organic compounds, including halogens and common ions, as FFs for these are the most common ones. We argue that many of the benchmark data sets from quantum chemistry can in fact be reused for evaluating FFs, but new gas phase data is still needed for compounds containing phosphorus and sulfur in different valence states. In addition, more nonequilibrium interaction energies and forces, as well as molecular properties such as electrostatic potentials around compounds, would be beneficial. For the condensed phases there is a large body of experimental data available, and tools to utilize these data in an automated fashion are under development. If FF developers, as well as researchers in artificial intelligence, would adopt a number of these data sets, it would become easier to compare the relative strengths and weaknesses of different models and to, eventually, restore the balance in the force.
Affiliation(s)
- Kristian Kříž
  - Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
- Lisa Schmidt
  - Faculty of Biosciences, University of Heidelberg, Heidelberg 69117, Germany
- Alfred T. Andersson
  - Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
- Marie-Madeleine Walz
  - Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
- David van der Spoel
  - Department of Cell and Molecular Biology, Uppsala University, Box 596, SE-75124 Uppsala, Sweden
13
Perez I. Ab initio methods for the computation of physical properties and performance parameters of electrochemical energy storage devices. Phys Chem Chem Phys 2023; 25:1476-1503. [PMID: 36602004 DOI: 10.1039/d2cp03611h] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 12/24/2022]
Abstract
With the rapid development of electric vehicles and mobile technologies, there is a high demand for electrochemical energy storage devices and electrochemical energy conversion devices. Devices meeting these needs include metal-ion batteries (MIBs), supercapacitors (SCs), electrochromic devices (ECDs), and multifunctional devices such as electrochromic batteries and supercapatteries. Currently, the goal has been the enhancement of operational parameters and physical properties that result in a higher performance of these devices. In the case of batteries, SCs, and supercapatteries, scientists seek to improve the equilibrium voltage, energy density, power, capacitance, and charge rate. In the case of ECDs, the focus is on improvement of the optical modulation and coloration efficiency. However, synthesis and characterization of new materials, or of materials with optimized properties, is time consuming and highly expensive. Computational simulation of materials can expedite the experimental endeavor by modelling novel atomic structures and predicting device performance. This is possible using ab initio theories and applying physical principles that allow us to understand the underlying mechanisms governing the behavior of materials in these devices. Taking as a point of departure density functional theory (DFT), in this review, we discuss the first principles methods used for the computation of physical properties and performance parameters of electrochemical energy storage devices. A wide coverage of DFT is given, dealing with the strengths and weaknesses of the most popular functionals used in the field of electrochemical energy storage. With these tools, ab initio methods for the computation of basic properties such as effective mass, mobility, optical band gap, transmissivity, conductivity (ionic and electronic), and criteria for structure stability (cohesive energy, formation energy, adsorption energy, and phonon frequency) are addressed. We also highlight the first principles techniques for the calculation of performance parameters in MIBs, SCs, and ECDs.
Affiliation(s)
- Israel Perez
  - National Council of Science and Technology (CONACYT)-Department of Physics and Mathematics, Institute of Engineering and Technology, Universidad Autonoma de Ciudad Juarez, Av. del Charro 450 Col. Romero Partido, C.P. 32310, Juarez, Chihuahua, Mexico.
|
14
|
Thürlemann M, Böselt L, Riniker S. Regularized by Physics: Graph Neural Network Parametrized Potentials for the Description of Intermolecular Interactions. J Chem Theory Comput 2023; 19:562-579. [PMID: 36633918 PMCID: PMC9878731 DOI: 10.1021/acs.jctc.2c00661] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 06/24/2022] [Indexed: 01/13/2023]
Abstract
Simulations of molecular systems using electronic structure methods are still not feasible for many systems of biological importance. As a result, empirical methods such as force fields (FF) have become an established tool for the simulation of large and complex molecular systems. The parametrization of FF is, however, time-consuming and has traditionally been based on experimental data. Recent years have therefore seen increasing efforts to automatize FF parametrization or to replace FF with machine-learning (ML) based potentials. Here, we propose an alternative strategy to parametrize FF, which makes use of ML and gradient-descent based optimization while retaining a functional form founded in physics. Using a predefined functional form is shown to enable interpretability, robustness, and efficient simulations of large systems over long time scales. To demonstrate the strength of the proposed method, a fixed-charge and a polarizable model are trained on ab initio potential-energy surfaces. Given only information about the constituting elements, the molecular topology, and reference potential energies, the models successfully learn to assign atom types and corresponding FF parameters from scratch. The resulting models and parameters are validated on a wide range of experimentally and computationally derived properties of systems including dimers, pure liquids, and molecular crystals.
Affiliation(s)
- Moritz Thürlemann
  - Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Lennard Böselt
  - Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
- Sereina Riniker
  - Laboratory of Physical Chemistry, ETH Zürich, Vladimir-Prelog-Weg 2, 8093 Zürich, Switzerland
|
15
|
Lee S, Ermanis K, Goodman JM. MolE8: finding DFT potential energy surface minima values from force-field optimised organic molecules with new machine learning representations. Chem Sci 2022; 13:7204-7214. [PMID: 35799803 PMCID: PMC9214916 DOI: 10.1039/d1sc06324c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2021] [Accepted: 05/23/2022] [Indexed: 11/21/2022] Open
Abstract
The use of machine learning techniques in computational chemistry has gained significant momentum since large molecular databases are now readily available. Predictions of molecular properties using machine learning have advantages over traditional quantum mechanics calculations because they can be computationally cheaper without losing accuracy. We present a new extrapolatable and explainable molecular representation based on bonds, angles and dihedrals that can be used to train machine learning models. The trained models can accurately predict the electronic energy and the free energy of small organic molecules with atom types C, H, N and O, with a mean absolute error of 1.2 kcal mol⁻¹. The models can be extrapolated to larger organic molecules with an average error of less than 3.7 kcal mol⁻¹ for 10 or fewer heavy atoms, which represent a chemical space two orders of magnitude larger. Rapid energy predictions for multiple molecules, up to 7 times faster than previous ML models of similar accuracy, have been achieved by sampling geometries around the potential energy surface minima. Therefore, the input geometries do not have to be located precisely on the minima, and we show that accurate density functional theory energy predictions can be made from force-field optimised geometries with a mean absolute error of 2.5 kcal mol⁻¹.
Affiliation(s)
- Sanha Lee
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
- Jonathan M Goodman
  - Yusuf Hamied Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
|
16
|
Better force fields start with better data: A data set of cation dipeptide interactions. Sci Data 2022; 9:327. [PMID: 35715420 PMCID: PMC9205945 DOI: 10.1038/s41597-022-01297-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 08/15/2021] [Accepted: 03/18/2022] [Indexed: 11/08/2022] Open
Abstract
We present a data set from a first-principles study of amino-methylated and acetylated (capped) dipeptides of the 20 proteinogenic amino acids, including alternative possible side chain protonation states, and their interactions with selected divalent cations (Ca²⁺, Mg²⁺ and Ba²⁺). The data covers 21,909 stationary points on the respective potential-energy surfaces in a wide relative energy range of up to 4 eV (390 kJ/mol). Relevant properties of interest, like partial charges, were derived for the conformers. The motivation was to provide a solid data basis for force field parameterization and further applications like machine learning or benchmarking. In particular, the process of creating all this data on the same first-principles footing, i.e. density-functional theory calculations employing the generalized gradient approximation with a van der Waals correction, makes this data suitable for first-principles data-driven force field development. To make the data accessible across domain borders and to machines, we formalized the metadata in an ontology.
|
17
|
Quinn TR, Patel HN, Koh KH, Haines BE, Norrby PO, Helquist P, Wiest O. Automated fitting of transition state force fields for biomolecular simulations. PLoS One 2022; 17:e0264960. [PMID: 35271647 PMCID: PMC8912266 DOI: 10.1371/journal.pone.0264960] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/23/2021] [Accepted: 02/22/2022] [Indexed: 12/29/2022] Open
Abstract
The generation of surrogate potential energy functions (PEF) that are orders of magnitude faster to compute but as accurate as the underlying training data from high-level electronic structure methods is one of the most promising applications of fitting procedures in chemistry. In previous work, we have shown that transition state force fields (TSFFs), fitted to the functional form of MM3* force fields using the quantum guided molecular mechanics (Q2MM) method, provide an accurate description of transition states that can be used for stereoselectivity predictions of small molecule reactions. Here, we demonstrate the applicability of the method to fit TSFFs to the well-established Amber force field, which could be used for molecular dynamics studies of enzyme reactions. As a case study, the fitting of a TSFF to the second hydride transfer in Pseudomonas mevalonii 3-hydroxy-3-methylglutaryl coenzyme A reductase (PmHMGR) is used. The differences and similarities to fitting of small molecule TSFFs are discussed.
Affiliation(s)
- Taylor R. Quinn
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States of America
  - Early TDE Discovery, Early Oncology, Oncology R&D, AstraZeneca, Boston, Massachusetts, United States of America
- Himani N. Patel
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States of America
- Kevin H. Koh
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States of America
- Brandon E. Haines
  - Department of Chemistry, Westmont College, Santa Barbara, California, United States of America
- Per-Ola Norrby
  - Data Science and Modelling, Pharmaceutical Sciences, R&D, AstraZeneca Gothenburg, Mölndal, Sweden
- Paul Helquist
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States of America
- Olaf Wiest
  - Department of Chemistry and Biochemistry, University of Notre Dame, Notre Dame, Indiana, United States of America
  - Lab of Computational Chemistry and Drug Design, School of Chemical Biology and Biotechnology, Peking University, Shenzhen Graduate School, Shenzhen, China
|
18
|
Li Z, Meidani K, Yadav P, Barati Farimani A. Graph neural networks accelerated molecular dynamics. J Chem Phys 2022; 156:144103. [DOI: 10.1063/5.0083060] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/26/2023] Open
Abstract
Molecular Dynamics (MD) simulation is a powerful tool for understanding the dynamics and structure of matter. Since the resolution of MD is atomic-scale, achieving long timescale simulations with femtosecond integration is very expensive. In each MD step, numerous iterative computations are performed to calculate energy based on different types of interaction and their corresponding spatial gradients. These repetitive computations can be learned and surrogated by a deep learning model, such as a Graph Neural Network (GNN). In this work, we developed a GNN Accelerated MD (GAMD) model that directly predicts forces, given the state of the system (atom positions, atom types), bypassing the evaluation of potential energy. By training the GNN on a variety of data sources (simulation data derived from classical MD and density functional theory), we show that GAMD can predict the dynamics of two typical molecular systems, Lennard-Jones system and water system, in the NVT ensemble with velocities regulated by a thermostat. We further show that GAMD’s learning and inference are agnostic to the scale, where it can scale to much larger systems at test time. We also perform a comprehensive benchmark test comparing our implementation of GAMD to production-level MD software, showing GAMD’s competitive performance on the large-scale simulation.
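For orientation, the per-step work that a force-predicting surrogate stands in for can be sketched with the standard 12-6 Lennard-Jones pair potential. This is a minimal illustration and not the GAMD implementation: the function names, the reduced units (epsilon = sigma = 1), and the absence of a cutoff or periodic box are all assumptions made here.

```python
import math

def lj_energy(r, epsilon=1.0, sigma=1.0):
    """12-6 Lennard-Jones pair energy U(r) = 4*eps*((sigma/r)**12 - (sigma/r)**6)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)

def lj_force_magnitude(r, epsilon=1.0, sigma=1.0):
    """Radial force F(r) = -dU/dr = 24*eps/r * (2*(sigma/r)**12 - (sigma/r)**6).
    Positive values are repulsive, negative values attractive."""
    sr6 = (sigma / r) ** 6
    return 24.0 * epsilon / r * (2.0 * sr6 * sr6 - sr6)

def pairwise_forces(positions, epsilon=1.0, sigma=1.0):
    """Sum pair forces over a small set of 3D points (hypothetical helper:
    no cutoff, no periodic boundary conditions)."""
    n = len(positions)
    forces = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = [positions[i][k] - positions[j][k] for k in range(3)]
            r = math.sqrt(sum(c * c for c in d))
            f = lj_force_magnitude(r, epsilon, sigma)
            for k in range(3):
                # A repulsive force (f > 0) pushes atom i along (r_i - r_j).
                forces[i][k] += f * d[k] / r
                forces[j][k] -= f * d[k] / r  # Newton's third law
    return forces
```

The double loop over pairs is the part that scales poorly with system size; a learned model that maps atom positions and types straight to forces replaces this evaluation, which is the idea the abstract describes.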
Affiliation(s)
- Zijie Li
  - Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- Kazem Meidani
  - Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- Prakarsh Yadav
  - Department of Mechanical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
- Amir Barati Farimani
  - Machine Learning Department, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
  - Department of Chemical Engineering, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213, USA
|
19
|
Madin OC, Boothroyd S, Messerly RA, Fass J, Chodera JD, Shirts MR. Bayesian-Inference-Driven Model Parametrization and Model Selection for 2CLJQ Fluid Models. J Chem Inf Model 2022; 62:874-889. [PMID: 35129974 PMCID: PMC9217127 DOI: 10.1021/acs.jcim.1c00829] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/28/2022]
Abstract
A high level of physical detail in a molecular model improves its ability to perform high accuracy simulations but can also significantly affect its complexity and computational cost. In some situations, it is worthwhile to add complexity to a model to capture properties of interest; in others, additional complexity is unnecessary and can make simulations computationally infeasible. In this work, we demonstrate the use of Bayesian inference for molecular model selection, using Monte Carlo sampling techniques accelerated with surrogate modeling to evaluate the Bayes factor evidence for different levels of complexity in the two-centered Lennard-Jones + quadrupole (2CLJQ) fluid model. Examining three nested levels of model complexity, we demonstrate that the use of variable quadrupole and bond length parameters in this model framework is justified only for some chemistries. Through this process, we also get detailed information about the distributions and correlation of parameter values, enabling improved parametrization and parameter analysis. We also show how the choice of parameter priors, which encode previous model knowledge, can have substantial effects on the selection of models, penalizing careless introduction of additional complexity. We detail the computational techniques used in this analysis, providing a roadmap for future applications of molecular model selection via Bayesian inference and surrogate modeling.
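How a Bayes factor penalizes an unneeded parameter can be seen in a toy case that has nothing to do with the 2CLJQ model itself. The sketch below is an assumption-laden illustration: it compares a coin model with a fixed success probability against one with a free probability under a Uniform(0, 1) prior, where both evidences have closed forms, so no Monte Carlo sampling or surrogate model is needed. All function names are hypothetical.

```python
from math import factorial

def evidence_fixed(n, k, p=0.5):
    """Marginal likelihood of one specific sequence with k successes in n trials
    under the simple model: success probability fixed at p (no free parameter)."""
    return p ** k * (1.0 - p) ** (n - k)

def evidence_uniform_prior(n, k):
    """Marginal likelihood of the same sequence under the more flexible model:
    p is free with a Uniform(0, 1) prior. The integral of p**k * (1-p)**(n-k)
    over [0, 1] is the Beta function B(k+1, n-k+1) = k! (n-k)! / (n+1)!."""
    return factorial(k) * factorial(n - k) / factorial(n + 1)

def bayes_factor(n, k):
    """Evidence ratio: flexible model over simple model."""
    return evidence_uniform_prior(n, k) / evidence_fixed(n, k)
```

With 5 successes in 10 trials the ratio is below 1 (about 0.37), so the extra parameter is not supported by the data; with 10 successes in 10 trials it is about 93, and the added flexibility is justified. Changing the prior on the free parameter changes these numbers, which is the sensitivity to priors that the abstract points out.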
Affiliation(s)
- Owen C. Madin
  - Department of Chemical & Biological Engineering, University of Colorado Boulder, Boulder, CO 80309
- Simon Boothroyd
  - Boothroyd Scientific Consulting Ltd., 71-75 Shelton Street, London, Greater London, United Kingdom, WC2H 9JQ
- Josh Fass
  - Computational & Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065
- John D. Chodera
  - Department of Chemical & Biological Engineering, University of Colorado Boulder, Boulder, CO 80309
- Michael R. Shirts
  - Department of Chemical & Biological Engineering, University of Colorado Boulder, Boulder, CO 80309
|
20
|
Krajňák V, Naik S, Wiggins S. Predicting trajectory behaviour via machine-learned invariant manifolds. Chem Phys Lett 2022. [DOI: 10.1016/j.cplett.2021.139290] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2022]
|
21
|
Piccini G, Lee MS, Yuk SF, Zhang D, Collinge G, Kollias L, Nguyen MT, Glezakou VA, Rousseau R. Ab initio molecular dynamics with enhanced sampling in heterogeneous catalysis. Catal Sci Technol 2022. [DOI: 10.1039/d1cy01329g] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Indexed: 12/12/2022]
Abstract
Enhanced sampling ab initio simulations enable the study of chemical phenomena in catalytic systems, including thermal effects and anharmonicity, and collective dynamics describing enthalpic and entropic contributions, which can significantly impact reaction free energy landscapes.
Affiliation(s)
- GiovanniMaria Piccini
  - Basic & Applied Molecular Foundations, Physical and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA 99352, USA
  - Istituto Eulero, Università della Svizzera italiana, Via Giuseppe Buffi 13, Lugano, Ticino, Switzerland
- Mal-Soon Lee
  - Basic & Applied Molecular Foundations, Physical and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Simuck F. Yuk
  - Basic & Applied Molecular Foundations, Physical and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA 99352, USA
  - Department of Chemistry and Life Science, United States Military Academy, West Point, NY 10996, USA
- Difan Zhang
  - Basic & Applied Molecular Foundations, Physical and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Greg Collinge
  - Basic & Applied Molecular Foundations, Physical and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Loukas Kollias
  - Basic & Applied Molecular Foundations, Physical and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Manh-Thuong Nguyen
  - Basic & Applied Molecular Foundations, Physical and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Vassiliki-Alexandra Glezakou
  - Basic & Applied Molecular Foundations, Physical and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- Roger Rousseau
  - Basic & Applied Molecular Foundations, Physical and Computational Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA 99352, USA
|
22
|
Abbasi A, Amjad-Iranagh S, Dabir B. CellSys: An open-source tool for building initial structures for bio-membranes and drug-delivery systems. J Comput Chem 2021; 43:331-339. [PMID: 34897717 DOI: 10.1002/jcc.26793] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/02/2021] [Revised: 11/17/2021] [Accepted: 11/24/2021] [Indexed: 11/11/2022]
Abstract
Because phospholipids are the most important structural components of biomembranes, they deserve close attention in both experimental and computational studies, particularly in molecular simulations for drug design and drug delivery, which require knowledge of how drug molecules interact with cell membranes. An essential requirement for such simulations is information about the initial structure of the phospholipids and how they interact with the drugs. In this article, we therefore introduce an open-source software package, written in Python, that uses data manipulation to generate and develop the initial structures of biomolecular cells and so provide the information needed to investigate drug delivery systems. In addition, the package can be used for efficient storage of membrane structural data to be exploited in designing new drug delivery systems. To verify the performance of the code and the results of the simulations, several analyses were carried out, including calculation of the area per lipid, the self-diffusion coefficient, and the lipid order parameter. The results were in complete agreement with the references.
Affiliation(s)
- Ali Abbasi
  - Department of Chemical Engineering, Amirkabir University of Technology, Tehran, Iran
- Sepideh Amjad-Iranagh
  - Department of Materials and Metallurgical Engineering, Amirkabir University of Technology, Tehran, Iran
- Bahram Dabir
  - Department of Chemical Engineering, Amirkabir University of Technology, Tehran, Iran
|
23
|
Caceres-Delpiano J, Wang LP, Essex JW. The automated optimisation of a coarse-grained force field using free energy data. Phys Chem Chem Phys 2021; 23:24842-24851. [PMID: 34723311 PMCID: PMC8579472 DOI: 10.1039/d0cp05041e] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/23/2020] [Accepted: 10/18/2021] [Indexed: 11/21/2022]
Abstract
Atomistic models provide a detailed representation of molecular systems, but are sometimes inadequate for simulations of large systems over long timescales. Coarse-grained models enable accelerated simulations by reducing the number of degrees of freedom, at the cost of reduced accuracy. New optimisation processes to parameterise these models could improve their quality and range of applicability. We present an automated approach for the optimisation of coarse-grained force fields, by reproducing free energy data derived from atomistic molecular simulations. To illustrate the approach, we implemented hydration free energy gradients as a new target for force field optimisation in ForceBalance and applied it successfully to optimise the un-charged side-chains and the protein backbone in the SIRAH protein coarse-grain force field. The optimised parameters closely reproduced hydration free energies of atomistic models and gave improved agreement with experiment.
Affiliation(s)
- Lee-Ping Wang
  - Department of Chemistry, University of California, Davis, California 95616, USA.
- Jonathan W Essex
  - School of Chemistry, University of Southampton, Southampton, SO17 1BJ, UK.
|
24
|
Improvement of the Force Field for β-D-Glucose with Machine Learning. Molecules 2021; 26:6691. [PMID: 34771103 PMCID: PMC8588059 DOI: 10.3390/molecules26216691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/06/2021] [Revised: 10/29/2021] [Accepted: 10/30/2021] [Indexed: 11/17/2022] Open
Abstract
While the construction of a dependable force field for classical molecular dynamics (MD) simulation is crucial for elucidating the structure and function of biomolecular systems, attempts to do this for glycans are relatively sparse compared to those for proteins and nucleic acids. Currently, the GLYCAM06 force field is the most popular, but there have been a number of concerns about its accuracy in the systematic description of structural changes. In the present work, we focus on the improvement of the GLYCAM06 force field for β-D-glucose, a simple and the most abundant monosaccharide molecule, with the aid of machine learning techniques implemented with the TensorFlow library. Following pre-sampling over a wide range of configuration space generated by MD simulation, the atomic charge and dihedral angle parameters in the GLYCAM06 force field were re-optimized to accurately reproduce the relative energies of β-D-glucose obtained from density functional theory (DFT) calculations as the structure changes. The newly proposed force-field parameters were then validated by verifying that the relative energy errors with respect to the DFT values were significantly reduced and that some inconsistencies with experimental (e.g., NMR) results observed in the GLYCAM06 force field were resolved.
|
25
|
Diéguez-Santana K, González-Díaz H. Towards machine learning discovery of dual antibacterial drug-nanoparticle systems. Nanoscale 2021; 13:17854-17870. [PMID: 34671801 DOI: 10.1039/d1nr04178a] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 06/13/2023]
Abstract
Artificial Intelligence/Machine Learning (AI/ML) algorithms may speed up the design of DADNP systems formed by Antibacterial Drugs (AD) and Nanoparticles (NP). In this work, we used the IFPTML algorithm, which combines Information Fusion (IF), Perturbation Theory (PT), and Machine Learning (ML), for the first time to study a large dataset of putative DADNP systems composed of >165 000 ChEMBL AD assays and 300 NP assays vs. multiple bacteria species. We trained alternative models with Linear Discriminant Analysis (LDA), Artificial Neural Networks (ANN), Bayesian Networks (BNN), K-Nearest Neighbour (KNN) and other algorithms. The IFPTML-LDA model was simpler, with values of Sp ≈ 90% and Sn ≈ 74% in both training (>124 K cases) and validation (>41 K cases) series. The IFPTML-ANN and KNN models are notably more complicated, although they are more balanced, with Sn ≈ Sp ≈ 88.5%-99.0% and AUROC ≈ 0.94-0.99 in both series. We also carried out a simulation (>1900 calculations) of the expected behavior of putative DADNPs in 72 different biological assays. The putative DADNPs studied are formed by 27 different drugs with multiple classes of NP and types of coats. In addition, we tested the validity of our additive model with 80 DADNP complexes experimentally synthesized and biologically tested (reported in >45 papers). All these DADNPs show values of MIC < 50 μg mL⁻¹ (cutoff used), better than the MIC of AD and NP alone (synergistic or additive effect). The assays involve DADNP complexes with 10 types of NP, 6 coating materials, NP size range 5-100 nm vs. 15 different antibiotics, and 12 bacteria species. The IFPTML-LDA model correctly classified 100% (80 out of 80) of the DADNP complexes as biologically active. The IFPTML additive strategy may become a useful tool to assist the design of DADNP systems for antibacterial therapy, taking into consideration only information about the AD and NP components separately.
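The Sn and Sp figures quoted in the abstract are the usual sensitivity and specificity of a binary classifier. A minimal sketch of how they are computed from predicted and observed classes follows; the label coding (1 = active, 0 = inactive) and the function name are assumptions made for illustration, and the numbers in the usage note are made up rather than taken from the paper.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (Sn, Sp) for binary labels, assuming 1 = active and 0 = inactive.

    Sn = TP / (TP + FN): fraction of truly active cases predicted active.
    Sp = TN / (TN + FP): fraction of truly inactive cases predicted inactive.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    # Guard against a class that is absent from y_true.
    sn = tp / (tp + fn) if (tp + fn) else float("nan")
    sp = tn / (tn + fp) if (tn + fp) else float("nan")
    return sn, sp
```

For example, with observed labels `[1, 1, 1, 1, 0, 0, 0, 0, 0, 0]` and predictions `[1, 1, 1, 0, 0, 0, 0, 0, 0, 1]`, three of four actives and five of six inactives are recovered, giving Sn = 0.75 and Sp of about 0.83. A model can score high on one and low on the other, which is why the abstract reports both.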
Affiliation(s)
- Karel Diéguez-Santana
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - Basque Center for Biophysics CSIC-UPV/EHU, University of Basque Country UPV/EHU, 48940 Leioa, Spain.
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
|
26
|
Zheng J, Sun X, Hu J, Wang S, Yao Z, Deng S, Pan X, Pan Z, Wang J. Symbolic Transformer Accelerating Machine Learning Screening of Hydrogen and Deuterium Evolution Reaction Catalysts in MA2Z4 Materials. ACS Appl Mater Interfaces 2021; 13:50878-50891. [PMID: 34672634 DOI: 10.1021/acsami.1c13236] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 06/13/2023]
Abstract
Two-dimensional (2D) materials have been developed into various catalysts with high performance, but employing them for developing highly stable and active nonprecious hydrogen evolution reaction (HER) catalysts still encounters many challenges. To this end, the machine learning (ML) screening of HER catalysts is accelerated by using genetic programming (GP) of symbolic transformers for various typical 2D MA2Z4 materials. The values of the Gibbs free energy of hydrogen adsorption (ΔGH*) are accurately and rapidly predicted via extreme gradient boosting regression by using only simple GP-processed elemental features, with a low predictive root-mean-square error of 0.14 eV. With the analysis of ML and density functional theory (DFT) methods, it is found that various electronic structural properties of metal atoms and the p-band center of surface atoms play a crucial role in regulating the HER performance. Based on these findings, NbSi2N4 and VSi2N4 are discovered to be active catalysts with thermodynamical and dynamical stability, as ΔGH* approaches zero (-0.041 and 0.024 eV). In addition, DFT calculations reveal that these catalysts also exhibit good deuterium evolution reaction (DER) performance. Overall, a multistep workflow is developed through ML models combined with DFT calculations for efficiently screening potential HER and DER catalysts from 2D materials with the same crystal prototype, which is believed to contribute significantly to catalyst design and fabrication.
Affiliation(s)
- Jingnan Zheng
  - Institute of Industrial Catalysis, State Key Laboratory Breeding Base of Green-Chemical Synthesis Technology, College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310032, P. R. China
- ShiBin Wang
  - Institute of Industrial Catalysis, State Key Laboratory Breeding Base of Green-Chemical Synthesis Technology, College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310032, P. R. China
- Zihao Yao
  - Institute of Industrial Catalysis, State Key Laboratory Breeding Base of Green-Chemical Synthesis Technology, College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310032, P. R. China
- Shengwei Deng
  - Institute of Industrial Catalysis, State Key Laboratory Breeding Base of Green-Chemical Synthesis Technology, College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310032, P. R. China
- Jianguo Wang
  - Institute of Industrial Catalysis, State Key Laboratory Breeding Base of Green-Chemical Synthesis Technology, College of Chemical Engineering, Zhejiang University of Technology, Hangzhou 310032, P. R. China
|
27
|
Niblett SP, Galib M, Limmer DT. Learning intermolecular forces at liquid-vapor interfaces. J Chem Phys 2021; 155:164101. [PMID: 34717371 DOI: 10.1063/5.0067565] [Citation(s) in RCA: 19] [Impact Index Per Article: 6.3] [Indexed: 12/17/2022] Open
Abstract
By adopting a perspective informed by contemporary liquid-state theory, we consider how to train an artificial neural network potential to describe inhomogeneous, disordered systems. We find that neural network potentials based on local representations of atomic environments are capable of describing some properties of liquid-vapor interfaces but typically fail for properties that depend on unbalanced long-ranged interactions that build up in the presence of broken translation symmetry. These same interactions cancel in the translationally invariant bulk, allowing local neural network potentials to describe bulk properties correctly. By incorporating explicit models of the slowly varying long-ranged interactions and training neural networks only on the short-ranged components, we can arrive at potentials that robustly recover interfacial properties. We find that local neural network models can sometimes approximate a local molecular field potential to correct for the truncated interactions, but this behavior is variable and hard to learn. Generally, we find that models with explicit electrostatics are easier to train and have higher accuracy. We demonstrate this perspective in a simple model of an asymmetric dipolar fluid, where the exact long-ranged interaction is known, and in an ab initio water model, where it is approximated.
Affiliation(s)
- Samuel P Niblett
- Department of Chemistry, University of California, Berkeley, California 94609, USA
- Mirza Galib
- Department of Chemistry, University of California, Berkeley, California 94609, USA
- David T Limmer
- Department of Chemistry, University of California, Berkeley, California 94609, USA
28
Hernandes VF, Marques MS, Bordin JR. Phase classification using neural networks: application to supercooled, polymorphic core-softened mixtures. JOURNAL OF PHYSICS. CONDENSED MATTER : AN INSTITUTE OF PHYSICS JOURNAL 2021; 34:024002. [PMID: 34638114 DOI: 10.1088/1361-648x/ac2f0f] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/13/2021] [Accepted: 10/12/2021] [Indexed: 06/13/2023]
Abstract
Characterization of phases of soft matter systems is a challenge faced in many physical chemical problems. For polymorphic fluids it is an even greater challenge. Specifically, glass-forming fluids such as water can have, besides solid polymorphism, more than one liquid and glassy phase, and even a liquid-liquid critical point. Here, we apply a neural network algorithm to analyze the phase behavior of a mixture of core-softened fluids that interact through the continuous-shouldered well (CSW) potential, which have liquid polymorphism and liquid-liquid critical points, similar to water. We also apply the neural network to mixtures of CSW fluids and core-softened alcohol models. We combine and expand methods based on bond-orientational order parameters to study mixtures, applied to mixtures of hardcore fluids and to supercooled water, to include longer range coordination shells. With this, the trained neural network was able to properly predict the crystalline solid phases, the fluid phases and the amorphous phase for the pure CSW and CSW-alcohol mixtures with high efficiency. Moreover, information about the phase populations, obtained from the network approach, can help verify whether the phase transition is continuous or discontinuous, and also help interpret how the metastable amorphous region spreads along the stable high density fluid phase. These findings help to understand the behavior of supercooled polymorphic fluids and extend the comprehension of how amphiphilic solutes affect the phase behavior.
Affiliation(s)
- V F Hernandes
- Programa de Pós-Graduação em Física, Departamento de Física, Instituto de Física e Matemática, Universidade Federal de Pelotas, Caixa Postal 354, 96001-970, Pelotas-RS, Brazil
- M S Marques
- Centro das Ciências Exatas e das Tecnologias, Universidade Federal do Oeste da Bahia Rua Bertioga, 892, Morada Nobre, CEP 47810-059, Barreiras-BA, Brazil
- José Rafael Bordin
- Departamento de Física, Instituto de Física e Matemática, Universidade Federal de Pelotas, Caixa Postal 354, 96001-970, Pelotas-RS, Brazil
29
Satsangi S, Mishra A, Singh AK. Feature Blending: An Approach toward Generalized Machine Learning Models for Property Prediction. ACS PHYSICAL CHEMISTRY AU 2021; 2:16-22. [PMID: 36855577 PMCID: PMC9718311 DOI: 10.1021/acsphyschemau.1c00017] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
From studying the atomic structure and chemical behavior to the discovery of new materials and investigating properties of existing materials, machine learning (ML) has been employed in realms that are arduous to probe experimentally. While numerous highly accurate models, specifically for property prediction, have been reported in the literature, there has been a lack of a generalized framework. Herein we propose a novel feature selection approach that enables the development of a unified ML model for property prediction for several classes of materials. It involves an ingenious blending of selected features from various classes of data such that the resultant feature set equips the model with global data descriptors capturing both class-specific as well as global traits. We took accurate band gaps of three distinct classes of 2D materials as our target property to develop the proposed feature blending approach. Using Gaussian process regression (GPR) with the blended features, the ML model developed here resulted in an average root-mean-squared error of 0.12 eV for unseen data belonging to any of the participating classes. The feature blending approach proposed here can be extended to additional classes of materials and also to predict other properties.
30
Barone V, Puzzarini C, Mancini G. Integration of theory, simulation, artificial intelligence and virtual reality: a four-pillar approach for reconciling accuracy and interpretability in computational spectroscopy. Phys Chem Chem Phys 2021; 23:17079-17096. [PMID: 34346437 DOI: 10.1039/d1cp02507d] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/28/2022]
Abstract
The established pillars of computational spectroscopy are theory and computer-based simulations. Recently, artificial intelligence and virtual reality have been emerging as the third and fourth pillars of an integrated strategy for the investigation of complex phenomena. The main goal of the present contribution is the description of some new perspectives for computational spectroscopy, in the framework of a strategy in which state-of-the-art computational methodologies, high-performance computing, artificial intelligence and virtual reality tools are integrated with the aim of improving research throughput and achieving goals otherwise not possible. Some of the key tools (e.g., continuous molecular perception model and virtual multifrequency spectrometer) and theoretical developments (e.g., non-periodic boundaries, joint variational-perturbative models) are briefly sketched and their application illustrated by means of representative case studies taken from recent work by the authors. Some of the results presented are already well beyond the state of the art in the field of computational spectroscopy, thereby also providing a proof of concept for other research fields.
Affiliation(s)
- Vincenzo Barone
- Scuola Normale Superiore, Piazza dei Cavalieri 7, I-56126 Pisa, Italy.
31
Ryazantsev MN, Strashkov DM, Nikolaev DM, Shtyrov AA, Panov MS. Photopharmacological compounds based on azobenzenes and azoheteroarenes: principles of molecular design, molecular modelling, and synthesis. RUSSIAN CHEMICAL REVIEWS 2021. [DOI: 10.1070/rcr5001] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/08/2022]
32
Schriber JB, Nascimento DR, Koutsoukas A, Spronk SA, Cheney DL, Sherrill CD. CLIFF: A component-based, machine-learned, intermolecular force field. J Chem Phys 2021; 154:184110. [PMID: 34241025 DOI: 10.1063/5.0042989] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Indexed: 12/16/2022]
Abstract
Computation of intermolecular interactions is a challenge in drug discovery because accurate ab initio techniques are too computationally expensive to be routinely applied to drug-protein models. Classical force fields are more computationally feasible, and force fields designed to match symmetry adapted perturbation theory (SAPT) interaction energies can remain accurate in this context. Unfortunately, the application of such force fields is complicated by the laborious parameterization required for computations on new molecules. Here, we introduce the component-based machine-learned intermolecular force field (CLIFF), which combines accurate, physics-based equations for intermolecular interaction energies with machine-learning models to enable automatic parameterization. The CLIFF uses functional forms corresponding to electrostatic, exchange-repulsion, induction/polarization, and London dispersion components in SAPT. Molecule-independent parameters are fit with respect to SAPT2+(3)δMP2/aug-cc-pVTZ, and molecule-dependent atomic parameters (atomic widths, atomic multipoles, and Hirshfeld ratios) are obtained from machine learning models developed for C, N, O, H, S, F, Cl, and Br. The CLIFF achieves mean absolute errors (MAEs) no worse than 0.70 kcal mol⁻¹ in both total and component energies across a diverse dimer test set. For the side chain-side chain interaction database derived from protein fragments, the CLIFF produces total interaction energies with an MAE of 0.27 kcal mol⁻¹ with respect to reference data, outperforming similar and even more expensive methods. In applications to a set of model drug-protein interactions, the CLIFF is able to accurately rank-order ligand binding strengths and achieves less than 10% error with respect to SAPT reference values for most complexes.
Affiliation(s)
- Jeffrey B Schriber
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30318, USA
- Daniel R Nascimento
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, Georgia 30318, USA
- Alexios Koutsoukas
- Molecular Structure and Design, Bristol Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
- Steven A Spronk
- Molecular Structure and Design, Bristol Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
- Daniel L Cheney
- Molecular Structure and Design, Bristol Myers Squibb Company, P.O. Box 5400, Princeton, New Jersey 08543, USA
- C David Sherrill
- Center for Computational Molecular Science and Technology, School of Chemistry and Biochemistry and School of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, Georgia 30318, USA
33
Han R, Rodríguez-Mayorga M, Luber S. A Machine Learning Approach for MP2 Correlation Energies and Its Application to Organic Compounds. J Chem Theory Comput 2021; 17:777-790. [DOI: 10.1021/acs.jctc.0c00898] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/28/2022]
Affiliation(s)
- Ruocheng Han
- Department of Chemistry A, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- Sandra Luber
- Department of Chemistry A, University of Zurich, Winterthurerstrasse 190, 8057 Zurich, Switzerland
34
Jesus WS, Prudente FV, Marques JMC, Pereira FB. Modeling microsolvation clusters with electronic-structure calculations guided by analytical potentials and predictive machine learning techniques. Phys Chem Chem Phys 2021; 23:1738-1749. [PMID: 33427847 DOI: 10.1039/d0cp05200k] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 01/16/2023]
Abstract
We propose a new methodology to study, at the density functional theory (DFT) level, the clusters resulting from the microsolvation of alkali-metal ions with rare-gas atoms. The workflow begins with a global optimization search to generate a pool of low-energy minimum structures for different cluster sizes. This is achieved by employing an analytical potential energy surface (PES) and an evolutionary algorithm (EA). The next main stage of the methodology is devoted to establishing an adequate DFT approach to treat the microsolvation system, through a systematic benchmark study involving several combinations of functionals and basis sets, in order to characterize the global minimum structures of the smaller clusters. In the next stage, we apply machine learning (ML) classification algorithms to predict how the low-energy minima of the analytical PES map to the DFT ones. An early and accurate detection of likely DFT local minima is extremely important to guide the choice of the most promising low-energy minima of large clusters to be re-optimized at the DFT level of theory. In this work, the methodology was applied to the Li⁺Krₙ (n = 2-14 and 16) microsolvation clusters, for which the most competitive DFT approach was found to be B3LYP-D3/aug-pcseg-1. Additionally, the ML classifier was able to accurately predict most of the solutions to be re-optimized at the DFT level of theory, thereby greatly enhancing the efficiency of the process and allowing its applicability to larger clusters.
Affiliation(s)
- W S Jesus
- Instituto de Física, Universidade Federal da Bahia, 40170-115 Salvador, BA, Brazil.
- F V Prudente
- Instituto de Física, Universidade Federal da Bahia, 40170-115 Salvador, BA, Brazil.
- J M C Marques
- CQC, Department of Chemistry, University of Coimbra, 3004-535 Coimbra, Portugal.
- F B Pereira
- Coimbra Polytechnic - ISEC, Coimbra, Portugal and Centro de Informática e Sistemas da Universidade de Coimbra (CISUC), Coimbra, Portugal.
35
Sabbih GO, Korsah MA, Jeevanandam J, Danquah MK. Biophysical analysis of SARS-CoV-2 transmission and theranostic development via N protein computational characterization. Biotechnol Prog 2020; 37:e3096. [PMID: 33118327 PMCID: PMC7645878 DOI: 10.1002/btpr.3096] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 08/16/2020] [Revised: 10/22/2020] [Accepted: 10/26/2020] [Indexed: 01/01/2023]
Abstract
Recently, SARS-CoV-2 has been identified as the causative agent of the viral infection COVID-19. The virus belongs to the zoonotic beta coronavirus family, known to cause respiratory disorders or viral pneumonia followed by an extensive attack on organs that express angiotensin-converting enzyme II (ACE2). Human transmission of this virus occurs via respiratory droplets from symptomatic and asymptomatic patients, which are released into the environment after sneezing or coughing. These droplets are capable of staying in the air as aerosols or on surfaces and can be transmitted to persons through inhalation or contact with contaminated surfaces. Thus, there is an urgent need for advanced theranostic solutions to control the spread of COVID-19 infection. The development of such fit-for-purpose technologies hinges on a proper understanding of the transmission, incubation, and structural characteristics of the virus in the external environment and within the host. Hence, this article describes the development of an intrinsic model to describe the incubation characteristics of the virus under varying environmental factors. It also discusses the evaluation of SARS-CoV-2 structural nucleocapsid protein properties via computational approaches to generate high-affinity binding probes for effective diagnosis and targeted treatment applications by specific targeting of viruses. In addition, this article provides useful insights on the transmission behavior of the virus and creates new opportunities for theranostics development.
Affiliation(s)
- Godfred O Sabbih
- Department of Chemical Engineering, University of Tennessee, Chattanooga, Tennessee, USA
- Maame A Korsah
- Department of Mathematics, University of Tennessee, Chattanooga, Tennessee, USA
- Jaison Jeevanandam
- CQM - Centro de Química da Madeira, MMRG, Universidade da Madeira, Campus da Penteada, Funchal, Portugal
- Michael K Danquah
- Department of Chemical Engineering, University of Tennessee, Chattanooga, Tennessee, USA
36
Valadez Huerta G, Raabe G. Genetic Parameterization of Interfacial Force Fields Based on Classical Bulk Force Fields and Ab Initio Data: Application to the Methanol-ZnO Interfaces. J Chem Inf Model 2020; 60:6033-6043. [DOI: 10.1021/acs.jcim.0c01093] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 12/13/2022]
Affiliation(s)
- Gerardo Valadez Huerta
- Institut für Thermodynamik, Technische Universität Braunschweig, Hans-Sommer-Straße 5, D-38106 Braunschweig, Germany
- Gabriele Raabe
- Institut für Thermodynamik, Technische Universität Braunschweig, Hans-Sommer-Straße 5, D-38106 Braunschweig, Germany
37
Oweida TJ, Mahmood A, Manning MD, Rigin S, Yingling YG. Merging Materials and Data Science: Opportunities, Challenges, and Education in Materials Informatics. MRS ADVANCES 2020. [DOI: 10.1557/adv.2020.171] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 11/13/2022]
38
Mondal A, Young JM, Barckholtz TA, Kiss G, Koziol L, Panagiotopoulos AZ. Genetic Algorithm Driven Force Field Parameterization for Molten Alkali-Metal Carbonate and Hydroxide Salts. J Chem Theory Comput 2020; 16:5736-5746. [DOI: 10.1021/acs.jctc.0c00285] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 11/28/2022]
Affiliation(s)
- Anirban Mondal
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States
- Jeffrey M. Young
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey 08544, United States
- Gabor Kiss
- ExxonMobil Research and Engineering, Annandale, New Jersey 08801, United States
- Lucas Koziol
- ExxonMobil Research and Engineering, Annandale, New Jersey 08801, United States
39
Han R, Luber S. Trajectory-based machine learning method and its application to molecular dynamics. Mol Phys 2020. [DOI: 10.1080/00268976.2020.1788189] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/13/2022]
Affiliation(s)
- R. Han
- Department of Chemistry A, University of Zurich, Zurich, Switzerland
- S. Luber
- Department of Chemistry A, University of Zurich, Zurich, Switzerland
40
Yang PJ, Sugiyama M, Tsuda K, Yanai T. Artificial Neural Networks Applied as Molecular Wave Function Solvers. J Chem Theory Comput 2020; 16:3513-3529. [PMID: 32320233 DOI: 10.1021/acs.jctc.9b01132] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Indexed: 11/28/2022]
Abstract
We use artificial neural networks (ANNs) based on the Boltzmann machine (BM) architectures as an encoder of ab initio molecular many-electron wave functions represented with the complete active space configuration interaction (CAS-CI) model. As first introduced by the work of Carleo and Troyer for physical systems, the coefficients of the electronic configurations in the CI expansion are parametrized with the BMs as a function of their occupancies that act as descriptors. This ANN-based wave function ansatz is referred to as the neural-network quantum state (NQS). The machine learning is used for training the BMs in terms of finding a variationally optimal form of the ground-state wave function on the basis of the energy minimization. It is relevant to reinforcement learning and does not use any reference data nor prior knowledge of the wave function, while the Hamiltonian is given based on a user-specified chemical structure in the first-principles manner. Carleo and Troyer used the restricted Boltzmann machine (RBM), which has hidden units, for the neural network architecture of NQS, while, in this study, we further introduce its replacement with the BM that has only visible units but with different orders of connectivity. For this hidden-node free BM, the second- and third-order BMs based on quadratic and cubic energy functions, respectively, were implemented. We denote these second- and third-order BMs as BM2 and BM3, respectively. The pilot implementation of the NQS solver into an exact diagonalization module of the quantum chemistry program was made to assess the capability of variants of the BM-based NQS. The test calculations were performed by determining the CAS-CI wave functions of illustrative molecular systems, indocyanine green, and dinitrogen dissociation. The simulated energies have been shown to converge to CAS-CI energy in most cases by improving RBM with an increasing number of hidden nodes. BM3 systematically yields lower energies than BM2, reproducing the CAS-CI energies of dinitrogen across potential energy curves within an error of 50 μEh.
Affiliation(s)
- Peng-Jian Yang
- Department of Chemistry, Nagoya University, Furocho, Chikusa Ward, Nagoya, Aichi 464-8601, Japan
- Mahito Sugiyama
- National Institute of Informatics, 2-1-2 Hitotsubashi, Chiyoda-ku, Tokyo 101-8430, Japan; JST, PRESTO, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012, Japan
- Koji Tsuda
- Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwa-no-ha, Kashiwa, Chiba 277-8561, Japan; RIKEN Center for Advanced Intelligence Project, 1-4-1 Nihonbashi, Chuo-ku, Tokyo 103-0027, Japan; Research and Services Division of Materials Data and Integrated System, National Institute for Materials Science, Ibaraki 305-0047, Japan
- Takeshi Yanai
- Department of Chemistry, Nagoya University, Furocho, Chikusa Ward, Nagoya, Aichi 464-8601, Japan; Institute of Transformative Bio-Molecules (WPI-ITbM), Nagoya University, Furocho, Chikusa Ward, Nagoya, Aichi 464-8601, Japan; JST, PRESTO, 4-1-8 Honcho, Kawaguchi, Saitama 332-0012, Japan
41
Organic Photovoltaics: Relating Chemical Structure, Local Morphology, and Electronic Properties. TRENDS IN CHEMISTRY 2020. [DOI: 10.1016/j.trechm.2020.03.006] [Citation(s) in RCA: 33] [Impact Index Per Article: 8.3] [Indexed: 02/08/2023]
42
Wright E, Ferrato MH, Bryer AJ, Searles R, Perilla JR, Chandrasekaran S. Accelerating prediction of chemical shift of protein structures on GPUs: Using OpenACC. PLoS Comput Biol 2020; 16:e1007877. [PMID: 32401799 PMCID: PMC7250467 DOI: 10.1371/journal.pcbi.1007877] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/14/2020] [Revised: 05/26/2020] [Accepted: 04/15/2020] [Indexed: 11/23/2022]
Abstract
Experimental chemical shifts (CS) from solution and solid state magic-angle-spinning nuclear magnetic resonance (NMR) spectra provide atomic level information for each amino acid within a protein or protein complex. However, structure determination of large complexes and assemblies based on NMR data alone remains challenging due to the complexity of the calculations. Here, we present a hardware accelerated strategy for the estimation of NMR chemical shifts of large macromolecular complexes based on the previously published PPM_One software. The original code was not viable for computing large complexes, with our largest dataset taking approximately 14 hours to complete. Our results show that serial code refactoring and parallel acceleration reduced the run time of the software on an NVIDIA Volta 100 (V100) Graphic Processing Unit (GPU) to 46.71 seconds for our largest dataset of 11.3 million atoms. We use OpenACC, a directive-based programming model, for porting the application to a heterogeneous system consisting of x86 processors and NVIDIA GPUs. Finally, we demonstrate the feasibility of our approach in systems of increasing complexity ranging from 100K to 11.3M atoms. Nuclear magnetic resonance (NMR) spectroscopy yields chemical shifts (CSs) which reveal chemical details of the environment of an atom in a protein. Computer estimation of CSs requires the calculation of several contributing terms including interatomic distances, ring current effects and the formation of hydrogen bonds. Here, taking advantage of graphic processing units (GPUs), the estimation of chemical shifts is accelerated, thus enabling the determination of the CSs for large systems encompassing millions of atoms. The rapid determination of CSs enables the use of CS-based validation for other molecular dynamics computations.
Affiliation(s)
- Eric Wright
- Dept. of Computer and Information Sciences, University of Delaware, Newark, Delaware, United States of America
- Mauricio H. Ferrato
- Dept. of Computer and Information Sciences, University of Delaware, Newark, Delaware, United States of America
- Alexander J. Bryer
- Department of Chemistry & Biochemistry, University of Delaware, Newark, Delaware, United States of America
- Robert Searles
- Dept. of Computer and Information Sciences, University of Delaware, Newark, Delaware, United States of America
- Juan R. Perilla
- Department of Chemistry & Biochemistry, University of Delaware, Newark, Delaware, United States of America
- Sunita Chandrasekaran
- Dept. of Computer and Information Sciences, University of Delaware, Newark, Delaware, United States of America
43
Bose S, Chakrabarty S, Ghosh D. Support Vector Regression-Based Monte Carlo Simulation of Flexible Water Clusters. ACS OMEGA 2020; 5:7065-7073. [PMID: 32280847 PMCID: PMC7143414 DOI: 10.1021/acsomega.9b02968] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/12/2019] [Accepted: 03/12/2020] [Indexed: 06/11/2023]
Abstract
Molecular simulations based on classical force fields are computationally efficient but lack accuracy due to the empirical formulation of non-bonded interactions. Quantum mechanical (QM) methods, albeit accurate, have prohibitive computational costs for large molecules and clusters. Hence, to overcome the bottleneck, machine learning (ML)-based methods have been employed in recent years. We previously reported a combined scheme of many-body expansion (MBE) and ML to predict the interaction energies of rigid water clusters. In this work, we proceed toward building a flexible water model using the ML-MBE scheme. This ML-MBE scheme has an error of <1% for interaction energy prediction in comparison to the parent QM method for flexible water decamers. Machine learning-based Monte Carlo simulations (MLMC) are performed with this water model, and the structural properties of these configurations are compared with those obtained from ab initio molecular dynamics (AIMD) and the TIP3P classical force field. The radial distribution functions, tetrahedral order parameters, and number of hydrogen bonds in AIMD and MLMC have a similar qualitative and quantitative trend, whereas the classical force fields show a significant deviation.
Affiliation(s)
- Samik Bose
- School of Chemical Sciences, Indian Association for the Cultivation of Science, Kolkata 700032, West Bengal, India
- Suman Chakrabarty
- Department of Chemical, Biological & Macromolecular Sciences, S. N. Bose National Centre for Basic Sciences, Kolkata 700106, West Bengal, India
- Debashree Ghosh
- School of Chemical Sciences, Indian Association for the Cultivation of Science, Kolkata 700032, West Bengal, India
44
Mueller T, Hernandez A, Wang C. Machine learning for interatomic potential models. J Chem Phys 2020; 152:050902. [PMID: 32035452 DOI: 10.1063/1.5126336] [Citation(s) in RCA: 109] [Impact Index Per Article: 27.3] [Indexed: 12/13/2022]
Abstract
The use of supervised machine learning to develop fast and accurate interatomic potential models is transforming molecular and materials research by greatly accelerating atomic-scale simulations with little loss of accuracy. Three years ago, Jörg Behler published a perspective in this journal providing an overview of some of the leading methods in this field. In this perspective, we provide an updated discussion of recent developments, emerging trends, and promising areas for future research in this field. We include in this discussion an overview of three emerging approaches to developing machine-learned interatomic potential models that have not been extensively discussed in existing reviews: moment tensor potentials, message-passing networks, and symbolic regression.
Affiliation(s)
- Tim Mueller
- Department of Materials Science and Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Alberto Hernandez
- Department of Materials Science and Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
- Chuhong Wang
- Department of Materials Science and Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
45
Brunken C, Reiher M. Self-Parametrizing System-Focused Atomistic Models. J Chem Theory Comput 2020; 16:1646-1665. [DOI: 10.1021/acs.jctc.9b00855] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Indexed: 02/07/2023]
Affiliation(s)
- Christoph Brunken
- Laboratory for Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
- Markus Reiher
- Laboratory for Physical Chemistry, ETH Zurich, Vladimir-Prelog-Weg 2, 8093 Zurich, Switzerland
46
Gilmore RAJ, Dove MT, Misquitta AJ. First-Principles Many-Body Nonadditive Polarization Energies from Monomer and Dimer Calculations Only: A Case Study on Water. J Chem Theory Comput 2020; 16:224-242. [PMID: 31769980 DOI: 10.1021/acs.jctc.9b00819] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 11/29/2022]
Abstract
The many-body polarization energy is the major source of nonadditivity in strongly polar systems such as water. This nonadditivity is often considerable and must be included, if only in an average manner, to correctly describe the physical properties of the system. Models for the polarization energy are usually parametrized using experimental data, or theoretical estimates of the many-body effects. Here we show how many-body polarization models can be developed for water complexes using data for the monomer and dimer only using ideas recently developed in the field of intermolecular perturbation theory and state-of-the-art approaches for calculating distributed molecular properties based on the iterated stockholder atoms (ISA) algorithm. We show how these models can be calculated, and we validate their accuracy in describing the many-body nonadditive energies of a range of water clusters. We further investigate their sensitivity to the details of the polarization damping models used. We show how our very best polarization models yield many-body energies that agree with those computed with coupled-cluster methods, but at a fraction of the computational cost.
Collapse
Affiliation(s)
- Rory A J Gilmore
- School of Physics and Astronomy and the Thomas Young Centre for Theory and Simulation of Materials at Queen Mary University of London, London E1 4NS, U.K.
| | - Martin T Dove
- School of Physics and Astronomy and the Thomas Young Centre for Theory and Simulation of Materials at Queen Mary University of London, London E1 4NS, U.K.
| | - Alston J Misquitta
- School of Physics and Astronomy and the Thomas Young Centre for Theory and Simulation of Materials at Queen Mary University of London, London E1 4NS, U.K.
| |
|
47
|
Abstract
There is significant potential for electronic structure methods to improve the quality of the predictions furnished by the tools of computer-aided drug design, which typically rely on empirically derived functions. In this perspective, we consider some recent examples of how quantum mechanics has been applied in predicting protein-ligand geometries, protein-ligand binding affinities and ligand strain on binding. We then outline several significant developments in quantum mechanics methodology likely to influence these approaches: in particular, we note the advent of more computationally expedient ab initio quantum mechanical methods that can provide chemical accuracy for larger molecular systems than hitherto possible. We highlight the emergence of increasingly accurate semiempirical quantum mechanical methods and the associated role of machine learning and molecular databases in their development. Indeed, the convergence of improved algorithms for solving and analyzing electronic structure, modern machine learning methods, and increasingly comprehensive benchmark data sets of molecular geometries and energies provides a context in which the potential of quantum mechanics will be increasingly realized in driving future developments and applications in structure-based drug discovery.
Affiliation(s)
- Richard A Bryce
- Division of Pharmacy and Optometry, School of Health Sciences, University of Manchester, Manchester, UK.
| |
|
48
|
Plazinski W, Plazinska A, Brzyska A. Efficient sampling of high-energy states by machine learning force fields. Phys Chem Chem Phys 2020; 22:14364-14374. [DOI: 10.1039/d0cp01399d] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/17/2022]
Abstract
A method extending the range of applicability of machine-learning force fields is proposed. It relies on biased subsampling of the high-energy states described by the predefined coordinate(s).
Affiliation(s)
- Wojciech Plazinski
- Jerzy Haber Institute of Catalysis and Surface Chemistry, Polish Academy of Sciences, 30-239 Krakow, Poland
| | - Anita Plazinska
- Department of Biopharmacy, Medical University of Lublin, Chodźki 4a, 20-093 Lublin, Poland
| | - Agnieszka Brzyska
- Jerzy Haber Institute of Catalysis and Surface Chemistry, Polish Academy of Sciences, 30-239 Krakow, Poland
| |
|
49
|
A fast neural network approach for direct covariant forces prediction in complex multi-element extended systems. NAT MACH INTELL 2019. [DOI: 10.1038/s42256-019-0098-0] [Citation(s) in RCA: 22] [Impact Index Per Article: 4.4] [Indexed: 11/09/2022]
|
50
|
Flood E, Boiteux C, Lev B, Vorobyov I, Allen TW. Atomistic Simulations of Membrane Ion Channel Conduction, Gating, and Modulation. Chem Rev 2019; 119:7737-7832. [DOI: 10.1021/acs.chemrev.8b00630] [Citation(s) in RCA: 65] [Impact Index Per Article: 13.0] [Indexed: 01/06/2023]
Affiliation(s)
- Emelie Flood
- School of Science, RMIT University, Melbourne, Victoria 3000, Australia
| | - Céline Boiteux
- School of Science, RMIT University, Melbourne, Victoria 3000, Australia
| | - Bogdan Lev
- School of Science, RMIT University, Melbourne, Victoria 3000, Australia
| | - Igor Vorobyov
- Department of Physiology & Membrane Biology/Department of Pharmacology, University of California, Davis, 95616, United States
| | - Toby W. Allen
- School of Science, RMIT University, Melbourne, Victoria 3000, Australia
| |
|