1
Singh K, Lee KH, Peláez D, Bande A. Accelerating wavepacket propagation with machine learning. J Comput Chem 2024; 45:2360-2373. [PMID: 39031712 DOI: 10.1002/jcc.27443] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 05/13/2024] [Accepted: 05/16/2024] [Indexed: 07/22/2024]
Abstract
In this work, we discuss the use of a recently introduced machine learning (ML) technique known as Fourier neural operators (FNO) as an efficient alternative to the traditional solution of the time-dependent Schrödinger equation (TDSE). FNOs are ML models which are employed in the approximated solution of partial differential equations. For a wavepacket propagating in an anharmonic potential and for a tunneling system, we show that the FNO approach can accurately and faithfully model wavepacket propagation via the density. Additionally, we demonstrate that FNOs can be a suitable replacement for traditional TDSE solvers in cases where the results of the quantum dynamical simulation are required repeatedly such as in the case of parameter optimization problems (e.g., control). The speed-up from the FNO method allows for its combination with the Markov-chain Monte Carlo approach in applications that involve solving inverse problems such as optimal and coherent laser control of the outcome of dynamical processes.
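The core operation of a Fourier neural operator layer can be sketched in a few lines: transform the input to Fourier space, keep only the lowest modes, multiply each retained mode by a weight, and transform back. The following is a minimal one-dimensional sketch under stated assumptions, not the authors' model: the function names, the grid size, the Gaussian "density-like" input, and the fixed weights (which would be learned in a real FNO) are all hypothetical, and the pointwise linear path and nonlinearity of a full FNO layer are omitted.

```python
import cmath
import math


def dft(x):
    """Naive discrete Fourier transform (O(n^2)); adequate for a small sketch."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * k * j / n) for j in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]


def spectral_conv_1d(signal, weights):
    """Multiply the lowest len(weights) Fourier modes by weights and drop the rest.

    The mirrored negative-frequency modes receive the conjugate weight, so a
    real input with a real weights[0] gives a real output; the real part is
    returned.
    """
    n, modes = len(signal), len(weights)
    if 2 * (modes - 1) >= n:
        raise ValueError("too many retained modes for this grid size")
    spectrum = dft(signal)
    filtered = [0j] * n
    for k in range(modes):
        filtered[k] = weights[k] * spectrum[k]
        if k > 0:
            filtered[n - k] = weights[k].conjugate() * spectrum[n - k]
    return [value.real for value in idft(filtered)]


# Hypothetical input: a Gaussian profile on a 16-point periodic grid.
grid = [j / 16 for j in range(16)]
density = [math.exp(-((x - 0.5) ** 2) / 0.02) for x in grid]
# Hypothetical fixed weights; in an FNO these are trainable parameters.
example_weights = [1.0, 0.5 + 0.1j, 0.2]
print(spectral_conv_1d(density, example_weights))
```

Because the weights act on a fixed number of low-frequency modes rather than on grid points, the same weights can in principle be applied to inputs sampled on a different grid size.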
Affiliation(s)
- Kanishka Singh
  - Theory of Electron Dynamics and Spectroscopy, Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Berlin, Germany
  - Institute of Chemistry and Biochemistry, Freie Universität Berlin, Berlin, Germany
- Ka Hei Lee
  - Theory of Electron Dynamics and Spectroscopy, Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Berlin, Germany
  - Fachbereich Physik, Freie Universität Berlin, Berlin, Germany
- Daniel Peláez
  - CNRS, Institut des Sciences Moléculaires d'Orsay, Université Paris-Saclay, Orsay, France
- Annika Bande
  - Theory of Electron Dynamics and Spectroscopy, Helmholtz-Zentrum Berlin für Materialien und Energie GmbH, Berlin, Germany
  - Institute of Inorganic Chemistry, Leibniz University Hannover, Hannover, Germany
  - Cluster of Excellence PhoenixD, Leibniz University Hannover, Hannover, Germany
2
Harding-Larsen D, Funk J, Madsen NG, Gharabli H, Acevedo-Rocha CG, Mazurenko S, Welner DH. Protein representations: Encoding biological information for machine learning in biocatalysis. Biotechnol Adv 2024:108459. [PMID: 39366493 DOI: 10.1016/j.biotechadv.2024.108459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 09/19/2024] [Accepted: 09/29/2024] [Indexed: 10/06/2024]
Abstract
Enzymes offer a more environmentally friendly and low-impact alternative to conventional chemistry, but they often require additional engineering for their application in industrial settings, an endeavour that is challenging and laborious. To address this issue, the power of machine learning can be harnessed to produce predictive models that enable the in silico study and engineering of improved enzymatic properties. Such machine learning models, however, require the conversion of the complex biological information into numerical inputs, also called protein representations. These inputs demand special attention to ensure the training of accurate and precise models, and, in this review, we therefore examine the critical step of encoding protein information into numeric representations for use in machine learning. We selected the most important approaches for encoding the three distinct biological protein representations - primary sequence, 3D structure, and dynamics - to explore their requirements for employment and inductive biases. Combined representations of proteins and substrates are also introduced as emergent tools in biocatalysis. We propose a division into fixed representations, a collection of rule-based encoding strategies, and learned representations extracted from the latent spaces of large neural networks. To select the most suitable protein representation, we propose two main factors to consider. The first is the model setup, which is influenced by the size of the training dataset and the choice of architecture. The second is the model objectives, such as considerations about the assayed property, the difference between wild-type models and mutant predictors, and requirements for explainability. This review aims to serve as a source of information and guidance for properly representing enzymes in future machine learning models for biocatalysis.
Affiliation(s)
- David Harding-Larsen
  - The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark
- Jonathan Funk
  - The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark
- Niklas Gesmar Madsen
  - The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark
- Hani Gharabli
  - The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark
- Carlos G Acevedo-Rocha
  - The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark
- Stanislav Mazurenko
  - Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic; International Clinical Research Center, St. Anne's University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
- Ditte Hededam Welner
  - The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark
3
Yang Z, Yang Y, Huang Y, Shao Y, Hao H, Yao S, Xi Q, Guo Y, Tong L, Jian M, Shao Y, Zhang J. Wet-spinning of carbon nanotube fibers: dispersion, processing and properties. Natl Sci Rev 2024; 11:nwae203. [PMID: 39301072 PMCID: PMC11409889 DOI: 10.1093/nsr/nwae203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Revised: 05/21/2024] [Accepted: 06/10/2024] [Indexed: 09/22/2024] Open
Abstract
Owing to the excellent intrinsic mechanical, electrical, and thermal properties of carbon nanotubes (CNTs), carbon nanotube fibers (CNTFs) are expected to become promising candidates for the next generation of high-performance fibers. They have received considerable interest for cutting-edge applications, such as ultra-light electric wire, aerospace craft, military equipment, and space elevators. Wet-spinning is a broadly utilized commercial technique for high-performance fiber manufacturing. Thus, compared with array spinning from drawable vertical CNT arrays and direct dry spinning from floating catalyst chemical vapor deposition (FCCVD), the wet-spinning technique is considered a promising strategy for producing CNTFs on a large scale. In this tutorial review, we begin with a summary description of the CNTF wet-spinning process. Then, we discuss strategies for preparing high-concentration CNT wet-spinning dopes and the corresponding non-covalent adsorption/charge transfer mechanisms. Filament solidification during the coagulation process is another critical step in determining the configurations and properties of the derived CNTFs. Next, we discuss post-treatment, including continuous drafting and thermal annealing, to further optimize the CNT orientation and compact configuration. Finally, we summarize the physical property-structure relationship to give insights for further performance improvement in order to satisfy the prerequisites for detailed applications. Insights into propelling high-performance CNTF production from lab scale to industry scale are proposed, in anticipation of this novel fiber having an impact on our lives in the near future.
Affiliation(s)
- Zhicheng Yang
  - School of Materials Science and Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
  - School of Materials Science and Engineering, Peking University, Beijing 100871, China
  - Beijing Graphene Institute (BGI), Beijing 100095, China
- Yinan Yang
  - School of Materials Science and Engineering, Peking University, Beijing 100871, China
- Yufei Huang
  - Center for Nanochemistry, Beijing Science and Engineering Center for Nanocarbons, Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Yanyan Shao
  - College of Energy, Soochow Institute for Energy and Materials Innovations (SIEMIS), Key Laboratory of Advanced Carbon Materials and Wearable Energy Technologies of Jiangsu Province, SUDA-BGI Collaborative Innovation Center, Soochow University, Suzhou 215006, China
- He Hao
  - Center for Nanochemistry, Beijing Science and Engineering Center for Nanocarbons, Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Shendong Yao
  - Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100080, China
- Qiqing Xi
  - School of Materials Science and Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
- Yinben Guo
  - School of Materials Science and Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
- Lianming Tong
  - Center for Nanochemistry, Beijing Science and Engineering Center for Nanocarbons, Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- Muqiang Jian
  - Beijing Graphene Institute (BGI), Beijing 100095, China
- Yuanlong Shao
  - School of Materials Science and Engineering, Peking University, Beijing 100871, China
  - Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100080, China
  - Beijing Graphene Institute (BGI), Beijing 100095, China
- Jin Zhang
  - School of Materials Science and Engineering, Peking University, Beijing 100871, China
  - Center for Nanochemistry, Beijing Science and Engineering Center for Nanocarbons, Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
  - Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100080, China
  - Beijing Graphene Institute (BGI), Beijing 100095, China
4
Rodríguez JI, Vergara-Beltrán UA. Physics-Inspired Evolutionary Machine Learning Method: From the Schrödinger Equation to an Orbital-Free-DFT Kinetic Energy Functional. J Phys Chem A 2024. [PMID: 39348527 DOI: 10.1021/acs.jpca.4c04155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/02/2024]
Abstract
We introduce a machine learning (ML)-supervised model function (which is in fact a functional rather than a regular function) that is inspired by the variational principle of physics. This ML hypothesis evolutionary method, termed ML-Ω, allows us to go from data to the differential equation(s) underlying the physical (chemical, engineering, etc.) phenomena from which the data are derived. The fundamental equations of physics can be derived from this ML-Ω evolutionary method when the proper training data are used. By training the ML-Ω model function with only three hydrogen-like atom energies, the method can find Schrödinger's exact functional and, from it, Schrödinger's fundamental equation. Then, in the field of density functional theory (DFT), when the model function is trained with the energies from the known Thomas-Fermi (TF) formula $E = -0.7687\,Z^{7/3}$, it correctly finds the exact TF functional. Finally, the method is applied to find a local orbital-free (OF) functional expression of the independent electron kinetic energy functional $T_s$ based on the γTFλvW model. By considering the theoretical energies of only five atoms (He, Be, Ne, Mg, and Ar) as the training set, the evolutionary ML-Ω method finds an ML-Ω-OF-DFT local $T_s$ functional (γTFλvW(0.964,1/4)) that outperforms all the OF-DFT functionals of a representative group. Moreover, our ML-Ω-OF functional overcomes the difficulty that LDA and some local generalized gradient approximation (GGA) DFT functionals have in describing the stretched bond region at the correct spin configuration of diatomic molecules. Nonsmooth and nonclosed form functionals can be considered in the ML-Ω model function and still be effectively trained. Although our evolutionary ML-Ω model function can work without an explicit prior-form functional, by using the techniques of symbolic regression, in this work, we exploit prior-form functional expressions to make the training process simpler and faster. The ML-Ω method can be considered at the intersection of ML and the natural sciences.
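As a small worked example of the Thomas-Fermi formula quoted in the abstract, the snippet below evaluates it for a few atomic numbers. It is only an illustration: the abstract does not say which atoms were used for the TF training energies, so the list here simply borrows the five atoms named for the kinetic-energy functional, and the function name is hypothetical. The energies are in Hartree atomic units, the usual convention for this constant.

```python
# Thomas-Fermi total energy of a neutral atom: E = -0.7687 * Z**(7/3) (Hartree).
TF_PREFACTOR = -0.7687


def thomas_fermi_energy(z: int) -> float:
    """Return the Thomas-Fermi energy estimate for atomic number z."""
    return TF_PREFACTOR * z ** (7.0 / 3.0)


# Illustrative atoms only; not stated in the abstract as the TF training set.
atoms = {"He": 2, "Be": 4, "Ne": 10, "Mg": 12, "Ar": 18}
for symbol, z in atoms.items():
    print(f"{symbol:>2} (Z={z:2d}): E_TF = {thomas_fermi_energy(z):.4f} Eh")
```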
Affiliation(s)
- Juan I Rodríguez
  - Centro de Investigación en Ciencia Aplicada y Tecnología Avanzada, Unidad Querétaro, Instituto Politécnico Nacional, Cerro Blanco 141 Col. Colinas del Cimatario, Querétaro C.P. 76090, México
- Ulises A Vergara-Beltrán
  - Escuela Superior de Física y Matemáticas, Instituto Politécnico Nacional, Edificio 9, Zacatenco. Col. San Pedro Zacatenco, Ciudad de México C.P. 07738, México
5
Sit MK, Das S, Samanta K. Machine Learning-Assisted Mixed Quantum-Classical Dynamics without Explicit Nonadiabatic Coupling: Application to the Photodissociation of Peroxynitric Acid. J Phys Chem A 2024; 128:8244-8253. [PMID: 39283987 DOI: 10.1021/acs.jpca.4c02876] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/27/2024]
Abstract
We have devised a hybrid quantum-classical scheme utilizing machine-learned potential energy surfaces (PES), which circumvents the need for explicit computation of nonadiabatic coupling elements. The quantities necessary to account for the nonadiabatic effects are directly obtained from the PESs. The simulation of dynamics is based on the fewest-switches surface-hopping method. We applied this scheme to model the photodissociation of both N-O and O-O bonds in a conformer of peroxynitric acid (HO2NO2). Adiabatic PES data for the six lowest states of this molecule were computed at the CASSCF level for various nuclear configurations. These served as the training data for the machine-learning models for the PESs. The dynamics simulation was initiated on the lowest optically bright singlet excited state (S4) and propagated along the two Jacobi coordinates $\vec{J}_1$ and $\vec{J}_2$ while accounting for the nonadiabatic effects through transitions between PESs. Our analysis revealed that there is a very high chance of dissociation of the N-O bond leading to the HO2 and NO2 fragments.
Affiliation(s)
- Mahesh K Sit
  - School of Basic Sciences, Indian Institute of Technology Bhubaneswar, Argul, Odisha 752050, India
- Subhasish Das
  - School of Basic Sciences, Indian Institute of Technology Bhubaneswar, Argul, Odisha 752050, India
- Kousik Samanta
  - School of Basic Sciences, Indian Institute of Technology Bhubaneswar, Argul, Odisha 752050, India
6
Fang YB, Shang C, Liu ZP, Gong XG. Structural transitions in liquid semiconductor alloys: A molecular dynamics study with a neural network potential. J Chem Phys 2024; 161:104504. [PMID: 39258571 DOI: 10.1063/5.0223453] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/14/2024] [Accepted: 08/26/2024] [Indexed: 09/12/2024] Open
Abstract
Liquid-liquid phase transitions hold a unique and profound significance within condensed matter physics. These transitions, while conceptually intriguing, often pose formidable computational challenges. However, recent advances in neural network (NN) potentials offer a promising avenue to effectively address these challenges. In this paper, we delve into the structural transitions of liquid CdTe, CdS, and their alloy systems using molecular dynamics simulations, harnessing the power of an NN potential named LaspNN. Our investigations encompass both pressure and temperature effects. Through our simulations, we uncover three primary liquid structures around melting points that emerge as pressure increases: tetrahedral, rock salt, and close-packed structures, which greatly resemble those of solid states. In the high-temperature regime, we observe the formation of Te chains and S dimers, providing a deeper understanding of the liquid's atomic arrangements. When examining CdS$_x$Te$_{1-x}$ alloys, our findings indicate that a small substitution of S by Te atoms for S-rich alloys (x > 0.5) exhibits a structural transition much different from CdS, while a large substitution of Te by S atoms for Te-rich alloys (x < 0.5) barely exhibits a structural transition similar to CdTe. We construct a schematic diagram for liquid alloys that considers both temperature and pressure, providing a comprehensive overview of the alloy system's behavior. The local aggregation of Te atoms demonstrates a linear relationship with alloy composition x, whereas that of S atoms exhibits a nonlinear one, shedding light on the composition-dependent structural changes.
Affiliation(s)
- Yi-Bin Fang
  - Key Laboratory for Computational Physical Sciences (MOE), State Key Laboratory of Surface Physics, Department of Physics, Fudan University, Shanghai 200433, China
  - Shanghai Qi Zhi Institute, Shanghai 200232, China
- Cheng Shang
  - Shanghai Qi Zhi Institute, Shanghai 200232, China
  - Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
- Zhi-Pan Liu
  - Shanghai Qi Zhi Institute, Shanghai 200232, China
  - Collaborative Innovation Center of Chemistry for Energy Materials (iChEM), Shanghai Key Laboratory of Molecular Catalysis and Innovative Materials, Key Laboratory of Computational Physical Science, Department of Chemistry, Fudan University, Shanghai 200433, China
- Xin-Gao Gong
  - Key Laboratory for Computational Physical Sciences (MOE), State Key Laboratory of Surface Physics, Department of Physics, Fudan University, Shanghai 200433, China
  - Shanghai Qi Zhi Institute, Shanghai 200232, China
7
Wang J, Hei H, Zheng Y, Zhang H, Ye H. Five-Site Water Models for Ice and Liquid Water Generated by a Series-Parallel Machine Learning Strategy. J Chem Theory Comput 2024; 20:7533-7545. [PMID: 39133036 DOI: 10.1021/acs.jctc.4c00440] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/13/2024]
Abstract
Icing, a common natural phenomenon, always originates from a molecule. Molecular simulation is crucial for understanding the relevant process but still faces a great challenge in obtaining a uniform and accurate description of ice and liquid water with limited model parameters. Here, we propose a series-parallel machine learning (ML) approach consisting of a classification back-propagation neural network (BPNN), parallel regression BPNNs, and a genetic algorithm to establish conventional TIP5P-BG and temperature-dependent TIP5P-BGT models. The established water models exhibit a comprehensive balance among the crucial physical properties (melting point, density, vaporization enthalpy, self-diffusion coefficient, and viscosity) with mean absolute percentage errors of 2.65% and 2.40%, respectively, and excellent predictive performance on the related properties of liquid water. For ice, the simulation results on the critical nucleus size and growth rate are in good agreement with experiments. This work offers a powerful molecular model for phase transition and icing in nanoconfinement and a construction strategy for a complex molecular model in the extreme case.
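The mean absolute percentage error used above to score the water models is straightforward to compute. The sketch below shows the standard definition only; the function name and the numbers are made-up placeholders, not values from the article.

```python
def mape(reference, predicted):
    """Mean absolute percentage error, in percent, of predicted vs. reference values."""
    if not reference or len(reference) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((p - r) / r) for r, p in zip(reference, predicted))
    return 100.0 * total / len(reference)


# Placeholder numbers for illustration only (not data from the article):
# each pair stands for one physical property, reference vs. model value.
reference_values = [100.0, 200.0, 50.0]
model_values = [102.0, 195.0, 51.0]
print(f"MAPE = {mape(reference_values, model_values):.2f}%")
```

Because each error is divided by its own reference value, properties with very different units and magnitudes (for example a melting point and a viscosity) contribute on the same relative scale.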
Affiliation(s)
- Jian Wang
  - International Research Center for Computational Mechanics, State Key Laboratory of Structural Analysis, Optimization and CAE Software for Industrial Equipment, Department of Engineering Mechanics, Faculty of Vehicle Engineering and Mechanics, Dalian University of Technology, Dalian 116024, P. R. China
- Haitao Hei
  - International Research Center for Computational Mechanics, State Key Laboratory of Structural Analysis, Optimization and CAE Software for Industrial Equipment, Department of Engineering Mechanics, Faculty of Vehicle Engineering and Mechanics, Dalian University of Technology, Dalian 116024, P. R. China
- Yonggang Zheng
  - International Research Center for Computational Mechanics, State Key Laboratory of Structural Analysis, Optimization and CAE Software for Industrial Equipment, Department of Engineering Mechanics, Faculty of Vehicle Engineering and Mechanics, Dalian University of Technology, Dalian 116024, P. R. China
  - DUT-BSU Joint Institute, Dalian University of Technology, Dalian 116024, P. R. China
- Hongwu Zhang
  - International Research Center for Computational Mechanics, State Key Laboratory of Structural Analysis, Optimization and CAE Software for Industrial Equipment, Department of Engineering Mechanics, Faculty of Vehicle Engineering and Mechanics, Dalian University of Technology, Dalian 116024, P. R. China
- Hongfei Ye
  - International Research Center for Computational Mechanics, State Key Laboratory of Structural Analysis, Optimization and CAE Software for Industrial Equipment, Department of Engineering Mechanics, Faculty of Vehicle Engineering and Mechanics, Dalian University of Technology, Dalian 116024, P. R. China
8
Shahbazi F, Esfahani MN, Keshmiri A, Jabbari M. Assessment of machine learning models trained by molecular dynamics simulations results for inferring ethanol adsorption on an aluminium surface. Sci Rep 2024; 14:20437. [PMID: 39227616 PMCID: PMC11372171 DOI: 10.1038/s41598-024-71007-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Accepted: 08/23/2024] [Indexed: 09/05/2024] Open
Abstract
Molecular dynamics (MD) simulations can reduce the need for experimental tests and provide detailed insight into chemical reactions and binding kinetics. MD simulations face two challenges: one is their time and length scale limitations, and the other is efficiently processing the massive amount of data they produce to generate proper reaction rates. In this work, we evaluated the use of regression machine learning (ML) methods to address these two challenges by developing a framework for ethanol adsorption on an aluminium (Al) slab. This framework comprises three main stages: first, an all-atom molecular dynamics model; second, ML regression models; and third, validation and testing. In stage one, the adsorption of ethanol molecules on the Al surface for various temperatures, velocities and concentrations is simulated using the large-scale atomic/molecular massively parallel simulator (LAMMPS) and ReaxFF. The outcome of stage one is utilised for training, testing, and validating the predictive models in stages two and three. We developed and evaluated 28 different ML models for predicting the number of adsorbed molecules over time, including linear regression, support vector machine (SVM), decision trees, ensemble, Gaussian process regression (GPR), neural network (NN) and Bayesian hyper-parameter optimisation models. Based on the results, the Bayesian-based GPR showed the highest accuracy and the lowest training time. The developed model can predict the number of adsorbed molecules for new cases within seconds, while MD simulations take a few weeks. This adsorption rate can then be used in macroscale simulations to tackle the time and length scale limitations. The proposed numerical framework has the potential to be generalised and, therefore, contribute to future low-cost binding reaction estimations, providing a valuable tool for industry and experimentalists.
Affiliation(s)
- Fatemeh Shahbazi
  - Warwick Manufacturing Group (WMG), University of Warwick, Coventry, CV4 7AL, UK
  - School of Engineering, University of Manchester, Manchester, M13 9PL, UK
- Amir Keshmiri
  - School of Engineering, University of Manchester, Manchester, M13 9PL, UK
- Masoud Jabbari
  - School of Mechanical Engineering, University of Leeds, Leeds, LS2 9JT, UK
9
Gubler M, Finkler JA, Schäfer MR, Behler J, Goedecker S. Accelerating Fourth-Generation Machine Learning Potentials Using Quasi-Linear Scaling Particle Mesh Charge Equilibration. J Chem Theory Comput 2024; 20. [PMID: 39151921 PMCID: PMC11360134 DOI: 10.1021/acs.jctc.4c00334] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Revised: 07/16/2024] [Accepted: 07/17/2024] [Indexed: 08/19/2024]
Abstract
Machine learning potentials (MLPs) have revolutionized the field of atomistic simulations by describing atomic interactions with the accuracy of electronic structure methods at a small fraction of the cost. Most current MLPs construct the energy of a system as a sum of atomic energies, which depend on information about the atomic environments provided in the form of predefined or learnable feature vectors. If, in addition, nonlocal phenomena like long-range charge transfer are important, fourth-generation MLPs need to be used, which include a charge equilibration (Qeq) step to take the global structure of the system into account. This Qeq can significantly increase the computational cost and thus can become a computational bottleneck for large systems. In this Article, we present a highly efficient formulation of Qeq that does not require the explicit computation of the Coulomb matrix elements, resulting in a quasi-linear scaling method. Moreover, our approach also allows for the efficient calculation of energy derivatives, which explicitly consider the global structure-dependence of the atomic charges as obtained from Qeq. Due to its generality, the method is not restricted to MLPs and can also be applied within a variety of other force fields.
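To make the cost argument concrete, the sketch below solves a small charge equilibration problem the direct way: it builds the full interaction matrix explicitly and solves the resulting linear system with a Lagrange multiplier for the total charge. This dense solve, whose cost grows cubically with the number of atoms, is the kind of step the article sets out to avoid; the particle-mesh formulation itself is not reproduced here. All names, the simplified energy expression, and the numerical values are assumptions for illustration.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def qeq_charges(chi, hardness, coulomb, total_charge=0.0):
    """Minimise E(q) = sum_i chi_i q_i + 1/2 sum_i J_i q_i^2 + 1/2 sum_{i!=j} A_ij q_i q_j
    subject to sum_i q_i = total_charge.

    chi: electronegativities, hardness: J_i, coulomb: explicit pair matrix A_ij
    (its diagonal is ignored). Returns (charges, lagrange_multiplier).
    """
    n = len(chi)
    system = [[hardness[i] if i == j else coulomb[i][j] for j in range(n)] + [1.0]
              for i in range(n)]
    system.append([1.0] * n + [0.0])      # charge-conservation constraint row
    rhs = [-c for c in chi] + [total_charge]
    solution = solve_linear(system, rhs)
    return solution[:n], solution[n]


# Hypothetical two-atom system in arbitrary units.
charges, multiplier = qeq_charges(
    chi=[0.0, 1.0], hardness=[2.0, 2.0], coulomb=[[0.0, 0.5], [0.5, 0.0]]
)
print(charges, multiplier)
```

In this toy case the atom with the larger electronegativity ends up with the negative charge, and the charges sum to the requested total.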
Affiliation(s)
- Moritz Gubler
  - Department of Physics, University of Basel, Klingelbergstrasse 82, CH-4056 Basel, Switzerland
- Jonas A. Finkler
  - Department of Physics, University of Basel, Klingelbergstrasse 82, CH-4056 Basel, Switzerland
- Moritz R. Schäfer
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany
  - Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
- Jörg Behler
  - Lehrstuhl für Theoretische Chemie II, Ruhr-Universität Bochum, 44780 Bochum, Germany
  - Research Center Chemical Sciences and Sustainability, Research Alliance Ruhr, 44780 Bochum, Germany
- Stefan Goedecker
  - Department of Physics, University of Basel, Klingelbergstrasse 82, CH-4056 Basel, Switzerland
10
Latham AP, Tempkin JOB, Otsuka S, Zhang W, Ellenberg J, Sali A. Integrative spatiotemporal modeling of biomolecular processes: application to the assembly of the Nuclear Pore Complex. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.08.06.606842. [PMID: 39149317 PMCID: PMC11326192 DOI: 10.1101/2024.08.06.606842] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/17/2024]
Abstract
Dynamic processes involving biomolecules are essential for the function of the cell. Here, we introduce an integrative method for computing models of these processes based on multiple heterogeneous sources of information, including time-resolved experimental data and physical models of dynamic processes. We first compute integrative structure models at fixed time points and then optimally select and connect these snapshots into a series of trajectories that optimize the likelihood of both the snapshots and transitions between them. The method is demonstrated by application to the assembly process of the human Nuclear Pore Complex in the context of the reforming nuclear envelope during mitotic cell division, based on live-cell correlated electron tomography, bulk fluorescence correlation spectroscopy-calibrated quantitative live imaging, and a structural model of the fully-assembled Nuclear Pore Complex. Modeling of the assembly process improves the model precision over static integrative structure modeling alone. The method is applicable to a wide range of time-dependent systems in cell biology, and is available to the broader scientific community through an implementation in the open source Integrative Modeling Platform software.
Affiliation(s)
- Andrew P Latham
  - Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94143, USA
- Jeremy O B Tempkin
  - Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94143, USA
- Shotaro Otsuka
  - Cell Biology and Biophysics Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Wanlu Zhang
  - Cell Biology and Biophysics Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Jan Ellenberg
  - Cell Biology and Biophysics Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Andrej Sali
  - Department of Bioengineering and Therapeutic Sciences, Department of Pharmaceutical Chemistry, Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94143, USA
11
Frank JT, Unke OT, Müller KR, Chmiela S. A Euclidean transformer for fast and stable machine learned force fields. Nat Commun 2024; 15:6539. [PMID: 39107296 PMCID: PMC11303804 DOI: 10.1038/s41467-024-50620-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/25/2023] [Accepted: 07/10/2024] [Indexed: 08/10/2024] Open
Abstract
Recent years have seen vast progress in the development of machine learned force fields (MLFFs) based on ab-initio reference calculations. Despite achieving low test errors, the reliability of MLFFs in molecular dynamics (MD) simulations is facing growing scrutiny due to concerns about instability over extended simulation timescales. Our findings suggest a potential connection between robustness to cumulative inaccuracies and the use of equivariant representations in MLFFs, but the computational cost associated with these representations can limit this advantage in practice. To address this, we propose a transformer architecture called SO3KRATES that combines sparse equivariant representations (Euclidean variables) with a self-attention mechanism that separates invariant and equivariant information, eliminating the need for expensive tensor products. SO3KRATES achieves a unique combination of accuracy, stability, and speed that enables insightful analysis of quantum properties of matter on extended time and system size scales. To showcase this capability, we generate stable MD trajectories for flexible peptides and supra-molecular structures with hundreds of atoms. Furthermore, we investigate the PES topology for medium-sized chainlike molecules (e.g., small peptides) by exploring thousands of minima. Remarkably, SO3KRATES demonstrates the ability to strike a balance between the conflicting demands of stability and the emergence of new minimum-energy conformations beyond the training data, which is crucial for realistic exploration tasks in the field of biochemistry.
Affiliation(s)
- J Thorben Frank
- Machine Learning Group, TU Berlin, Berlin, Germany
- BIFOLD, Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
| | | | - Klaus-Robert Müller
- Machine Learning Group, TU Berlin, Berlin, Germany.
- BIFOLD, Berlin Institute for the Foundations of Learning and Data, Berlin, Germany.
- Google DeepMind, Berlin, Germany.
- Department of Artificial Intelligence, Korea University, Seoul, Korea.
- Max Planck Institut für Informatik, Saarbrücken, Germany.
| | - Stefan Chmiela
- Machine Learning Group, TU Berlin, Berlin, Germany.
- BIFOLD, Berlin Institute for the Foundations of Learning and Data, Berlin, Germany.
| |
|
12
|
Frasnetti E, Magni A, Castelli M, Serapian SA, Moroni E, Colombo G. Structures, dynamics, complexes, and functions: From classic computation to artificial intelligence. Curr Opin Struct Biol 2024; 87:102835. [PMID: 38744148 DOI: 10.1016/j.sbi.2024.102835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 04/14/2024] [Accepted: 04/22/2024] [Indexed: 05/16/2024]
Abstract
Computational approaches can provide highly detailed insight into the molecular recognition processes that underlie drug binding, the assembly of protein complexes, and the regulation of biological functional processes. Classical simulation methods can bridge a wide range of length- and time-scales typically involved in such processes. Lately, automated learning and artificial intelligence methods have shown the potential to expand the reach of physics-based approaches, ushering in the possibility to model and even design complex protein architectures. The synergy between atomistic simulations and AI methods is an emerging frontier with a huge potential for advances in structural biology. Herein, we explore various examples and frameworks for these approaches, providing select instances and applications that illustrate their impact on fundamental biomolecular problems.
Affiliation(s)
- Elena Frasnetti
- Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
| | - Andrea Magni
- Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
| | - Matteo Castelli
- Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
| | - Stefano A Serapian
- Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
| | | | - Giorgio Colombo
- Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy.
| |
|
13
|
Hameed S, Sharif S, Ovais M, Xiong H. Emerging trends and future challenges of advanced 2D nanomaterials for combating bacterial resistance. Bioact Mater 2024; 38:225-257. [PMID: 38745587 PMCID: PMC11090881 DOI: 10.1016/j.bioactmat.2024.04.033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Revised: 04/25/2024] [Accepted: 04/29/2024] [Indexed: 05/16/2024] Open
Abstract
The number of multi-drug-resistant bacteria has increased over the last few decades, which has caused a detrimental impact on public health worldwide. To counter the development of antibiotic resistance among different bacterial communities, new antimicrobial agents and nanoparticle-based strategies need to be designed, given the slow discovery of new functioning antibiotics. Advanced research studies have revealed the significant disinfection potential of two-dimensional nanomaterials (2D NMs), which can serve as effective antibacterial agents due to their unique physicochemical properties. This review covers the current research progress of 2D NMs-based antibacterial strategies based on an inclusive explanation of 2D NMs' impact as antibacterial agents, including a detailed introduction to each possible well-known antibacterial mechanism. The impact of the physicochemical properties of 2D NMs on their antibacterial activities is discussed, along with the toxic effects of 2D NMs and their biomedical significance, dysbiosis, and cellular nanotoxicity. Adding to the challenges, we also discuss the major issues regarding the current quality and availability of nanotoxicity data. However, smart advancements are required to fabricate biocompatible 2D antibacterial NMs and exploit their potential to combat bacterial resistance clinically.
Affiliation(s)
- Saima Hameed
- Institute for Advanced Study, Shenzhen University, Shenzhen, 518060, PR China
- School of Physics and Optoelectronic Engineering, Shenzhen University, Shenzhen, 518060, PR China
| | - Sumaira Sharif
- Institute of Molecular Biology and Biotechnology, The University of Lahore, Lahore, Pakistan
| | - Muhammad Ovais
- BGI Genomics, BGI Shenzhen, Shenzhen, 518083, Guangdong, PR China
| | - Hai Xiong
- Institute for Advanced Study, Shenzhen University, Shenzhen, 518060, PR China
| |
|
14
|
Malenfant-Thuot O, Ryczko K, Tamblyn I, Côté M. Efficient determination of Born-effective charges, LO-TO splitting, and Raman tensors of solids with a real-space atom-centered deep learning approach. J Phys Condens Matter 2024; 36:425901. [PMID: 39019077 DOI: 10.1088/1361-648x/ad64a2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2024] [Accepted: 07/17/2024] [Indexed: 07/19/2024]
Abstract
We introduce a deep neural network (DNN) framework called the Real-space Atomic Decomposition NETwork (radnet), which is capable of making accurate predictions of polarization and of electronic dielectric permittivity tensors in solids and aims to address limitations of previously available machine learning models for Raman predictions in periodic systems. This framework builds on previous, atom-centered approaches while utilizing deep convolutional neural networks. We report excellent accuracies on direct predictions for two prototypical examples: GaAs and BN. We then use automatic differentiation to efficiently calculate the Born-effective charges, longitudinal optical-transverse optical (LO-TO) splitting frequencies, and Raman tensors of these materials. We compute the Raman spectra, and find agreement with ab initio results. Lastly, we explore ways to generalize the predictions of polarization while taking into account periodic boundary conditions and symmetries.
Affiliation(s)
- Olivier Malenfant-Thuot
- Département de physique et Institut Courtois, Université de Montréal, Montréal, Québec, Canada
| | - Kevin Ryczko
- Department of Physics, University of Ottawa, Ottawa, Ontario, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
- SandboxAQ, Palo Alto, CA, United States of America
| | - Isaac Tamblyn
- Department of Physics, University of Ottawa, Ottawa, Ontario, Canada
- Vector Institute for Artificial Intelligence, Toronto, Ontario, Canada
| | - Michel Côté
- Département de physique et Institut Courtois, Université de Montréal, Montréal, Québec, Canada
| |
|
15
|
Biriukov D, Vácha R. Pathways to a Shiny Future: Building the Foundation for Computational Physical Chemistry and Biophysics in 2050. ACS Phys Chem Au 2024; 4:302-313. [PMID: 39069976 PMCID: PMC11274290 DOI: 10.1021/acsphyschemau.4c00003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2024] [Revised: 03/15/2024] [Accepted: 03/18/2024] [Indexed: 07/30/2024]
Abstract
In the last quarter-century, the field of molecular dynamics (MD) has undergone a remarkable transformation, propelled by substantial enhancements in software, hardware, and underlying methodologies. In this Perspective, we contemplate the future trajectory of MD simulations and how they might look in the year 2050. We spotlight the pivotal role of artificial intelligence (AI) in shaping the future of MD and the broader field of computational physical chemistry. We outline critical strategies and initiatives that are essential for the seamless integration of such technologies. Our discussion delves into topics like multiscale modeling, adept management of the ever-increasing data deluge, the establishment of centralized simulation databases, and the autonomous refinement, cross-validation, and self-expansion of these repositories. The successful implementation of these advancements requires scientific transparency, a cautiously optimistic approach to interpreting AI-driven simulations and their analysis, and a mindset that prioritizes knowledge-motivated research alongside AI-enhanced big data exploration. While history reminds us that the trajectory of technological progress can be unpredictable, this Perspective offers guidance on preparedness and proactive measures, aiming to steer future advancements in the most beneficial and successful direction.
Affiliation(s)
- Denys Biriukov
- CEITEC − Central European Institute of Technology, Masaryk University, Kamenice 753/5, 625 00 Brno, Czech Republic
- National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kamenice 753/5, 625 00 Brno, Czech Republic
| | - Robert Vácha
- CEITEC − Central European Institute of Technology, Masaryk University, Kamenice 753/5, 625 00 Brno, Czech Republic
- National Centre for Biomolecular Research, Faculty of Science, Masaryk University, Kamenice 753/5, 625 00 Brno, Czech Republic
- Department of Condensed Matter Physics, Faculty of Science, Masaryk University, Kotlářská 267/2, 611 37 Brno, Czech Republic
| |
|
16
|
McCandler C, Pihlajamäki A, Malola S, Häkkinen H, Persson KA. Gold-Thiolate Nanocluster Dynamics and Intercluster Reactions Enabled by a Machine Learned Interatomic Potential. ACS Nano 2024; 18:19014-19023. [PMID: 38986022 PMCID: PMC11271183 DOI: 10.1021/acsnano.4c03094] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Revised: 06/26/2024] [Accepted: 06/28/2024] [Indexed: 07/12/2024]
Abstract
Monolayer protected metal clusters comprise a rich class of molecular systems and are promising candidate materials for a variety of applications. While a growing number of protected nanoclusters have been synthesized and characterized in crystalline forms, their dynamical behavior in solution, including prenucleation cluster formation, is not well understood due to limitations both in characterization and first-principles modeling techniques. Recent advancements in machine-learned interatomic potentials are rapidly enabling the study of complex interactions such as dynamical behavior and reactivity on the nanoscale. Here, we develop an Au-S-C-H atomic cluster expansion (ACE) interatomic potential for efficient and accurate molecular dynamics simulations of thiolate-protected gold nanoclusters (Auₙ(SCH₃)ₘ). Trained on more than 30,000 density functional theory calculations of gold nanoclusters, the interatomic potential exhibits ab initio level accuracy in energies and forces and replicates nanocluster dynamics including thermal vibration and chiral inversion. Long dynamics simulations (up to 0.1 μs time scale) reveal a mechanism explaining the thermal instability of neutral Au₂₅(SR)₁₈ clusters. Specifically, we observe multiple stages of isomerization of the Au₂₅(SR)₁₈ cluster, including a chiral isomer. Additionally, we simulate coalescence of two Au₂₅(SR)₁₈ clusters and observe a series of clusters where the formation mechanisms are critically mediated by ligand exchange in the form of [Au-S]ₙ rings.
Affiliation(s)
- Caitlin A. McCandler
- Department of Materials Science and Engineering, University of California Berkeley, Berkeley, California 94720, United States
- Materials Science Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
| | - Antti Pihlajamäki
- Department of Physics, Nanoscience Center, University of Jyväskylä, FI 40014 Jyväskylä, Finland
| | - Sami Malola
- Department of Physics, Nanoscience Center, University of Jyväskylä, FI 40014 Jyväskylä, Finland
| | - Hannu Häkkinen
- Department of Physics, Nanoscience Center, University of Jyväskylä, FI 40014 Jyväskylä, Finland
- Department of Chemistry, Nanoscience Center, University of Jyväskylä, FI 40014 Jyväskylä, Finland
| | - Kristin A. Persson
- Department of Materials Science and Engineering, University of California Berkeley, Berkeley, California 94720, United States
- Molecular Foundry, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
| |
|
17
|
Xie JZ, Zhou XY, Jin B, Jiang H. Machine Learning Force Field-Aided Cluster Expansion Approach to Phase Diagram of Alloyed Materials. J Chem Theory Comput 2024; 20:6207-6217. [PMID: 38940547 DOI: 10.1021/acs.jctc.4c00463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024]
Abstract
First-principles approaches based on density functional theory (DFT) have played important roles in the theoretical study of multicomponent alloyed materials. Considering the highly demanding computational cost of direct DFT-based sampling of the configurational space, it is crucial to build efficient and low-cost surrogate Hamiltonian models with DFT accuracy for efficient simulation of alloyed systems with configurational disorder. Recently, the machine learning force field (MLFF) method has been proposed to tackle complicated multicomponent disordered systems. However, the importance of integrating significant physical considerations, including, in particular, convex hull preservation, which is the prerequisite for the accurate prediction of phase diagrams, into the training process of the MLFF remains rarely addressed. In this work, a workflow is proposed to train a convex-hull-preserved (CHP) MLFF for binary alloy systems, based on which the order-disorder phase boundary is predicted by using the Wang-Landau Monte Carlo (WLMC) technique. The predicted values for order-disorder phase transition temperatures agree well with the experiment. The CHP-MLFF is further used to build CE models with the same accuracy as the MLFF and higher efficiency in sampling configurational space. Using the results obtained from the MLFF-based WLMC simulation as a reference, the performances of different schemes for constructing CE models were evaluated in a transparent manner, which revealed the close correlation between the prediction accuracy of ground-state configurations and that of the order-disorder phase transition temperature. This work clearly indicates the great importance of reproducing the convex hull and energetics of ground-state configurations when constructing surrogate Hamiltonians for the statistical modeling of alloyed systems.
Affiliation(s)
- Jun-Zhong Xie
- Beijing National Laboratory for Molecular Sciences, State Key Laboratory of Rare Earth Material Chemistry and Application, Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, 100871 Beijing, China
| | - Xu-Yuan Zhou
- Beijing National Laboratory for Molecular Sciences, State Key Laboratory of Rare Earth Material Chemistry and Application, Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, 100871 Beijing, China
| | - Bin Jin
- Beijing National Laboratory for Molecular Sciences, State Key Laboratory of Rare Earth Material Chemistry and Application, Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, 100871 Beijing, China
| | - Hong Jiang
- Beijing National Laboratory for Molecular Sciences, State Key Laboratory of Rare Earth Material Chemistry and Application, Institute of Theoretical and Computational Chemistry, College of Chemistry and Molecular Engineering, Peking University, 100871 Beijing, China
| |
|
18
|
Bas TG, Duarte V. Biosimilars in the Era of Artificial Intelligence-International Regulations and the Use in Oncological Treatments. Pharmaceuticals (Basel) 2024; 17:925. [PMID: 39065775 PMCID: PMC11279612 DOI: 10.3390/ph17070925] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2024] [Revised: 07/02/2024] [Accepted: 07/03/2024] [Indexed: 07/28/2024] Open
Abstract
This research is based on three fundamental aspects of successful biosimilar development in the challenging biopharmaceutical market. First, biosimilar regulations in eight selected countries: Japan, South Korea, the United States, Canada, Brazil, Argentina, Australia, and South Africa, represent the four continents. The regulatory aspects of the countries studied are analyzed, highlighting the challenges facing biosimilars, including their complex approval processes and the need for standardized regulatory guidelines. There is an inconsistency depending on whether the biosimilar is used in a developed or developing country. In the countries observed, biosimilars are considered excellent alternatives to patent-protected biological products for the treatment of chronic diseases. In the second aspect addressed, various analytical AI modeling methods (such as machine learning tools, reinforcement learning, supervised, unsupervised, and deep learning tools) were analyzed to observe patterns that lead to the prevalence of biosimilars used in cancer to model the behaviors of the most prominent active compounds with spectroscopy. Finally, an analysis of the use of active compounds of biosimilars used in cancer and approved by the FDA and EMA was proposed.
Affiliation(s)
- Tomas Gabriel Bas
- Escuela de Ciencias Empresariales, Universidad Católica del Norte, Coquimbo 1781421, Chile;
| | | |
|
19
|
Yu R, Wang R. Learning dynamical systems from data: An introduction to physics-guided deep learning. Proc Natl Acad Sci U S A 2024; 121:e2311808121. [PMID: 38913886 PMCID: PMC11228478 DOI: 10.1073/pnas.2311808121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/26/2024] Open
Abstract
Modeling complex physical dynamics is a fundamental task in science and engineering. Traditional physics-based models are first-principled, explainable, and sample-efficient. However, they often rely on strong modeling assumptions and expensive numerical integration, requiring significant computational resources and domain expertise. While deep learning (DL) provides efficient alternatives for modeling complex dynamics, it requires a large amount of labeled training data. Furthermore, its predictions may disobey the governing physical laws and are difficult to interpret. Physics-guided DL aims to integrate first-principled physical knowledge into data-driven methods. It has the best of both worlds and is well equipped to better solve scientific problems. Recently, this field has made great progress and has drawn considerable interest across disciplines. Here, we introduce the framework of physics-guided DL with a special emphasis on learning dynamical systems. We describe the learning pipeline and categorize state-of-the-art methods under this framework. We also offer our perspectives on the open challenges and emerging opportunities.
Affiliation(s)
- Rose Yu
- Department of Computer Science and Engineering, University of California, San Diego, CA 92093
| | - Rui Wang
- Department of Computer Science and Engineering, University of California, San Diego, CA 92093
- Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139
| |
|
20
|
Dettmann LF, Kühn O, Ahmed AA. Martini-Based Coarse-Grained Soil Organic Matter Model Derived from Atomistic Simulations. J Chem Theory Comput 2024; 20:5291-5305. [PMID: 38831535 DOI: 10.1021/acs.jctc.4c00332] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2024]
Abstract
The significance of soil organic matter (SOM) in environmental contexts, particularly its role in pollutant adsorption, has prompted an increased utilization of molecular simulations to understand microscopic interactions. This study introduces a coarse-grained SOM model, parametrized within the framework of the versatile Martini 3 force field. Utilizing models generated by the Vienna Soil Organic Matter Modeler 2, which constructs humic substance systems from a fragment database, we employed Swarm-CG to parametrize the fragments and subsequently assembled them into macromolecules. Direct Boltzmann inversion (DBI) facilitated the determination of bonded parameters between fragments. The parametrization yielded favorable agreement in the radius of gyration and solvent-accessible surface area. Transfer free energies exhibited a strong correlation with hexadecane-water and chloroform-water values, albeit deviations were noted for octanol-water values. Comparing densities of modeled Leonardite humic acid systems at coarse-grained and atomistic levels revealed promising agreement, particularly at higher water concentrations. The DBI approach effectively reproduced average values of bonded interactions between fragments. Radial distribution functions between carboxylate groups and calcium ions showed partial agreement, however, reproducing certain peaks was challenging due to fixed bead sizes. Detailed analysis of atomistic systems revealed different configurations between the groups, explaining discrepancies. The present contribution provides a comprehensive insight into the properties, strengths, and weaknesses of the coarse-grained SOM model, serving as a foundation for future investigations encompassing pollutant interactions and varied SOM compositions.
Affiliation(s)
- Lorenz F Dettmann
- Institute of Physics, University of Rostock, Albert-Einstein-Street 23-24, Rostock D-18059, Germany
| | - Oliver Kühn
- Institute of Physics, University of Rostock, Albert-Einstein-Street 23-24, Rostock D-18059, Germany
- Department of Life, Light and Matter (LLM), University of Rostock, Albert-Einstein-Street 25, Rostock D-18059, Germany
| | - Ashour A Ahmed
- Department of Life, Light and Matter (LLM), University of Rostock, Albert-Einstein-Street 25, Rostock D-18059, Germany
| |
|
21
|
Mao M, Ahrens L, Luka J, Contreras F, Kurkina T, Bienstein M, Sárria Pereira de Passos M, Schirinzi G, Mehn D, Valsesia A, Desmet C, Serra MÁ, Gilliland D, Schwaneberg U. Material-specific binding peptides empower sustainable innovations in plant health, biocatalysis, medicine and microplastic quantification. Chem Soc Rev 2024; 53:6445-6510. [PMID: 38747901 DOI: 10.1039/d2cs00991a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/30/2024]
Abstract
Material-binding peptides (MBPs) have emerged as a diverse and innovation-enabling class of peptides in applications such as plant-/human health, immobilization of catalysts, bioactive coatings, accelerated polymer degradation and analytics for micro-/nanoplastics quantification. Progress has been fuelled by recent advancements in protein engineering methodologies and advances in computational and analytical methodologies, which allow the design of, for instance, material-specific MBPs with fine-tuned binding strength for numerous demands in material science applications. A genetic or chemical conjugation of second (biological, chemical or physical property-changing) functionality to MBPs empowers the design of advanced (hybrid) materials, bioactive coatings and analytical tools. In this review, we provide a comprehensive overview comprising naturally occurring MBPs and their function in nature, binding properties of short man-made MBPs (<20 amino acids) mainly obtained from phage-display libraries, and medium-sized binding peptides (20-100 amino acids) that have been reported to bind to metals, polymers or other industrially produced materials. The goal of this review is to provide an in-depth understanding of molecular interactions between materials and material-specific binding peptides, and thereby empower the use of MBPs in material science applications. Protein engineering methodologies and selected examples to tailor MBPs toward applications in agriculture with a focus on plant health, biocatalysis, medicine and environmental monitoring serve as examples of the transformative power of MBPs for various industrial applications. An emphasis will be given to MBPs' role in detecting and quantifying microplastics in high throughput, distinguishing microplastics from other environmental particles, and thereby assisting to close an analytical gap in food safety and monitoring of environmental plastic pollution. 
In essence, this review aims to give researchers from diverse disciplines an overview of material-(specific) binding of MBPs, protein engineering methodologies to tailor their properties to application demands, and re-engineering for material science applications using MBPs, and thereby to inspire researchers to employ MBPs in their research.
Affiliation(s)
- Maochao Mao
- Lehrstuhl für Biotechnologie, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany.
| | - Leon Ahrens
- Lehrstuhl für Biotechnologie, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany.
| | - Julian Luka
- Lehrstuhl für Biotechnologie, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany.
| | - Francisca Contreras
- Lehrstuhl für Biotechnologie, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany.
| | - Tetiana Kurkina
- Lehrstuhl für Biotechnologie, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany.
| | - Marian Bienstein
- Lehrstuhl für Biotechnologie, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany.
| | | | | | - Dora Mehn
- European Commission, Joint Research Centre (JRC), Ispra, Italy
| | - Andrea Valsesia
- European Commission, Joint Research Centre (JRC), Ispra, Italy
| | - Cloé Desmet
- European Commission, Joint Research Centre (JRC), Ispra, Italy
| | | | | | - Ulrich Schwaneberg
- Lehrstuhl für Biotechnologie, RWTH Aachen University, Worringerweg 3, 52074 Aachen, Germany.
| |
|
22
|
Fu W, Mo Y, Xiao Y, Liu C, Zhou F, Wang Y, Zhou J, Zhang YJ. Enhancing Molecular Energy Predictions with Physically Constrained Modifications to the Neural Network Potential. J Chem Theory Comput 2024; 20:4533-4544. [PMID: 38828925 DOI: 10.1021/acs.jctc.3c01181] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2024]
Abstract
Exclusively prioritizing the precision of energy prediction frequently proves inadequate in satisfying multifaceted requirements. A heightened focus is warranted on assessing the rationality of potential energy curves predicted by machine learning-based force fields (MLFFs), alongside evaluating the pragmatic utility of these MLFFs. This study introduces SWANI, an optimized neural network potential stemming from the ANI framework. Through the incorporation of supplementary physical constraints, SWANI aligns more cohesively with chemical expectations, yielding rational potential energy profiles. It also exhibits superior predictive precision compared with that of the ANI model. Additionally, a comprehensive comparison is conducted between SWANI and a prominent graph neural network-based model. The findings indicate that SWANI outperforms the latter, particularly for molecules exceeding the dimensions of the training set. This outcome underscores SWANI's exceptional capacity for generalization and its proficiency in handling larger molecular systems.
Affiliation(s)
- Weiqiang Fu
- Beijing StoneWise Technology Co., Ltd., Haidian Street 15, Haidian District, Beijing 100080, China
| | - Yujie Mo
- Beijing StoneWise Technology Co., Ltd., Haidian Street 15, Haidian District, Beijing 100080, China
| | - Yi Xiao
- Beijing StoneWise Technology Co., Ltd., Haidian Street 15, Haidian District, Beijing 100080, China
| | - Chang Liu
- Beijing StoneWise Technology Co., Ltd., Haidian Street 15, Haidian District, Beijing 100080, China
| | - Feng Zhou
- Beijing StoneWise Technology Co., Ltd., Haidian Street 15, Haidian District, Beijing 100080, China
| | - Yang Wang
- Beijing StoneWise Technology Co., Ltd., Haidian Street 15, Haidian District, Beijing 100080, China
| | - Jielong Zhou
- Beijing StoneWise Technology Co., Ltd., Haidian Street 15, Haidian District, Beijing 100080, China
| | - Yingsheng J Zhang
- Beijing StoneWise Technology Co., Ltd., Haidian Street 15, Haidian District, Beijing 100080, China
| |
|
23
|
Dong J, Wang S, Cui W, Sun X, Guo H, Yan H, Vogel H, Wang Z, Yuan S. Machine Learning Deciphered Molecular Mechanistics with Accurate Kinetic and Thermodynamic Prediction. J Chem Theory Comput 2024; 20:4499-4513. [PMID: 38394691 DOI: 10.1021/acs.jctc.3c01412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/25/2024]
Abstract
Time-lagged independent component analysis (tICA) and the Markov state model (MSM) have been extensively employed for extracting conformational dynamics and kinetic community networks from unbiased trajectory ensembles. However, these techniques may not be the optimal choice for elucidating transition mechanisms within low-dimensional representations, especially for intricate biosystems. Unraveling the association mechanism in such complex systems always necessitates permutations of several essential independent components or collective variables, a process that is inherently obscure and may require empirical knowledge for selection. To address these challenges, we have implemented an integrated unsupervised dimension reduction model: uniform manifold approximation and projection (UMAP) with hierarchy density-based spatial clustering of applications with noise (HDBSCAN). This approach effectively generates low-dimensional configurational embeddings. The hierarchical application of this architecture, in conjunction with MSM, reveals global kinetic connectivity while identifying local conformational states. Consequently, our methodology establishes a multiscale mechanistic elucidation framework. Leveraging the benefits of the uniform sample distribution and a denoising approach, our model demonstrates robustness in preserving global and local data structures compared to traditional dimension reduction methods in the field of MD analysis. The interpretability of hyperparameter selection and compatibility with downstream tasks are cross-validated across various simulation data sets, utilizing both computational evaluation metrics and experimental kinetic observables. Furthermore, the predicted Mcl1-BH3 association kinetics (0.76 s⁻¹) is in close agreement with surface plasmon resonance experiments (0.12 s⁻¹), affirming the plausibility of the identified pathway composed of representative conformations.
We anticipate that the devised workflow will serve as a foundational framework for studying recognition patterns in complex biological systems. Its contributions extend to the exploration of protein functional dynamics and rational drug design, offering a potent avenue for advancing research in these domains.
Affiliation(s)
- Junlin Dong
- Research Center for Computer-Aided Drug Discovery, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Shiyu Wang
- Research Center for Computer-Aided Drug Discovery, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- University of Chinese Academy of Sciences, Beijing 100049, China
- AlphaMol Science Ltd, Shenzhen 518055, China
| | - Wenqiang Cui
- Research Center for Computer-Aided Drug Discovery, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Xiaolin Sun
- Research Center for Computer-Aided Drug Discovery, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Haojie Guo
- Research Center for Computer-Aided Drug Discovery, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Hailu Yan
- School of Biological Sciences, College of Science and Engineering, University of Edinburgh, Edinburgh EH8 9YL, U.K.
| | - Horst Vogel
- Research Center for Computer-Aided Drug Discovery, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Zhi Wang
- Artificial Intelligence Department, Zhejiang Financial College, Hangzhou 310018, China
| | - Shuguang Yuan
- Research Center for Computer-Aided Drug Discovery, Institute of Biomedicine and Biotechnology, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
- AlphaMol Science Ltd, Shenzhen 518055, China
| |
|
24
|
Cavasotto CN, Di Filippo JI, Scardino V. Lessons learnt from machine learning in early stages of drug discovery. Expert Opin Drug Discov 2024; 19:631-633. [PMID: 38727031 DOI: 10.1080/17460441.2024.2354279] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/20/2024] [Accepted: 05/08/2024] [Indexed: 05/22/2024]
Affiliation(s)
- Claudio N Cavasotto
- Computational Drug Design and Biomedical Informatics Laboratory, Instituto de Investigaciones en Medicina Traslacional (IIMT), CONICET-Universidad Austral, Pilar, Buenos Aires, Argentina
- Facultad de Ciencias Biomédicas, Universidad Austral, Pilar, Buenos Aires, Argentina
- Austral Institute for Applied Artificial Intelligence, Universidad Austral, Pilar, Argentina
| | - Juan I Di Filippo
- Facultad de Ciencias Biomédicas, Universidad Austral, Pilar, Buenos Aires, Argentina
- Austral Institute for Applied Artificial Intelligence, Universidad Austral, Pilar, Argentina
- Meton AI, Inc, Wilmington, DE, USA
| | - Valeria Scardino
- Austral Institute for Applied Artificial Intelligence, Universidad Austral, Pilar, Argentina
- Meton AI, Inc, Wilmington, DE, USA
| |
|
25
|
Vu MH, Robert PA, Akbar R, Swiatczak B, Sandve GK, Haug DTT, Greiff V. Linguistics-based formalization of the antibody language as a basis for antibody language models. Nat Comput Sci 2024; 4:412-422. [PMID: 38877120 DOI: 10.1038/s43588-024-00642-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2022] [Accepted: 05/13/2024] [Indexed: 06/16/2024]
Abstract
Apparent parallels between natural language and antibody sequences have led to a surge in deep language models applied to antibody sequences for predicting cognate antigen recognition. However, a linguistic formal definition of antibody language does not exist, and insight into how antibody language models capture antibody-specific binding features remains largely uninterpretable. Here we describe how a linguistic formalization of the antibody language, by characterizing its tokens and grammar, could address current challenges in antibody language model rule mining.
Affiliation(s)
- Mai Ha Vu
- Department of Linguistics and Scandinavian Studies, University of Oslo, Oslo, Norway.
| | - Philippe A Robert
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Rahmad Akbar
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway
| | - Bartlomiej Swiatczak
- Department of History of Science and Scientific Archeology, University of Science and Technology of China, Hefei, China
| | - Victor Greiff
- Department of Immunology, University of Oslo and Oslo University Hospital, Oslo, Norway.
| |
|
26
|
Keller BG, Bolhuis PG. Dynamical Reweighting for Biased Rare Event Simulations. Annu Rev Phys Chem 2024; 75:137-162. [PMID: 38941527 DOI: 10.1146/annurev-physchem-083122-124538] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/30/2024]
Abstract
Dynamical reweighting techniques aim to recover the correct molecular dynamics from a simulation at a modified potential energy surface. They are important for unbiasing enhanced sampling simulations of molecular rare events. Here, we review the theoretical frameworks of dynamical reweighting for modified potentials. Based on an overview of kinetic models with increasing level of detail, we discuss techniques to reweight two-state dynamics, multistate dynamics, and path integrals. We explore the natural link to transition path sampling and how the effect of nonequilibrium forces can be reweighted. We end by providing an outlook on how dynamical reweighting integrates with techniques for optimizing collective variables and with modern potential energy surfaces.
Affiliation(s)
- Bettina G Keller
- Department of Biology, Chemistry and Pharmacy, Freie Universität Berlin, Berlin, Germany
| | - Peter G Bolhuis
- Van 't Hoff Institute for Molecular Sciences, University of Amsterdam, Amsterdam, The Netherlands
| |
|
27
|
Selloni A. Aqueous Titania Interfaces. Annu Rev Phys Chem 2024; 75:47-65. [PMID: 38271659 DOI: 10.1146/annurev-physchem-090722-015957] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/27/2024]
Abstract
Water-metal oxide interfaces are central to many phenomena and applications, ranging from material corrosion and dissolution to photoelectrochemistry and bioengineering. In particular, the discovery of photocatalytic water splitting on TiO2 has motivated intensive studies of water-TiO2 interfaces for decades. So far, a broad understanding of the interaction of water vapor with several TiO2 surfaces has been obtained. However, much less is known about liquid water-TiO2 interfaces, which are more relevant to many practical applications. Probing these complex systems at the molecular level is experimentally challenging and is sometimes possible only through computational studies. This review summarizes recent advances in the atomistic understanding, mostly through computational simulations, of the structure and dynamics of interfacial water on TiO2 surfaces. The main focus is on the nature, molecular or dissociated, of water in direct contact with low-index defect-free crystalline surfaces. The hydroxyls resulting from water dissociation are essential in the photooxidation of water and critically affect the surface chemistry of TiO2.
Affiliation(s)
- Annabella Selloni
- Department of Chemistry, Princeton University, Princeton, New Jersey, USA
| |
|
28
|
Yasuda I, Endo K, Arai N, Yasuoka K. In-layer inhomogeneity of molecular dynamics in quasi-liquid layers of ice. Commun Chem 2024; 7:117. [PMID: 38811834 PMCID: PMC11136980 DOI: 10.1038/s42004-024-01197-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2023] [Accepted: 05/02/2024] [Indexed: 05/31/2024] [Open Access]
Abstract
Quasi-liquid layers (QLLs) are present on the surface of ice and play a significant role in its distinctive chemical and physical properties. These layers exhibit considerable heterogeneity across different scales ranging from nanometers to millimeters. Although the formation of partially ice-like structures has been proposed, the molecular-level understanding of this heterogeneity remains unclear. Here, we examined the heterogeneity of molecular dynamics on QLLs based on molecular dynamics simulations and machine learning analysis of the simulation data. We demonstrated that the molecular dynamics of QLLs do not comprise a mixture of solid- and liquid water molecules. Rather, molecules having similar behaviors form dynamical domains that are associated with the dynamical heterogeneity of supercooled water. Nonetheless, molecules in the domains frequently switch their dynamical state. Furthermore, while there is no observable characteristic domain size, the long-range ordering strongly depends on the temperature and crystal face. Instead of a mixture of static solid- and liquid-like regions, our results indicate the presence of heterogeneous molecular dynamics in QLLs, which offers molecular-level insights into the surface properties of ice.
Affiliation(s)
- Ikki Yasuda
- Department of Mechanical Engineering, Keio University, Yokohama, Japan
| | - Katsuhiro Endo
- Department of Mechanical Engineering, Keio University, Yokohama, Japan
| | - Noriyoshi Arai
- Department of Mechanical Engineering, Keio University, Yokohama, Japan
| | - Kenji Yasuoka
- Department of Mechanical Engineering, Keio University, Yokohama, Japan.
| |
|
29
|
Grassano JS, Pickering I, Roitberg AE, González Lebrero MC, Estrin DA, Semelak JA. Assessment of Embedding Schemes in a Hybrid Machine Learning/Classical Potentials (ML/MM) Approach. J Chem Inf Model 2024; 64:4047-4058. [PMID: 38710065 DOI: 10.1021/acs.jcim.4c00478] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/08/2024]
Abstract
Machine learning (ML) methods have reached high accuracy levels for the prediction of in vacuo molecular properties. However, the simulation of large systems solely through ML methods (such as those based on neural network potentials) is still a challenge. In this context, one of the most promising frameworks for integrating ML schemes in the simulation of complex molecular systems is provided by the so-called ML/MM methods. These multiscale approaches combine ML methods with classical force fields (MM), in the same spirit as the successful hybrid quantum mechanics-molecular mechanics methods (QM/MM). The key issue for such ML/MM methods is an adequate description of the coupling between the region of the system described by ML and the region described at the MM level. In the context of QM/MM schemes, the main ingredient of the interaction is electrostatic, and the state of the art is the so-called electrostatic embedding. In this study, we analyze the quality of simpler mechanical embedding-based approaches, specifically focusing on their application within an ML/MM framework utilizing atomic partial charges derived in vacuo. Taking as reference electrostatic embedding calculations performed at a QM(DFT)/MM level, we explore different atomic charge schemes, as well as a polarization correction computed using atomic polarizabilities. Our benchmark data set comprises a set of about 80k small organic structures from the ANI-1x and ANI-2x databases, solvated in water. The results suggest that the minimal basis iterative stockholder (MBIS) atomic charges yield the best agreement with the reference coupling energy. Remarkable enhancements are achieved by including a simple polarization correction.
Affiliation(s)
- Juan S Grassano
- Facultad de Ciencias Exactas y Naturales, Departamento de Química Inorgánica, Analítica y Química Física, Universidad de Buenos Aires, Intendente Güiraldes 2160, Buenos Aires C1428EHA, Argentina
- CONICET─Universidad de Buenos Aires, Instituto de Química-Física de los Materiales, Medio Ambiente y Energía (INQUIMAE), Ciudad Universitaria, Pabellón 2, Buenos Aires C1428EHA, Argentina
| | - Ignacio Pickering
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
| | - Adrian E Roitberg
- CONICET─Universidad de Buenos Aires, Instituto de Química-Física de los Materiales, Medio Ambiente y Energía (INQUIMAE), Ciudad Universitaria, Pabellón 2, Buenos Aires C1428EHA, Argentina
- Department of Chemistry, University of Florida, Gainesville, Florida 32611, United States
| | - Mariano C González Lebrero
- Facultad de Ciencias Exactas y Naturales, Departamento de Química Inorgánica, Analítica y Química Física, Universidad de Buenos Aires, Intendente Güiraldes 2160, Buenos Aires C1428EHA, Argentina
- CONICET─Universidad de Buenos Aires, Instituto de Química-Física de los Materiales, Medio Ambiente y Energía (INQUIMAE), Ciudad Universitaria, Pabellón 2, Buenos Aires C1428EHA, Argentina
| | - Dario A Estrin
- Facultad de Ciencias Exactas y Naturales, Departamento de Química Inorgánica, Analítica y Química Física, Universidad de Buenos Aires, Intendente Güiraldes 2160, Buenos Aires C1428EHA, Argentina
- CONICET─Universidad de Buenos Aires, Instituto de Química-Física de los Materiales, Medio Ambiente y Energía (INQUIMAE), Ciudad Universitaria, Pabellón 2, Buenos Aires C1428EHA, Argentina
| | - Jonathan A Semelak
- Facultad de Ciencias Exactas y Naturales, Departamento de Química Inorgánica, Analítica y Química Física, Universidad de Buenos Aires, Intendente Güiraldes 2160, Buenos Aires C1428EHA, Argentina
- CONICET─Universidad de Buenos Aires, Instituto de Química-Física de los Materiales, Medio Ambiente y Energía (INQUIMAE), Ciudad Universitaria, Pabellón 2, Buenos Aires C1428EHA, Argentina
| |
|
30
|
Yan Z, Wei D, Li X, Chung LW. Accelerating reliable multiscale quantum refinement of protein-drug systems enabled by machine learning. Nat Commun 2024; 15:4181. [PMID: 38755151 PMCID: PMC11099068 DOI: 10.1038/s41467-024-48453-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/31/2023] [Accepted: 04/24/2024] [Indexed: 05/18/2024] [Open Access]
Abstract
Biomacromolecule structures are essential for drug development and biocatalysis. Quantum refinement (QR) methods, which employ reliable quantum mechanics (QM) methods in crystallographic refinement, showed promise in improving the structural quality or even correcting the structure of biomacromolecules. However, vast computational costs and complex quantum mechanics/molecular mechanics (QM/MM) setups limit QR applications. Here we incorporate robust machine learning potentials (MLPs) in multiscale ONIOM(QM:MM) schemes to describe the core parts (e.g., drugs/inhibitors), replacing the expensive QM method. Additionally, two levels of MLPs are combined for the first time to overcome MLP limitations. Our unique MLPs+ONIOM-based QR methods achieve QM-level accuracy with significantly higher efficiency. Furthermore, our refinements provide computational evidence for the existence of bonded and nonbonded forms of the Food and Drug Administration (FDA)-approved drug nirmatrelvir in one SARS-CoV-2 main protease structure. This study highlights that powerful MLPs accelerate QRs for reliable protein-drug complexes, promote broader QR applications and provide more atomistic insights into drug development.
Affiliation(s)
- Zeyin Yan
- Shenzhen Grubbs Institute, Department of Chemistry and Guangdong Provincial Key Laboratory of Catalysis, Southern University of Science and Technology, Shenzhen, 518055, China
| | - Dacong Wei
- Shenzhen Grubbs Institute, Department of Chemistry and Guangdong Provincial Key Laboratory of Catalysis, Southern University of Science and Technology, Shenzhen, 518055, China
| | - Xin Li
- Shenzhen Grubbs Institute, Department of Chemistry and Guangdong Provincial Key Laboratory of Catalysis, Southern University of Science and Technology, Shenzhen, 518055, China
| | - Lung Wa Chung
- Shenzhen Grubbs Institute, Department of Chemistry and Guangdong Provincial Key Laboratory of Catalysis, Southern University of Science and Technology, Shenzhen, 518055, China.
| |
|
31
|
da Hora GCA, Oh M, Nguyen JDM, Swanson JMJ. One Descriptor to Fold Them All: Harnessing Intuition and Machine Learning to Identify Transferable Lasso Peptide Reaction Coordinates. J Phys Chem B 2024; 128:4063-4075. [PMID: 38568862 PMCID: PMC11282586 DOI: 10.1021/acs.jpcb.3c08492] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/05/2024]
Abstract
Identifying optimal reaction coordinates for complex conformational changes and protein folding remains an outstanding challenge. This study combines collective variable (CV) discovery based on chemical intuition and machine learning with enhanced sampling to converge the folding free energy landscape of lasso peptides, a unique class of natural products with knot-like tertiary structures. This knotted scaffold imparts remarkable stability, making lasso peptides resistant to proteolytic degradation, thermal denaturation, and extreme pH conditions. Although their direct synthesis would enable therapeutic design, it has not yet been possible due to the improbable occurrence of spontaneous lasso folding. Thus, simulations characterizing the folding propensity are needed to identify strategies for increasing access to the lasso architecture by stabilizing the pre-lasso ensemble before isopeptide bond formation. Herein, harmonic linear discriminant analysis (HLDA) is combined with metadynamics-enhanced sampling to discover CVs capable of distinguishing the pre-lasso fold and converging the folding propensity. Intuitive CVs are compared to iterative rounds of HLDA to identify CVs that not only accomplish these goals for one lasso peptide but also seem to be transferable to others, establishing a protocol for the identification of folding reaction coordinates for lasso peptides.
Affiliation(s)
- Gabriel C A da Hora
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
| | - Myongin Oh
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
| | - John D M Nguyen
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
| | - Jessica M J Swanson
- Department of Chemistry, University of Utah, Salt Lake City, Utah 84112, United States
| |
|
32
|
Shi Z, Lele AD, Jasper AW, Klippenstein SJ, Ju Y. Quasi-Classical Trajectory Calculation of Rate Constants Using an Ab Initio Trained Machine Learning Model (aML-MD) with Multifidelity Data. J Phys Chem A 2024; 128:3449-3457. [PMID: 38642065 DOI: 10.1021/acs.jpca.4c00750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/22/2024]
Abstract
Machine learning (ML) provides a great opportunity for the construction of models with improved accuracy in classical molecular dynamics (MD). However, the accuracy of an ML-trained model is limited by the quality and quantity of the training data. Generating large sets of accurate ab initio training data can require significant computational resources. Furthermore, inconsistent or incompatible data with different accuracies obtained using different methods may lead to biased or unreliable ML models that do not accurately represent the underlying physics. Recently, transfer learning showed its potential for avoiding these problems as well as for improving the accuracy, efficiency, and generalization of ML models using multifidelity data. In this work, ab initio trained ML-based MD (aML-MD) models are developed through transfer learning using DFT and multireference data from multiple sources with varying accuracy within the Deep Potential MD framework. The accuracy of the force field is demonstrated by calculating rate constants for the H + HO2 → H2 + ³O2 reaction using quasi-classical trajectories. We show that the aML-MD model with transfer learning can accurately predict the rate constants while reducing the computational cost by more than five times compared to the use of more expensive quantum chemistry training data sets. Hence, the aML-MD model with transfer learning shows great potential in using multifidelity data to reduce the computational cost involved in generating the training set for these potentials.
Affiliation(s)
- Zhiyu Shi
- Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, New Jersey 08544, United States
| | - Aditya Dilip Lele
- Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, New Jersey 08544, United States
| | - Ahren W Jasper
- Chemical Sciences and Engineering Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
| | - Stephen J Klippenstein
- Chemical Sciences and Engineering Division, Argonne National Laboratory, Lemont, Illinois 60439, United States
| | - Yiguang Ju
- Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, New Jersey 08544, United States
| |
|
33
|
Janson G, Feig M. Transferable deep generative modeling of intrinsically disordered protein conformations. PLoS Comput Biol 2024; 20:e1012144. [PMID: 38781245 PMCID: PMC11152266 DOI: 10.1371/journal.pcbi.1012144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2024] [Revised: 06/05/2024] [Accepted: 05/07/2024] [Indexed: 05/25/2024] [Open Access]
Abstract
Intrinsically disordered proteins have dynamic structures through which they play key biological roles. The elucidation of their conformational ensembles is a challenging problem requiring an integrated use of computational and experimental methods. Molecular simulations are a valuable computational strategy for constructing structural ensembles of disordered proteins but are highly resource-intensive. Recently, machine learning approaches based on deep generative models that learn from simulation data have emerged as an efficient alternative for generating structural ensembles. However, such methods currently suffer from limited transferability when modeling sequences and conformations absent in the training data. Here, we develop a novel generative model that achieves high levels of transferability for intrinsically disordered protein ensembles. The approach, named idpSAM, is a latent diffusion model based on transformer neural networks. It combines an autoencoder to learn a representation of protein geometry and a diffusion model to sample novel conformations in the encoded space. IdpSAM was trained on a large dataset of simulations of disordered protein regions performed with the ABSINTH implicit solvent model. Thanks to the expressiveness of its neural networks and its training stability, idpSAM faithfully captures 3D structural ensembles of test sequences with no similarity in the training set. Our study also demonstrates the potential for generating full conformational ensembles from datasets with limited sampling and underscores the importance of training set size for generalization. We believe that idpSAM represents significant progress in transferable protein ensemble modeling through machine learning.
Affiliation(s)
- Giacomo Janson
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, United States of America
| | - Michael Feig
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan, United States of America
| |
|
34
|
Maqsood A, Chen C, Jacobsson TJ. The Future of Material Scientists in an Age of Artificial Intelligence. Adv Sci (Weinh) 2024; 11:e2401401. [PMID: 38477440 PMCID: PMC11109614 DOI: 10.1002/advs.202401401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Revised: 02/13/2024] [Indexed: 03/14/2024]
Abstract
Material science has historically evolved in tandem with advancements in technologies for characterization, synthesis, and computation. Another type of technology to add to this mix is machine learning (ML) and artificial intelligence (AI). Now increasingly sophisticated AI-models are seen that can solve progressively harder problems across a variety of fields. From a material science perspective, it is indisputable that machine learning and artificial intelligence offer a potent toolkit with the potential to substantially accelerate research efforts in areas such as the development and discovery of new functional materials. Less clear is how to best harness this development, what new skill sets will be required, and how it may affect established research practices. In this paper, those questions are explored with respect to increasingly sophisticated ML/AI-approaches. To structure the discussion, a conceptual framework of an AI-ladder is introduced. This AI-ladder ranges from basic data-fitting techniques to more advanced functionalities such as semi-autonomous experimentation, experimental design, knowledge generation, hypothesis formulation, and the orchestration of specialized AI modules as stepping-stones toward general artificial intelligence. This ladder metaphor provides a hierarchical framework for contemplating the opportunities, challenges, and evolving skill sets required to stay competitive in the age of artificial intelligence.
Affiliation(s)
- Ayman Maqsood
- Institute of Photoelectronic Thin Film Devices and Technology, Key Laboratory of Photoelectronic Thin Film Devices and Technology of Tianjin, College of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China
| | - Chen Chen
- Institute of Photoelectronic Thin Film Devices and Technology, Key Laboratory of Photoelectronic Thin Film Devices and Technology of Tianjin, College of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China
| | - T. Jesper Jacobsson
- Institute of Photoelectronic Thin Film Devices and Technology, Key Laboratory of Photoelectronic Thin Film Devices and Technology of Tianjin, College of Electronic Information and Optical Engineering, Nankai University, Tianjin 300350, China
- Department of Physics, Chemistry and Biology (IFM), Linköping University, Linköping 581 83, Sweden
| |
|
35
|
Howard L, Parker GD, Yu XY. Progress and Challenges of Additive Manufacturing of Tungsten and Alloys as Plasma-Facing Materials. Materials (Basel) 2024; 17:2104. [PMID: 38730911 PMCID: PMC11084790 DOI: 10.3390/ma17092104] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2024] [Revised: 04/25/2024] [Accepted: 04/25/2024] [Indexed: 05/13/2024]
Abstract
Tungsten (W) and W alloys are considered primary candidates for plasma-facing components (PFCs) that must perform in severe environments in terms of temperature, neutron fluxes, plasma effects, and irradiation bombardment. These materials are notoriously difficult to produce using additive manufacturing (AM) methods due to issues inherent to these techniques. The progress on applying AM techniques to W-based PFC applications is reviewed and the technical issues in selected manufacturing methods are discussed in this review. Specifically, we focus on the recent development and applications of laser powder bed fusion (LPBF), electron beam melting (EBM), and directed energy deposition (DED) in W materials due to their abilities to preserve the properties of W as potential PFCs. Additionally, the existing literature on irradiation effects on W and W alloys is surveyed, with possible solutions to those issues therein addressed. Finally, the gaps in possible future research on additively manufactured W are identified and outlined.
Affiliation(s)
- Logan Howard
- Materials Science and Technology Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
- The Bredesen Center, 310 Ferris Hall 1508 Middle Dr, Knoxville, TN 37996, USA
| | - Gabriel D. Parker
- Materials Science and Technology Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
| | - Xiao-Ying Yu
- Materials Science and Technology Division, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
- The Bredesen Center, 310 Ferris Hall 1508 Middle Dr, Knoxville, TN 37996, USA
| |
|
36
|
Zhai Y, Rashmi R, Palos E, Paesani F. Many-body interactions and deep neural network potentials for water. J Chem Phys 2024; 160:144501. [PMID: 38587225 DOI: 10.1063/5.0203682] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Accepted: 03/23/2024] [Indexed: 04/09/2024] [Open Access]
Abstract
We present a detailed assessment of deep neural network potentials developed within the Deep Potential Molecular Dynamics (DeePMD) framework and trained on the MB-pol data-driven many-body potential energy function. Specific focus is directed at the ability of DeePMD-based potentials to correctly reproduce the accuracy of MB-pol across various water systems. Analyses of bulk and interfacial properties as well as many-body interactions characteristic of water elucidate inherent limitations in the transferability and predictive accuracy of DeePMD-based potentials. These limitations can be traced back to an incomplete implementation of the "nearsightedness of electronic matter" principle, which may be common throughout machine learning potentials that do not include a proper representation of self-consistently determined long-range electric fields. These findings provide further support for the "short-blanket dilemma" faced by DeePMD-based potentials, highlighting the challenges in achieving a balance between computational efficiency and a rigorous, physics-based representation of the properties of water. Finally, we believe that our study contributes to the ongoing discourse on the development and application of machine learning models in simulating water systems, offering insights that could guide future improvements in the field.
Affiliation(s)
- Yaoguang Zhai
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, California 92093, USA
| | - Richa Rashmi
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
| | - Etienne Palos
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
| | - Francesco Paesani
- Department of Chemistry and Biochemistry, University of California San Diego, La Jolla, California 92093, USA
- Materials Science and Engineering, University of California San Diego, La Jolla, California 92093, USA
- Halicioğlu Data Science Institute, University of California San Diego, La Jolla, California 92093, USA
- San Diego Supercomputer Center, University of California San Diego, La Jolla, California 92093, USA
| |
|
37
|
Vani BP, Aranganathan A, Tiwary P. Exploring Kinase Asp-Phe-Gly (DFG) Loop Conformational Stability with AlphaFold2-RAVE. J Chem Inf Model 2024; 64:2789-2797. [PMID: 37981824 PMCID: PMC11001530 DOI: 10.1021/acs.jcim.3c01436] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 11/21/2023]
Abstract
Kinases compose one of the largest fractions of the human proteome, and their misfunction is implicated in many diseases, in particular, cancers. The ubiquitousness and structural similarities of kinases make specific and effective drug design difficult. In particular, conformational variability due to the evolutionarily conserved Asp-Phe-Gly (DFG) motif adopting in and out conformations and the relative stabilities thereof are key in structure-based drug design for ATP competitive drugs. These relative conformational stabilities are extremely sensitive to small changes in sequence and provide an important problem for sampling method development. Since the invention of AlphaFold2, the world of structure-based drug design has noticeably changed. In spite of it being limited to crystal-like structure prediction, several methods have also leveraged its underlying architecture to improve dynamics and enhanced sampling of conformational ensembles, including AlphaFold2-RAVE. Here, we extend AlphaFold2-RAVE and apply it to a set of kinases: the wild type DDR1 sequence and three mutants with single point mutations that are known to behave drastically differently. We show that AlphaFold2-RAVE is able to efficiently recover the changes in relative stability using transferable learned order parameters and potentials, thereby supplementing AlphaFold2 as a tool for exploration of Boltzmann-weighted protein conformations (Meller, A.; Bhakat, S.; Solieva, S.; Bowman, G. R. Accelerating Cryptic Pocket Discovery Using AlphaFold. J. Chem. Theory Comput. 2023, 19, 4355-4363).
Affiliation(s)
- Bodhi P. Vani
- Institute for Physical Science and Technology, University of Maryland, College Park, Maryland 20742, USA
| | - Akashnathan Aranganathan
- Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
| | - Pratyush Tiwary
- Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
| |
|
38
|
Unke OT, Stöhr M, Ganscha S, Unterthiner T, Maennel H, Kashubin S, Ahlin D, Gastegger M, Medrano Sandonas L, Berryman JT, Tkatchenko A, Müller KR. Biomolecular dynamics with machine-learned quantum-mechanical force fields trained on diverse chemical fragments. Sci Adv 2024; 10:eadn4397. [PMID: 38579003 DOI: 10.1126/sciadv.adn4397] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2023] [Accepted: 02/29/2024] [Indexed: 04/07/2024]
Abstract
The GEMS method enables molecular dynamics simulations of large heterogeneous systems at ab initio quality.
Affiliation(s)
- Oliver T Unke
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- DFG Cluster of Excellence "Unifying Systems in Catalysis" (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
| | - Martin Stöhr
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Stefan Ganscha
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
| | - Thomas Unterthiner
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
| | - Hartmut Maennel
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
| | - Sergii Kashubin
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
| | - Daniel Ahlin
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
| | - Michael Gastegger
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- DFG Cluster of Excellence "Unifying Systems in Catalysis" (UniSysCat), Technische Universität Berlin, 10623 Berlin, Germany
- BASLEARN - TU Berlin/BASF Joint Lab for Machine Learning, Technische Universität Berlin, 10587 Berlin, Germany
| | - Leonardo Medrano Sandonas
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Joshua T Berryman
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Alexandre Tkatchenko
- Department of Physics and Materials Science, University of Luxembourg, L-1511 Luxembourg City, Luxembourg
| | - Klaus-Robert Müller
- Google DeepMind, Tucholskystraße 2, 10117 Berlin, Germany and Brandschenkestrasse 110, 8002 Zürich, Switzerland
- Machine Learning Group, Technische Universität Berlin, 10587 Berlin, Germany
- Department of Artificial Intelligence, Korea University, Anam-dong, Seongbuk-gu, Seoul 02841, Korea
- Max Planck Institute for Informatics, Stuhlsatzenhausweg, 66123 Saarbrücken, Germany
- BIFOLD - Berlin Institute for the Foundations of Learning and Data, Berlin, Germany
| |
|
39
|
Nandi S, Bhaduri S, Das D, Ghosh P, Mandal M, Mitra P. Deciphering the Lexicon of Protein Targets: A Review on Multifaceted Drug Discovery in the Era of Artificial Intelligence. Mol Pharm 2024; 21:1563-1590. [PMID: 38466810 DOI: 10.1021/acs.molpharmaceut.3c01161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
Abstract
Understanding protein sequence and structure is essential for understanding protein-protein interactions (PPIs), which are essential for many biological processes and diseases. Targeting protein binding hot spots, which regulate signaling and growth, with rational drug design is promising. Rational drug design uses structural data and computational tools to study protein binding sites and protein interfaces to design inhibitors that can change these interactions, thereby potentially leading to therapeutic approaches. Artificial intelligence (AI), such as machine learning (ML) and deep learning (DL), has advanced drug discovery and design by providing computational resources and methods. Quantum chemistry is essential for drug reactivity, toxicology, drug screening, and quantitative structure-activity relationship (QSAR) properties. This review discusses the methodologies and challenges of identifying and characterizing hot spots and binding sites. It also explores the strategies and applications of artificial-intelligence-based rational drug design technologies that target proteins and protein-protein interaction (PPI) binding hot spots. It provides valuable insights for drug design with therapeutic implications. We have also demonstrated the pathological conditions of heat shock protein 27 (HSP27) and matrix metalloproteinases (MMP2 and MMP9) and designed inhibitors of these proteins using the drug discovery paradigm in a case study on the discovery of drug molecules for cancer treatment. Additionally, the implications of benzothiazole derivatives for anticancer drug design and discovery are deliberated.
Affiliation(s)
- Suvendu Nandi
- School of Medical Science and Technology, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
| | - Soumyadeep Bhaduri
- Centre for Computational and Data Sciences, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
| | - Debraj Das
- Centre for Computational and Data Sciences, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
| | - Priya Ghosh
- School of Medical Science and Technology, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
| | - Mahitosh Mandal
- School of Medical Science and Technology, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
| | - Pralay Mitra
- Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
| |
|
40
|
Wang H, Ying J, Liu J, Yu T, Huang D. Harnessing ResNet50 and SENet for enhanced ankle fracture identification. BMC Musculoskelet Disord 2024; 25:250. [PMID: 38561697 PMCID: PMC10983628 DOI: 10.1186/s12891-024-07355-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/29/2023] [Accepted: 03/13/2024] [Indexed: 04/04/2024] Open
Abstract
BACKGROUND Ankle fractures are prevalent injuries that necessitate precise diagnostic tools. Traditional diagnostic methods have limitations that can be addressed using machine learning techniques, with the potential to improve accuracy and expedite diagnoses. METHODS We trained various deep learning architectures, notably the Adapted ResNet50 with SENet capabilities, to identify ankle fractures using a curated dataset of radiographic images. Model performance was evaluated using common metrics like accuracy, precision, and recall. Additionally, Grad-CAM visualizations were employed to interpret model decisions. RESULTS The Adapted ResNet50 with SENet capabilities consistently outperformed other models, achieving an accuracy of 93%, AUC of 95%, and recall of 92%. Grad-CAM visualizations provided insights into areas of the radiographs that the model deemed significant in its decisions. CONCLUSIONS The Adapted ResNet50 model enhanced with SENet capabilities demonstrated superior performance in detecting ankle fractures, offering a promising tool to complement traditional diagnostic methods. However, continuous refinement and expert validation are essential to ensure optimal application in clinical settings.
Grants
- 2020AS0031 Science and Technology Projects in the Field of Agriculture and Social Development in Yinzhou District, Ningbo City, Zhejiang Province, China
Affiliation(s)
- Hua Wang
- Department of Medical Imaging, Ningbo No. 6 Hospital, Ningbo, China
| | - Jichong Ying
- Department of Orthopedics, Ningbo No. 6 Hospital, Ningbo, China
| | - Jianlei Liu
- Department of Orthopedics, Ningbo No. 6 Hospital, Ningbo, China
| | - Tianming Yu
- Department of Orthopedics, Ningbo No. 6 Hospital, Ningbo, China
| | - Dichao Huang
- Department of Orthopedics, Ningbo No. 6 Hospital, Ningbo, China.
| |
|
41
|
Wen M, Chang X, Xu Y, Chen D, Chu Q. Determining the mechanical and decomposition properties of high energetic materials (α-RDX, β-HMX, and ε-CL-20) using a neural network potential. Phys Chem Chem Phys 2024; 26:9984-9997. [PMID: 38477375 DOI: 10.1039/d4cp00017j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/14/2024]
Abstract
Molecular simulations of high energetic materials (HEMs) are limited by efficiency and accuracy. Recently, neural network potential (NNP) models have achieved molecular simulations of millions of atoms while maintaining the accuracy of density functional theory (DFT) levels. Herein, an NNP model covering typical HEMs containing C, H, N, and O elements is developed. The mechanical and decomposition properties of 1,3,5-trinitroperhydro-1,3,5-triazine (RDX), octahydro-1,3,5,7-tetranitro-1,3,5,7-tetrazocine (HMX), and 2,4,6,8,10,12-hexanitrohexaazaisowurtzitane (CL-20) are determined by employing molecular dynamics (MD) simulations based on the NNP model. The calculated results show that the mechanical properties of α-RDX, β-HMX, and ε-CL-20 agree with previous experiments and theoretical results, including cell parameters, equations of state, and elastic constants. In the thermal decomposition simulations, it is also found that the initial decomposition reactions of the three crystals are N-NO2 homolysis, formation of the corresponding radical intermediates, and NO2-induced reactions. The decomposition trajectory divides into two main stages, separated by the peak in NO2: pyrolysis and oxidation. Overall, the NNP model for C/H/N/O elements in this work is an alternative reactive force field for RDX, HMX, and CL-20 HEMs, and it opens up new potential for future kinetic study of nitramine explosives.
Affiliation(s)
- Mingjie Wen
- State Key Laboratory of Explosion Science and Safety Protection, Beijing Institute of Technology, Beijing 100081, P. R. China.
| | - Xiaoya Chang
- State Key Laboratory of Explosion Science and Safety Protection, Beijing Institute of Technology, Beijing 100081, P. R. China.
| | - Yabei Xu
- State Key Laboratory of Explosion Science and Safety Protection, Beijing Institute of Technology, Beijing 100081, P. R. China.
| | - Dongping Chen
- State Key Laboratory of Explosion Science and Safety Protection, Beijing Institute of Technology, Beijing 100081, P. R. China.
| | - Qingzhao Chu
- State Key Laboratory of Explosion Science and Safety Protection, Beijing Institute of Technology, Beijing 100081, P. R. China.
| |
|
42
|
Lelièvre T, Pigeon T, Stoltz G, Zhang W. Analyzing Multimodal Probability Measures with Autoencoders. J Phys Chem B 2024; 128:2607-2631. [PMID: 38466759 DOI: 10.1021/acs.jpcb.3c07075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
Abstract
Finding collective variables to describe some important coarse-grained information on physical systems, in particular metastable states, remains a key issue in molecular dynamics. Recently, machine learning techniques have been intensively used to complement and possibly bypass expert knowledge in order to construct collective variables. Our focus here is on neural network approaches based on autoencoders. We study some relevant mathematical properties of the loss function considered for training autoencoders and provide physical interpretations based on conditional variances and minimum energy paths. We also consider various extensions in order to better describe physical systems, by incorporating more information on transition states at saddle points, and/or allowing for multiple decoders in order to describe several transition paths. Our results are illustrated on toy two-dimensional systems and on alanine dipeptide.
Affiliation(s)
- Tony Lelièvre
- CERMICS, École des Ponts ParisTech, 6-8 Avenue Blaise Pascal, 77455 Marne-la-Vallée, France
- MATHERIALS Team-project, Inria Paris, 2 Rue Simone Iff, 75012 Paris, France
| | - Thomas Pigeon
- CERMICS, École des Ponts ParisTech, 6-8 Avenue Blaise Pascal, 77455 Marne-la-Vallée, France
- MATHERIALS Team-project, Inria Paris, 2 Rue Simone Iff, 75012 Paris, France
- IFP Energies Nouvelles, Rond-Point de l'Echangeur de Solaize, BP 3, 69360 Solaize, France
| | - Gabriel Stoltz
- CERMICS, École des Ponts ParisTech, 6-8 Avenue Blaise Pascal, 77455 Marne-la-Vallée, France
- MATHERIALS Team-project, Inria Paris, 2 Rue Simone Iff, 75012 Paris, France
| | - Wei Zhang
- Department of Mathematics and Computer Science, Freie Universität Berlin, Arnimallee 14, 14195 Berlin, Germany
- Zuse Institute Berlin, Takustraße 7, 14195 Berlin, Germany
| |
|
43
|
Zou Z, Tiwary P. Enhanced Sampling of Crystal Nucleation with Graph Representation Learnt Variables. J Phys Chem B 2024. [PMID: 38502931 DOI: 10.1021/acs.jpcb.4c00080] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/21/2024]
Abstract
In this study, we present a graph neural network (GNN)-based learning approach using an autoencoder setup to derive low-dimensional variables from features observed in experimental crystal structures. These variables are then biased in enhanced sampling to observe state-to-state transitions and reliable thermodynamic weights. In our approach, we used simple convolution and pooling methods. To verify the effectiveness of our protocol, we examined the nucleation of various allotropes and polymorphs of iron and glycine in their molten states. Our graph latent variables, when biased in well-tempered metadynamics, consistently show transitions between states and achieve accurate thermodynamic rankings in agreement with experiments, both of which are indicators of dependable sampling. This underscores the strength and promise of our GNN variables for improved sampling. The protocol shown here should be applicable for other systems and other sampling methods.
Affiliation(s)
- Ziyue Zou
- Department of Chemistry and Biochemistry, University of Maryland, College Park 20742, Maryland, United States
| | - Pratyush Tiwary
- Department of Chemistry and Biochemistry, University of Maryland, College Park 20742, Maryland, United States
- Institute for Physical Science and Technology, University of Maryland, College Park 20742, Maryland, United States
- University of Maryland Institute for Health Computing, Rockville, Maryland 20852, United States
| |
|
44
|
Panchagnula K, Graf D, Albertani FEA, Thom AJW. Translational eigenstates of He@C60 from four-dimensional ab initio potential energy surfaces interpolated using Gaussian process regression. J Chem Phys 2024; 160:104303. [PMID: 38465682 DOI: 10.1063/5.0197903] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2024] [Accepted: 02/22/2024] [Indexed: 03/12/2024] Open
Abstract
We investigate the endofullerene system ³He@C60 with a four-dimensional potential energy surface (PES) to include the three He translational degrees of freedom and C60 cage radius. We compare second order Møller-Plesset perturbation theory (MP2), spin component scaled-MP2, scaled opposite spin-MP2, random phase approximation (RPA)@Perdew, Burke, and Ernzerhof (PBE), and corrected Hartree-Fock-RPA to calibrate and gain confidence in the choice of electronic structure method. Due to the high cost of these calculations, the PES is interpolated using Gaussian Process Regression (GPR), owing to its effectiveness with sparse training data. The PES is split into a two-dimensional radial surface, to which corrections are applied to achieve an overall four-dimensional surface. The nuclear Hamiltonian is diagonalized to generate the in-cage translational/vibrational eigenstates. The degeneracy of the three-dimensional harmonic oscillator energies with principal quantum number n is lifted due to the anharmonicity in the radial potential. The (2l + 1)-fold degeneracy of the angular momentum states is also weakly lifted, due to the angular dependence in the potential. We calculate the fundamental frequency to range between 96 and 110 cm⁻¹ depending on the electronic structure method used. Error bars of the eigenstate energies were calculated from the GPR and are on the order of ±1.5 cm⁻¹. Wavefunctions are also compared by considering their overlap and Hellinger distance to the one-dimensional empirical potential. As with the energies, the two ab initio methods MP2 and RPA@PBE show the best agreement. Although MP2 agrees better than RPA@PBE, we recommend RPA as an alternative electronic structure method of choice to MP2 for these systems because of its higher computational efficiency and comparable performance.
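As background for the degeneracy statement in this abstract, the following minimal sketch counts the states in shell n of an ideal isotropic three-dimensional harmonic oscillator. It is textbook material rather than anything taken from the paper, and it assumes the convention n = nx + ny + nz, so that shell n contains l = n, n - 2, ..., down to 0 or 1. The function names are illustrative only. In the ideal case each shell holds (n + 1)(n + 2)/2 states; anharmonicity in the radial potential separates the different l values within a shell, and angular dependence can further split the 2l + 1 sublevels of each l.

```python
from itertools import product


def cartesian_degeneracy(n: int) -> int:
    """Count Cartesian states (nx, ny, nz) with nx + ny + nz == n."""
    return sum(
        1
        for nx, ny, nz in product(range(n + 1), repeat=3)
        if nx + ny + nz == n
    )


def allowed_l(n: int) -> list[int]:
    """Angular momenta present in shell n: l = n, n - 2, ..., 0 or 1."""
    return list(range(n, -1, -2))


def spherical_degeneracy(n: int) -> int:
    """Sum the 2l + 1 magnetic sublevels over the allowed l values."""
    return sum(2 * l + 1 for l in allowed_l(n))


# Both countings agree with the closed form (n + 1)(n + 2) / 2.
for n in range(4):
    print(n, allowed_l(n), cartesian_degeneracy(n), spherical_degeneracy(n))
```

For example, shell n = 2 contains l = 2 and l = 0, so its six ideal-oscillator states separate into a five-fold and a one-fold level once the radial potential is anharmonic.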
Affiliation(s)
- K Panchagnula
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
| | - D Graf
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
| | - F E A Albertani
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
| | - A J W Thom
- Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
| |
|
45
|
Noordhoek K, Bartel CJ. Accelerating the prediction of inorganic surfaces with machine learning interatomic potentials. Nanoscale 2024. [PMID: 38470833 DOI: 10.1039/d3nr06468a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/14/2024]
Abstract
The surface properties of solid-state materials often dictate their functionality, especially for applications where nanoscale effects become important. The relevant surface(s) and their properties are determined, in large part, by the material's synthesis or operating conditions. These conditions dictate thermodynamic driving forces and kinetic rates responsible for yielding the observed surface structure and morphology. Computational surface science methods have long been applied to connect thermochemical conditions to surface phase stability, particularly in the heterogeneous catalysis and thin film growth communities. This review provides a brief introduction to first-principles approaches to compute surface phase diagrams before introducing emerging data-driven approaches. The remainder of the review focuses on the application of machine learning, predominantly in the form of learned interatomic potentials, to study complex surfaces. As machine learning algorithms and large datasets on which to train them become more commonplace in materials science, computational methods are poised to become even more predictive and powerful for modeling the complexities of inorganic surfaces at the nanoscale.
Affiliation(s)
- Kyle Noordhoek
- Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, MN, 55455, USA.
| | - Christopher J Bartel
- Department of Chemical Engineering and Materials Science, University of Minnesota, Minneapolis, MN, 55455, USA.
| |
|
46
|
Song Z, Han J, Henkelman G, Li L. Charge-Optimized Electrostatic Interaction Atom-Centered Neural Network Algorithm. J Chem Theory Comput 2024; 20:2088-2097. [PMID: 38380601 DOI: 10.1021/acs.jctc.3c01254] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/22/2024]
Abstract
Machine-learning algorithms have been proposed to capture electrostatic interactions by using effective partial charges. These algorithms often rely on a pretrained model for partial charge prediction using density functional theory-calculated partial charges as references, which introduces complexity to the force field model. The accuracy of the trained model also depends on the reliability of charge partition methods, which can be dependent on the specific system and methodology employed. In this study, we propose an atom-centered neural network (ANN) algorithm that eliminates the need for reference charges. Our algorithm requires only a single NN model for each element to obtain both atomic energy and charges. These atomic charges are then employed to compute electrostatic energies using the Ewald summation algorithm. Subsequently, the force field model is trained on total energy and forces, with the inclusion of electrostatic energy. To evaluate the performance of our algorithm, we conducted tests on three benchmark systems, including a Ge slab with an O adatom system, a TiO2 crystalline system, and a Pd-O nanoparticle system. Our results demonstrate reasonably accurate predictions of partial charges and electrostatic interactions. This algorithm provides a self-consistent charge prediction strategy and possibilities for robust and reliable modeling of electrostatic interactions in machine-learning potentials.
Affiliation(s)
- Zichen Song
- Shenzhen Key Laboratory of Micro/Nano-Porous Functional Materials (SKLPM), Department of Materials Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
- Department of Materials Science and Engineering, City University of Hong Kong, Kowloon, Hong Kong, China
| | - Jian Han
- Department of Materials Science and Engineering, City University of Hong Kong, Kowloon, Hong Kong, China
| | - Graeme Henkelman
- Department of Chemistry, the University of Texas at Austin, Austin, Texas 78712, United States
- Institute for Computational Engineering and Sciences, the University of Texas at Austin, Austin, Texas 78712, United States
| | - Lei Li
- Shenzhen Key Laboratory of Micro/Nano-Porous Functional Materials (SKLPM), Department of Materials Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China
| |
|
47
|
Beck TL, Carloni P, Asthagiri DN. All-Atom Biomolecular Simulation in the Exascale Era. J Chem Theory Comput 2024; 20:1777-1782. [PMID: 38382017 DOI: 10.1021/acs.jctc.3c01276] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
Abstract
Exascale supercomputers have opened the door to dynamic simulations, facilitated by AI/ML techniques, that model biomolecular motions over unprecedented length and time scales. This new capability holds the potential to revolutionize our understanding of fundamental biological processes. Here we report on some of the major advances that were discussed at a recent CECAM workshop in Pisa, Italy, on the topic with a primary focus on atomic-level simulations. First, we highlight examples of current large-scale biomolecular simulations and the future possibilities enabled by crossing the exascale threshold. Next, we discuss challenges to be overcome in optimizing the usage of these powerful resources. Finally, we close by listing several grand challenge problems that could be investigated with this new computer architecture.
Affiliation(s)
- Thomas L Beck
- National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830, United States
| | - Paolo Carloni
- INM-9/IAS-5 Computational Biomedicine, Forschungszentrum Jülich, Wilhelm-Johnen-Straße, D-54245 Jülich, Germany
- Department of Physics, RWTH Aachen University, D-52078 Aachen, Germany
| | - Dilipkumar N Asthagiri
- National Center for Computational Sciences, Oak Ridge National Laboratory, Oak Ridge, Tennessee 37830, United States
| |
|
48
|
Bai F, Li S, Li H. AI enhances drug discovery and development. Natl Sci Rev 2024; 11:nwad303. [PMID: 38440073 PMCID: PMC10911811 DOI: 10.1093/nsr/nwad303] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/09/2023] [Revised: 11/24/2023] [Accepted: 11/27/2023] [Indexed: 03/06/2024] Open
Affiliation(s)
- Fang Bai
- Shanghai Institute for Advanced Immunochemical Studies and School of Life Science and Technology, ShanghaiTech University, China
- Lingang Laboratory, China
| | - Shiliang Li
- Innovation Center for AI and Drug Discovery, East China Normal University, China
- Lingang Laboratory, China
| | - Honglin Li
- Innovation Center for AI and Drug Discovery, East China Normal University, China
- Lingang Laboratory, China
| |
|
49
|
Borówko M. Special Issue "Third Edition: Advances in Molecular Simulation". Int J Mol Sci 2024; 25:2709. [PMID: 38473956 DOI: 10.3390/ijms25052709] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Accepted: 02/19/2024] [Indexed: 03/14/2024] Open
Abstract
Molecular simulation is one of the fastest growing fields in science [...].
Affiliation(s)
- Małgorzata Borówko
- Department of Theoretical Chemistry, Institute of Chemical Sciences, Faculty of Chemistry, Maria Curie-Skłodowska University, 20-031 Lublin, Poland
| |
|
50
|
Montes de Oca-Estévez MJ, Valdés Á, Prosmiti R. A kernel-based machine learning potential and quantum vibrational state analysis of the cationic Ar hydride (Ar2H+). Phys Chem Chem Phys 2024; 26:7060-7071. [PMID: 38345626 DOI: 10.1039/d3cp05865d] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2024]
Abstract
One of the most fascinating discoveries in recent years, in the cold and low pressure regions of the universe, was the detection of ArH+ and HeH+ species. The identification of such noble gas-containing molecules in space is the key to understanding noble gas chemistry. In the present work, we discuss the possibility of [Ar2H]+ existence as a potentially detectable molecule in the interstellar medium, providing new data on possible astronomical pathways and energetics of this compound. As a first step, a data-driven approach is proposed to construct a full 3D machine-learning potential energy surface (ML-PES) via the reproducing kernel Hilbert space (RKHS) method. The training and testing data sets are generated from CCSD(T)/CBS[56] computations, while a validation protocol is introduced to ensure the quality of the potential. In turn, the resulting ML-PES is employed to compute vibrational levels and molecular spectroscopic constants for the cation. In this way, the most common isotopologue in ISM, [36Ar2H]+, was characterized for the first time, while simultaneously, comparisons with previously reported values available for [40Ar2H]+ are discussed. Our present data could serve as a benchmark for future studies on this system, as well as on higher-order cationic Ar-hydrides of astrophysical interest.
Affiliation(s)
- María Judit Montes de Oca-Estévez
- Institute of Fundamental Physics (IFF-CSIC), CSIC, Serrano 123, 28006 Madrid, Spain.
- Atelgraphics S.L., Mota de Cuervo 42, 28043, Madrid, Spain
| | - Álvaro Valdés
- Escuela de Física, Universidad Nacional de Colombia, Sede Medellín, A. A., 3840, Medellín, Colombia
| | - Rita Prosmiti
- Institute of Fundamental Physics (IFF-CSIC), CSIC, Serrano 123, 28006 Madrid, Spain.
| |
|