1
Xie P, E W. Coarse-Graining Conformational Dynamics with Multidimensional Generalized Langevin Equation: How, When, and Why. J Chem Theory Comput 2024. [PMID: 39258946 DOI: 10.1021/acs.jctc.4c00729] [Indexed: 09/12/2024]
Abstract
A data-driven ab initio generalized Langevin equation (AIGLE) approach is developed to learn and simulate high-dimensional, heterogeneous, coarse-grained (CG) conformational dynamics. Constrained by the fluctuation-dissipation theorem, the approach can build CG models in dynamical consistency (DC) with all-atom molecular dynamics. We also propose practical criteria for AIGLE to enforce long-term DC. Case studies of a toy polymer, with 20 CG sites, and the alanine dipeptide, with two dihedral angles, elucidate why one should adopt AIGLE or its Markovian limit for modeling CG conformational dynamics in practice.
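For orientation, a generic multidimensional generalized Langevin equation for CG coordinates $q(t)$, together with the fluctuation-dissipation constraint mentioned in the abstract, can be written as below. This is the textbook form and not necessarily the exact parametrization used in AIGLE; the symbols $M$ (mass matrix), $U$ (CG free energy surface), $K$ (memory kernel), and $R$ (noise) are notational assumptions introduced here.

```latex
\begin{align}
M\,\ddot{q}(t) &= -\nabla U\bigl(q(t)\bigr) - \int_0^t K(t-s)\,\dot{q}(s)\,\mathrm{d}s + R(t), \\
\bigl\langle R(t)\,R(s)^{\mathsf T} \bigr\rangle &= k_B T\,K(t-s), \qquad t \ge s .
\end{align}
```

In the Markovian limit the kernel collapses to $K(t) = 2\gamma\,\delta(t)$ which, with the usual convention that a delta function at the integration endpoint contributes one half, recovers ordinary Langevin dynamics with friction matrix $\gamma$.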
Affiliation(s)
- Pinchen Xie
  - Program in Applied and Computational Mathematics, Princeton University, Princeton, New Jersey 08544, United States
  - Applied Mathematics and Computational Research Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, United States
- Weinan E
  - AI for Science Institute, Beijing 100080, China
  - Center for Machine Learning Research and School of Mathematical Sciences, Peking University, Beijing 100084, China
2
Patsch D, Schwander T, Voss M, Schaub D, Hüppi S, Eichenberger M, Stockinger P, Schelbert L, Giger S, Peccati F, Jiménez-Osés G, Mutný M, Krause A, Bornscheuer UT, Hilvert D, Buller RM. Enriching productive mutational paths accelerates enzyme evolution. Nat Chem Biol 2024:10.1038/s41589-024-01712-3. [PMID: 39261644 DOI: 10.1038/s41589-024-01712-3] [Received: 06/20/2024] [Accepted: 07/26/2024] [Indexed: 09/13/2024]
Abstract
Darwinian evolution has given rise to all the enzymes that enable life on Earth. Mimicking natural selection, scientists have learned to tailor these biocatalysts through recursive cycles of mutation, selection and amplification, often relying on screening large protein libraries to productively modulate the complex interplay between protein structure, dynamics and function. Here we show that by removing destabilizing mutations at the library design stage and taking advantage of recent advances in gene synthesis, we can accelerate the evolution of a computationally designed enzyme. In only five rounds of evolution, we generated a Kemp eliminase (an enzymatic model system for proton transfer from carbon) that accelerates the proton abstraction step >10^8-fold over the uncatalyzed reaction. Recombining the resulting variant with a previously evolved Kemp eliminase HG3.17, which exhibits similar activity but differs by 29 substitutions, allowed us to chart the topography of the designer enzyme's fitness landscape, highlighting that a given protein scaffold can accommodate several, equally viable solutions to a specific catalytic problem.
Affiliation(s)
- David Patsch
  - Competence Center for Biocatalysis, Zurich University of Applied Sciences, Waedenswil, Switzerland
  - Department of Biotechnology and Enzyme Catalysis, University of Greifswald, Greifswald, Germany
- Thomas Schwander
  - Competence Center for Biocatalysis, Zurich University of Applied Sciences, Waedenswil, Switzerland
- Moritz Voss
  - Competence Center for Biocatalysis, Zurich University of Applied Sciences, Waedenswil, Switzerland
- Daniela Schaub
  - Competence Center for Biocatalysis, Zurich University of Applied Sciences, Waedenswil, Switzerland
  - Center for Functional Protein Assemblies & Department of Bioscience, TUM School of Natural Sciences, Technical University of Munich (TUM), Garching, Germany
- Sean Hüppi
  - Competence Center for Biocatalysis, Zurich University of Applied Sciences, Waedenswil, Switzerland
  - Department of Biotechnology, Delft University of Technology, Delft, The Netherlands
- Michael Eichenberger
  - Competence Center for Biocatalysis, Zurich University of Applied Sciences, Waedenswil, Switzerland
- Peter Stockinger
  - Competence Center for Biocatalysis, Zurich University of Applied Sciences, Waedenswil, Switzerland
- Lisa Schelbert
  - Competence Center for Biocatalysis, Zurich University of Applied Sciences, Waedenswil, Switzerland
- Sandro Giger
  - Competence Center for Biocatalysis, Zurich University of Applied Sciences, Waedenswil, Switzerland
- Francesca Peccati
  - Center for Cooperative Research in Biosciences (CIC bioGUNE), Basque Research and Technology Alliance (BRTA), Derio, Spain
  - Ikerbasque, Basque Foundation for Science, Bilbao, Spain
- Gonzalo Jiménez-Osés
  - Center for Cooperative Research in Biosciences (CIC bioGUNE), Basque Research and Technology Alliance (BRTA), Derio, Spain
  - Ikerbasque, Basque Foundation for Science, Bilbao, Spain
- Mojmír Mutný
  - Department of Computer Science, ETH Zurich, Zurich, Switzerland
- Andreas Krause
  - Department of Computer Science, ETH Zurich, Zurich, Switzerland
- Uwe T Bornscheuer
  - Department of Biotechnology and Enzyme Catalysis, University of Greifswald, Greifswald, Germany
- Donald Hilvert
  - Laboratory of Organic Chemistry, ETH Zurich, Zurich, Switzerland
- Rebecca M Buller
  - Competence Center for Biocatalysis, Zurich University of Applied Sciences, Waedenswil, Switzerland
3
Clark F, Robb GR, Cole DJ, Michel J. Automated Adaptive Absolute Binding Free Energy Calculations. J Chem Theory Comput 2024. [PMID: 39254715 DOI: 10.1021/acs.jctc.4c00806] [Indexed: 09/11/2024]
Abstract
Alchemical absolute binding free energy (ABFE) calculations have substantial potential in drug discovery, but are often prohibitively computationally expensive. To unlock their potential, efficient automated ABFE workflows are required to reduce both computational cost and human intervention. We present a fully automated ABFE workflow based on the automated selection of λ windows, the ensemble-based detection of equilibration, and the adaptive allocation of sampling time based on inter-replicate statistics. We find that the automated selection of intermediate states with consistent overlap is rapid, robust, and simple to implement. Robust detection of equilibration is achieved with a paired t-test between the free energy estimates at initial and final portions of an ensemble of runs. We determine reasonable default parameters for all algorithms and show that the full workflow produces equivalent results to a nonadaptive scheme over a variety of test systems, while often accelerating equilibration. Our complete workflow is implemented in the open-source package A3FE (https://github.com/michellab/a3fe).
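The equilibration check described in the abstract can be sketched as a paired t-test on per-replicate free energy estimates taken from an early and a late portion of each run. The following is a minimal illustration using only the Python standard library, not the A3FE implementation; the function name, the example numbers, and the fixed critical value (2.776, the two-sided 5% value of Student's t for 4 degrees of freedom) are assumptions made for the example.

```python
import math
import statistics


def paired_t_statistic(initial, final):
    """Return (t, degrees_of_freedom) for a paired t-test on two equal-length samples."""
    if len(initial) != len(final) or len(initial) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    # Per-replicate change in the estimate between the two portions of the run.
    diffs = [f - i for i, f in zip(initial, final)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    if std_err == 0.0:
        # All differences identical: no spread to test against.
        t = 0.0 if mean_diff == 0.0 else math.copysign(math.inf, mean_diff)
    else:
        t = mean_diff / std_err
    return t, len(diffs) - 1


# Hypothetical free energy estimates (kcal/mol) from 5 replicate runs.
early = [-10.2, -10.5, -9.9, -10.1, -10.4]
late = [-10.1, -10.6, -10.0, -10.0, -10.5]

# Two-sided 5% critical value of Student's t for 4 degrees of freedom.
T_CRIT_DF4 = 2.776

t_stat, dof = paired_t_statistic(early, late)
looks_equilibrated = abs(t_stat) < T_CRIT_DF4
print(f"t = {t_stat:.3f}, dof = {dof}, no significant drift: {looks_equilibrated}")
```

A |t| below the critical value means the early and late estimates are statistically indistinguishable across replicates, which is the sense in which the ensemble is treated as equilibrated in this sketch.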
Affiliation(s)
- Finlay Clark
  - EaStCHEM School of Chemistry, University of Edinburgh, David Brewster Road, Edinburgh EH9 3FJ, United Kingdom
- Graeme R Robb
  - Oncology R&D, AstraZeneca, Cambridge CB4 0WG, United Kingdom
- Daniel J Cole
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom
- Julien Michel
  - EaStCHEM School of Chemistry, University of Edinburgh, David Brewster Road, Edinburgh EH9 3FJ, United Kingdom
4
Takaba K, Friedman AJ, Cavender CE, Behara PK, Pulido I, Henry MM, MacDermott-Opeskin H, Iacovella CR, Nagle AM, Payne AM, Shirts MR, Mobley DL, Chodera JD, Wang Y. Machine-learned molecular mechanics force fields from large-scale quantum chemical data. Chem Sci 2024; 15:12861-12878. [PMID: 39148808 PMCID: PMC11322960 DOI: 10.1039/d4sc00690a] [Received: 01/29/2024] [Accepted: 06/17/2024] [Indexed: 08/17/2024] [Open access]
Abstract
The development of reliable and extensible molecular mechanics (MM) force fields (fast, empirical models characterizing the potential energy surface of molecular systems) is indispensable for biomolecular simulation and computer-aided drug design. Here, we introduce a generalized and extensible machine-learned MM force field, espaloma-0.3, and an end-to-end differentiable framework using graph neural networks to overcome the limitations of traditional rule-based methods. Trained in a single GPU-day to fit a large and diverse quantum chemical dataset of over 1.1 M energy and force calculations, espaloma-0.3 reproduces quantum chemical energetic properties of chemical domains highly relevant to drug discovery, including small molecules, peptides, and nucleic acids. Moreover, this force field maintains the quantum chemical energy-minimized geometries of small molecules and preserves the condensed phase properties of peptides and folded proteins, self-consistently parametrizing proteins and ligands to produce stable simulations leading to highly accurate predictions of binding free energies. This methodology demonstrates significant promise as a path forward for systematically building more accurate force fields that are easily extensible to new chemical domains of interest.
Affiliation(s)
- Kenichiro Takaba
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
  - Pharmaceuticals Research Center, Advanced Drug Discovery, Asahi Kasei Pharma Corporation Shizuoka 410-2321 Japan
- Anika J Friedman
  - Department of Chemical and Biological Engineering, University of Colorado Boulder Boulder CO 80309 USA
- Chapin E Cavender
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego 9500 Gilman Drive La Jolla CA 92093 USA
- Pavan Kumar Behara
  - Center for Neurotherapeutics, Department of Pathology and Laboratory Medicine, University of California Irvine CA 92697 USA
- Iván Pulido
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
- Michael M Henry
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
- Christopher R Iacovella
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
- Arnav M Nagle
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
  - Department of Bioengineering, University of California, Berkeley Berkeley CA 94720 USA
- Alexander Matthew Payne
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
  - Tri-Institutional PhD Program in Chemical Biology, Memorial Sloan Kettering Cancer Center New York 10065 USA
- Michael R Shirts
  - Department of Chemical and Biological Engineering, University of Colorado Boulder Boulder CO 80309 USA
- David L Mobley
  - Department of Pharmaceutical Sciences, University of California Irvine California 92697 USA
- John D Chodera
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
- Yuanqing Wang
  - Simons Center for Computational Physical Chemistry and Center for Data Science, New York University New York NY 10004 USA
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center New York NY 10065 USA
5
Li J, Zhou Y, Chen SJ. Embracing exascale computing in nucleic acid simulations. Curr Opin Struct Biol 2024; 87:102847. [PMID: 38815519 PMCID: PMC11283969 DOI: 10.1016/j.sbi.2024.102847] [Received: 12/17/2023] [Revised: 04/18/2024] [Accepted: 05/09/2024] [Indexed: 06/01/2024]
Abstract
This mini-review reports the recent advances in biomolecular simulations, particularly for nucleic acids, and provides the potential effects of the emerging exascale computing on nucleic acid simulations, emphasizing the need for advanced computational strategies to fully exploit this technological frontier. Specifically, we introduce recent breakthroughs in computer architectures for large-scale biomolecular simulations and review the simulation protocols for nucleic acids regarding force fields, enhanced sampling methods, coarse-grained models, and interactions with ligands. We also explore the integration of machine learning methods into simulations, which promises to significantly enhance the predictive modeling of biomolecules and the analysis of complex data generated by the exascale simulations. Finally, we discuss the challenges and perspectives for biomolecular simulations as we enter the dawning exascale computing era.
Affiliation(s)
- Jun Li
  - Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, 223 Physics Bldg., Columbia, 65211, MO, USA
- Yuanzhe Zhou
  - Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, 223 Physics Bldg., Columbia, 65211, MO, USA
- Shi-Jie Chen
  - Department of Physics, Department of Biochemistry and Institute for Data Science and Informatics, University of Missouri, 223 Physics Bldg., Columbia, 65211, MO, USA
6
Wang L, Behara PK, Thompson MW, Gokey T, Wang Y, Wagner JR, Cole DJ, Gilson MK, Shirts MR, Mobley DL. The Open Force Field Initiative: Open Software and Open Science for Molecular Modeling. J Phys Chem B 2024; 128:7043-7067. [PMID: 38989715 DOI: 10.1021/acs.jpcb.4c01558] [Indexed: 07/12/2024]
Abstract
Force fields are a key component of physics-based molecular modeling, describing the energies and forces in a molecular system as a function of the positions of the atoms and molecules involved. Here, we provide a review and scientific status report on the work of the Open Force Field (OpenFF) Initiative, which focuses on the science, infrastructure and data required to build the next generation of biomolecular force fields. We introduce the OpenFF Initiative and the related OpenFF Consortium, describe its approach to force field development and software, and discuss accomplishments to date as well as future plans. OpenFF releases both software and data under open and permissive licensing agreements to enable rapid application, validation, extension, and modification of its force fields and software tools. We discuss lessons learned to date in this new approach to force field development. We also highlight ways that other force field researchers can get involved, as well as some recent successes of outside researchers taking advantage of OpenFF tools and data.
Affiliation(s)
- Lily Wang
  - Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
- Pavan Kumar Behara
  - Center for Neurotherapeutics, University of California, Irvine, California 92697, United States
- Matthew W Thompson
  - Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
- Trevor Gokey
  - Department of Chemistry, University of California, Irvine, California 92697, United States
- Yuanqing Wang
  - Simons Center for Computational Physical Chemistry and Center for Data Science, New York, New York 10004, United States
- Jeffrey R Wagner
  - Open Force Field, Open Molecular Software Foundation, Davis, California 95616, United States
- Daniel J Cole
  - School of Natural and Environmental Sciences, Newcastle University, Newcastle upon Tyne NE1 7RU, United Kingdom
- Michael K Gilson
  - Skaggs School of Pharmacy and Pharmaceutical Sciences, The University of California at San Diego, La Jolla, California 92093, United States
- Michael R Shirts
  - Department of Chemical and Biological Engineering, University of Colorado Boulder, Boulder, Colorado 80305, United States
- David L Mobley
  - Department of Chemistry, University of California, Irvine, California 92697, United States
  - Department of Pharmaceutical Sciences, University of California, Irvine, California 92697, United States
7
Burns D, Venditti V, Potoyan DA. Illuminating Protein Allostery by Chemically Accurate Contact Response Analysis (ChACRA). J Chem Theory Comput 2024. [PMID: 39038177 DOI: 10.1021/acs.jctc.4c00414] [Indexed: 07/24/2024]
Abstract
Decoding allostery at the atomic level is essential for understanding the relationship between a protein's sequence, structure, and dynamics. Recently, we have shown that decomposing temperature responses of inter-residue contacts can reveal allosteric couplings and provide useful insight into the functional dynamics of proteins. The details of this Chemically Accurate Contact Response Analysis (ChACRA) are presented here along with its application to two well-known allosteric proteins. The first protein, IGPS, is a model of ensemble allostery that lacks clear structural differences between the active and inactive states. We show that the application of ChACRA reveals the experimentally identified allosteric coupling between effector and active sites of IGPS. The second protein, ATCase, is a classic example of allostery with distinct active and inactive structural states. Using ChACRA, we directly identify the most significant residue level interactions underlying the enzyme's cooperative behavior. Both test cases demonstrate the utility of ChACRA's unsupervised machine learning approach for dissecting allostery at the residue level.
Affiliation(s)
- Daniel Burns
  - Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa 50011, United States
- Vincenzo Venditti
  - Department of Chemistry, Iowa State University, Ames, Iowa 50011, United States
  - Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa 50011, United States
- Davit A Potoyan
  - Department of Chemistry, Iowa State University, Ames, Iowa 50011, United States
  - Roy J. Carver Department of Biochemistry, Biophysics and Molecular Biology, Iowa State University, Ames, Iowa 50011, United States
8
Cerutti DS, Wiewiora R, Boothroyd S, Sherman W. STORMM: Structure and topology replica molecular mechanics for chemical simulations. J Chem Phys 2024; 161:032501. [PMID: 39007368 DOI: 10.1063/5.0211032] [Received: 03/28/2024] [Accepted: 06/26/2024] [Indexed: 07/16/2024] [Open access]
Abstract
The Structure and TOpology Replica Molecular Mechanics (STORMM) code is a next-generation molecular simulation engine and associated libraries optimized for performance on fast, vectorized central processor units and graphics processing units (GPUs) with independent memory and tens of thousands of threads. STORMM is built to run thousands of independent molecular mechanical calculations on a single GPU with novel implementations that tune numerical precision, mathematical operations, and scarce on-chip memory resources to optimize throughput. The libraries are built around accessible classes with detailed documentation, supporting fine-grained parallelism and algorithm development as well as copying or swapping groups of systems on and off of the GPU. A primary intention of the STORMM libraries is to provide developers of atomic simulation methods with access to a high-performance molecular mechanics engine with extensive facilities to prototype and develop bespoke tools aimed toward drug discovery applications. In its present state, STORMM delivers molecular dynamics simulations of small molecules and small proteins in implicit solvent with tens to hundreds of times the throughput of conventional codes. The engineering paradigm transforms two of the most memory bandwidth-intensive aspects of condensed-phase dynamics, particle-mesh mapping, and valence interactions, into compute-bound problems for several times the scalability of existing programs. Numerical methods for compressing and streamlining the information present in stored coordinates and lookup tables are also presented, delivering improved accuracy over methods implemented in other molecular dynamics engines. The open-source code is released under the MIT license.
Affiliation(s)
- Woody Sherman
  - Psivant Therapeutics, Boston, Massachusetts 02210, USA
9
Karwounopoulos J, Wu Z, Tkaczyk S, Wang S, Baskerville A, Ranasinghe K, Langer T, Wood GPF, Wieder M, Boresch S. Insights and Challenges in Correcting Force Field Based Solvation Free Energies Using a Neural Network Potential. J Phys Chem B 2024; 128:6693-6703. [PMID: 38976601 PMCID: PMC11264272 DOI: 10.1021/acs.jpcb.4c01417] [Received: 03/04/2024] [Revised: 05/31/2024] [Accepted: 06/14/2024] [Indexed: 07/10/2024]
Abstract
We present a comprehensive study investigating the potential gain in accuracy for calculating absolute solvation free energies (ASFE) using a neural network potential to describe the intramolecular energy of the solute. We calculated the ASFE for most compounds from the FreeSolv database using the Open Force Field (OpenFF) and compared them to earlier results obtained with the CHARMM General Force Field (CGenFF). By applying a nonequilibrium (NEQ) switching approach between the molecular mechanics (MM) description (either OpenFF or CGenFF) and the neural net potential (NNP)/MM level of theory (using ANI-2x as the NNP potential), we attempted to improve the accuracy of the calculated ASFEs. The predictive performance of the results did not change when this approach was applied to all 589 small molecules in the FreeSolv database that ANI-2x can describe. When selecting a subset of 156 molecules, focusing on compounds where the force fields performed poorly, we saw a slight improvement in the root-mean-square error (RMSE) and mean absolute error (MAE). The majority of our calculations utilized unidirectional NEQ protocols based on Jarzynski's equation. Additionally, we conducted bidirectional NEQ switching for a subset of 156 solutes. Notably, only a small fraction (10 out of 156) exhibited statistically significant discrepancies between unidirectional and bidirectional NEQ switching free energy estimates.
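Jarzynski's equation, on which the unidirectional NEQ protocols are based, relates a free energy difference to an exponential average over nonequilibrium work values, $\Delta F = -k_B T \ln \langle e^{-W/k_B T} \rangle$. A minimal sketch of that estimator follows; the function name, the work values, and the kcal/mol units are illustrative assumptions and do not reproduce the authors' code or data.

```python
import math


def jarzynski_free_energy(works, kT):
    """Estimate Delta F = -kT * ln(mean(exp(-W / kT))) from a list of work values.

    The minimum work is factored out before exponentiating so that the
    exponential average stays numerically stable for large |W| / kT.
    """
    if not works:
        raise ValueError("need at least one work value")
    w_min = min(works)
    shifted_sum = sum(math.exp(-(w - w_min) / kT) for w in works)
    return w_min - kT * math.log(shifted_sum / len(works))


# Approximate thermal energy at about 298 K in kcal/mol (assumed units).
KT_KCAL_PER_MOL = 0.593

# Hypothetical switching work values (kcal/mol) from independent NEQ runs.
example_works = [1.0, 2.0, 3.0]

delta_f = jarzynski_free_energy(example_works, KT_KCAL_PER_MOL)
mean_work = sum(example_works) / len(example_works)
print(f"Jarzynski estimate: {delta_f:.3f} kcal/mol (mean work {mean_work:.3f})")
```

Because the exponential average is dominated by the lowest work values, the estimate never exceeds the mean work, and it converges slowly when the work distribution is broad. That is one reason to cross-check unidirectional results against bidirectional switching, as the abstract reports.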
Affiliation(s)
- Johannes Karwounopoulos
  - Faculty of Chemistry, Institute of Computational Biological Chemistry, University Vienna, Währingerstr. 17, 1090 Vienna, Austria
  - Vienna Doctoral School of Chemistry (DoSChem), University of Vienna, Währingerstr. 42, 1090 Vienna, Austria
- Zhiyi Wu
  - Exscientia plc, Schroedinger Building, Oxford OX4 4GE, United Kingdom
- Sara Tkaczyk
  - Department of Pharmaceutical Sciences, Pharmaceutical Chemistry Division, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
  - Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Shuzhe Wang
  - Exscientia plc, Schroedinger Building, Oxford OX4 4GE, United Kingdom
- Adam Baskerville
  - Exscientia plc, Schroedinger Building, Oxford OX4 4GE, United Kingdom
- Thierry Langer
  - Department of Pharmaceutical Sciences, Pharmaceutical Chemistry Division, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Marcus Wieder
  - Exscientia plc, Schroedinger Building, Oxford OX4 4GE, United Kingdom
  - Open Molecular Software Foundation, Davis, California 95616, United States
- Stefan Boresch
  - Faculty of Chemistry, Institute of Computational Biological Chemistry, University Vienna, Währingerstr. 17, 1090 Vienna, Austria
10
Antalík A, Levy A, Kvedaravičiūtė S, Johnson SK, Carrasco-Busturia D, Raghavan B, Mouvet F, Acocella A, Das S, Gavini V, Mandelli D, Ippoliti E, Meloni S, Carloni P, Rothlisberger U, Olsen JMH. MiMiC: A high-performance framework for multiscale molecular dynamics simulations. J Chem Phys 2024; 161:022501. [PMID: 38990116 DOI: 10.1063/5.0211053] [Received: 03/28/2024] [Accepted: 06/15/2024] [Indexed: 07/12/2024] [Open access]
Abstract
MiMiC is a framework for performing multiscale simulations in which loosely coupled external programs describe individual subsystems at different resolutions and levels of theory. To make it highly efficient and flexible, we adopt an interoperable approach based on a multiple-program multiple-data (MPMD) paradigm, serving as an intermediary responsible for fast data exchange and interactions between the subsystems. The main goal of MiMiC is to avoid interfering with the underlying parallelization of the external programs, including the operability on hybrid architectures (e.g., CPU/GPU), and keep their setup and execution as close as possible to the original. At the moment, MiMiC offers an efficient implementation of electrostatic embedding quantum mechanics/molecular mechanics (QM/MM) that has demonstrated unprecedented parallel scaling in simulations of large biomolecules using CPMD and GROMACS as QM and MM engines, respectively. However, as it is designed for high flexibility with general multiscale models in mind, it can be straightforwardly extended beyond QM/MM. In this article, we illustrate the software design and the features of the framework, which make it a compelling choice for multiscale simulations in the upcoming era of exascale high-performance computing.
Affiliation(s)
- Andrej Antalík
  - Laboratory of Computational Chemistry and Biochemistry, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
- Andrea Levy
  - Laboratory of Computational Chemistry and Biochemistry, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
- Sonata Kvedaravičiūtė
  - DTU Chemistry, Technical University of Denmark (DTU), DK-2800 Kongens Lyngby, Denmark
- Sophia K Johnson
  - Laboratory of Computational Chemistry and Biochemistry, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
- Bharath Raghavan
  - Computational Biomedicine, Institute of Advanced Simulations IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich 52428, Germany
  - Department of Physics, RWTH Aachen University, Aachen 52074, Germany
- François Mouvet
  - Laboratory of Computational Chemistry and Biochemistry, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
- Sambit Das
  - Department of Mechanical Engineering, University of Michigan, Ann Arbor, Michigan 48109, USA
- Vikram Gavini
  - Department of Mechanical Engineering, University of Michigan, Ann Arbor, Michigan 48109, USA
  - Department of Materials Science and Engineering, University of Michigan, Ann Arbor, Michigan 48109, USA
- Davide Mandelli
  - Computational Biomedicine, Institute of Advanced Simulations IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich 52428, Germany
- Emiliano Ippoliti
  - Computational Biomedicine, Institute of Advanced Simulations IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich 52428, Germany
- Simone Meloni
  - Dipartimento di Scienze Chimiche, Farmaceutiche ed Agrarie (DOCPAS), Università degli Studi di Ferrara (Unife), I-44121 Ferrara, Italy
- Paolo Carloni
  - Computational Biomedicine, Institute of Advanced Simulations IAS-5/Institute for Neuroscience and Medicine INM-9, Forschungszentrum Jülich GmbH, Jülich 52428, Germany
  - Department of Physics, RWTH Aachen University, Aachen 52074, Germany
- Ursula Rothlisberger
  - Laboratory of Computational Chemistry and Biochemistry, École Polytechnique Fédérale de Lausanne, CH-1015 Lausanne, Switzerland
11
Gogal RA, Nessler AJ, Thiel AC, Bernabe HV, Corrigan Grove RA, Cousineau LM, Litman JM, Miller JM, Qi G, Speranza MJ, Tollefson MR, Fenn TD, Michaelson JJ, Okada O, Piquemal JP, Ponder JW, Shen J, Smith RJH, Yang W, Ren P, Schnieders MJ. Force Field X: A computational microscope to study genetic variation and organic crystals using theory and experiment. J Chem Phys 2024; 161:012501. [PMID: 38958156 PMCID: PMC11223778 DOI: 10.1063/5.0214652] [Received: 04/18/2024] [Accepted: 06/17/2024] [Indexed: 07/04/2024] [Open access]
Abstract
Force Field X (FFX) is an open-source software package for atomic resolution modeling of genetic variants and organic crystals that leverages advanced potential energy functions and experimental data. FFX currently consists of nine modular packages with novel algorithms that include global optimization via a many-body expansion, acid-base chemistry using polarizable constant-pH molecular dynamics, estimation of free energy differences, generalized Kirkwood implicit solvent models, and many more. Applications of FFX focus on the use and development of a crystal structure prediction pipeline, biomolecular structure refinement against experimental datasets, and estimation of the thermodynamic effects of genetic variants on both proteins and nucleic acids. The use of Parallel Java and OpenMM combines to offer shared memory, message passing, and graphics processing unit parallelization for high performance simulations. Overall, the FFX platform serves as a computational microscope to study systems ranging from organic crystals to solvated biomolecular systems.
Affiliation(s)
- Rose A. Gogal
  - Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa 52242, USA
- Aaron J. Nessler
  - Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa 52242, USA
- Andrew C. Thiel
  - Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa 52242, USA
- Hernan V. Bernabe
  - Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa 52242, USA
- Rae A. Corrigan Grove
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico 87545, USA
- Leah M. Cousineau
  - Department of Biochemistry and Molecular Biology, University of Iowa, Iowa City, Iowa 52242, USA
- Jacob M. Litman
  - Department of Biochemistry and Molecular Biology, University of Iowa, Iowa City, Iowa 52242, USA
- Jacob M. Miller
  - Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa 52242, USA
- Guowei Qi
  - Department of Biochemistry and Molecular Biology, University of Iowa, Iowa City, Iowa 52242, USA
- Matthew J. Speranza
  - Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa 52242, USA
- Mallory R. Tollefson
  - Roy J. Carver Department of Biomedical Engineering, University of Iowa, Iowa City, Iowa 52242, USA
- Timothy D. Fenn
  - Analytical Development, LEXEO Therapeutics, New York, New York 10010, USA
- Jacob J. Michaelson
  - Department of Psychiatry, University of Iowa Hospitals and Clinics, Iowa City, Iowa 52242, USA
- Okimasa Okada
  - Sohyaku Innovative Research Division, Mitsubishi Tanabe Pharma Corporation, 1000 Kamoshida-cho, Aoba-ku, Yokohama, Kanagawa 227-0033, Japan
- Jay W. Ponder
  - Department of Chemistry, Washington University in St. Louis, St. Louis, Missouri 63130, USA
- Jana Shen
  - Department of Pharmaceutical Sciences, University of Maryland School of Pharmacy, Baltimore, Maryland 21201, USA
- Richard J. H. Smith
  - Molecular Otolaryngology and Renal Research Laboratories, Department of Otolaryngology, University of Iowa Hospitals and Clinics, Iowa City, Iowa 52242, USA
- Pengyu Ren
  - Department of Biomedical Engineering, University of Texas, Austin, Texas 78712, USA
12
|
Kairys V, Baranauskiene L, Kazlauskiene M, Zubrienė A, Petrauskas V, Matulis D, Kazlauskas E. Recent advances in computational and experimental protein-ligand affinity determination techniques. Expert Opin Drug Discov 2024; 19:649-670. [PMID: 38715415 DOI: 10.1080/17460441.2024.2349169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2024] [Accepted: 04/25/2024] [Indexed: 05/22/2024]
Abstract
INTRODUCTION: Modern drug discovery revolves around designing ligands that target a chosen biomolecule, typically a protein. For this, the evaluation of affinities of putative ligands is crucial. This has given rise to a multitude of dedicated computational and experimental methods that are constantly being developed and improved.
AREAS COVERED: In this review, the authors reassess both the industry mainstays and the newest trends among the methods for protein–small-molecule affinity determination. They discuss both computational affinity predictions and experimental techniques, describing their basic principles, main limitations, and advantages. Together, this serves as an initial guide to the currently most popular and cutting-edge ligand-binding assays employed in rational drug design.
EXPERT OPINION: The affinity determination methods continue to develop toward miniaturization, high throughput, and in-cell application. Moreover, the availability of data analysis tools has been constantly increasing. Nevertheless, cross-verification of data using at least two different techniques and careful result interpretation remain of utmost importance.
Affiliation(s)
- Visvaldas Kairys
  - Department of Bioinformatics, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Lina Baranauskiene
  - Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Asta Zubrienė
  - Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Vytautas Petrauskas
  - Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Daumantas Matulis
  - Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
- Egidijus Kazlauskas
  - Department of Biothermodynamics and Drug Design, Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania

13
Woods CJ, Hedges LO, Mulholland AJ, Malaisree M, Tosco P, Loeffler HH, Suruzhon M, Burman M, Bariami S, Bosisio S, Calabro G, Clark F, Mey ASJS, Michel J. Sire: An interoperability engine for prototyping algorithms and exchanging information between molecular simulation programs. J Chem Phys 2024; 160:202503. [PMID: 38814008 DOI: 10.1063/5.0200458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2024] [Accepted: 04/29/2024] [Indexed: 05/31/2024] [Open Access]
Abstract
Sire is a Python/C++ library that is used both to prototype new algorithms and as an interoperability engine for exchanging information between molecular simulation programs. It provides a collection of file parsers and information converters that together make it easier to combine and leverage the functionality of many other programs and libraries. This empowers researchers to use sire to write a single script that can, for example, load a molecule from a PDBx/mmCIF file via Gemmi, perform SMARTS searches via RDKit, parameterize molecules using BioSimSpace, run GPU-accelerated molecular dynamics via OpenMM, and then display the resulting dynamics trajectory in an NGLView Jupyter notebook 3D molecular viewer. This functionality is built on by BioSimSpace, which uses sire's molecular information engine to interconvert with programs such as GROMACS, NAMD, Amber, and AmberTools for automated molecular parameterization and the running of molecular dynamics, metadynamics, and alchemical free energy workflows. Sire comes complete with a powerful molecular information search engine, plus trajectory loading and editing, analysis, and energy evaluation engines. This, when combined with an in-built computer algebra system, gives substantial flexibility to researchers to load, search for, edit, and combine molecular information from multiple sources and use that to drive novel algorithms by combining functionality from other programs. Sire is open source (GPL3) and is available via conda and at a free Jupyter notebook server at https://try.openbiosim.org. Sire is supported by the not-for-profit OpenBioSim community interest company.
Affiliation(s)
- Christopher J Woods
  - Advanced Computing Research Centre, University of Bristol, Bristol, United Kingdom
  - Centre for Computational Chemistry, School of Chemistry, University of Bristol, Bristol, United Kingdom
  - OpenBioSim Community Interest Company, Edinburgh, United Kingdom
- Lester O Hedges
  - Advanced Computing Research Centre, University of Bristol, Bristol, United Kingdom
  - OpenBioSim Community Interest Company, Edinburgh, United Kingdom
- Adrian J Mulholland
  - Centre for Computational Chemistry, School of Chemistry, University of Bristol, Bristol, United Kingdom
- Maturos Malaisree
  - Centre for Computational Chemistry, School of Chemistry, University of Bristol, Bristol, United Kingdom
- Matthew Burman
  - OpenBioSim Community Interest Company, Edinburgh, United Kingdom
- Sofia Bariami
  - EaStCHEM School of Chemistry, University of Edinburgh, Edinburgh, United Kingdom
- Stefano Bosisio
  - EaStCHEM School of Chemistry, University of Edinburgh, Edinburgh, United Kingdom
- Gaetano Calabro
  - EaStCHEM School of Chemistry, University of Edinburgh, Edinburgh, United Kingdom
- Finlay Clark
  - EaStCHEM School of Chemistry, University of Edinburgh, Edinburgh, United Kingdom
- Antonia S J S Mey
  - EaStCHEM School of Chemistry, University of Edinburgh, Edinburgh, United Kingdom
- Julien Michel
  - OpenBioSim Community Interest Company, Edinburgh, United Kingdom
  - EaStCHEM School of Chemistry, University of Edinburgh, Edinburgh, United Kingdom

14
Pelaez RP, Simeon G, Galvelis R, Mirarchi A, Eastman P, Doerr S, Thölke P, Markland TE, De Fabritiis G. TorchMD-Net 2.0: Fast Neural Network Potentials for Molecular Simulations. J Chem Theory Comput 2024; 20:4076-4087. [PMID: 38743033 DOI: 10.1021/acs.jctc.4c00253] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/16/2024]
Abstract
Achieving a balance between computational speed, prediction accuracy, and universal applicability in molecular simulations has been a persistent challenge. This paper presents substantial advancements in TorchMD-Net software, a pivotal step forward in the shift from conventional force fields to neural network-based potentials. The evolution of TorchMD-Net into a more comprehensive and versatile framework is highlighted, incorporating cutting-edge architectures such as TensorNet. This transformation is achieved through a modular design approach, encouraging customized applications within the scientific community. The most notable enhancement is a significant improvement in computational efficiency, achieving a very remarkable acceleration in the computation of energy and forces for TensorNet models, with performance gains ranging from 2× to 10× over previous, nonoptimized, iterations. Other enhancements include highly optimized neighbor search algorithms that support periodic boundary conditions and smooth integration with existing molecular dynamics frameworks. Additionally, the updated version introduces the capability to integrate physical priors, further enriching its application spectrum and utility in research. The software is available at https://github.com/torchmd/torchmd-net.
Affiliation(s)
- Raul P Pelaez
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
- Guillem Simeon
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
- Raimondas Galvelis
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
  - Acellera Labs, C Dr Trueta 183, 08005 Barcelona, Spain
- Antonio Mirarchi
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
- Peter Eastman
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Stefan Doerr
  - Acellera Labs, C Dr Trueta 183, 08005 Barcelona, Spain
- Thomas E Markland
  - Department of Chemistry, Stanford University, Stanford, California 94305, United States
- Gianni De Fabritiis
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
  - Acellera Labs, C Dr Trueta 183, 08005 Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010 Barcelona, Spain

15
Pelaez RP, Simeon G, Galvelis R, Mirarchi A, Eastman P, Doerr S, Thölke P, Markland TE, De Fabritiis G. TorchMD-Net 2.0: Fast Neural Network Potentials for Molecular Simulations. arXiv 2024; arXiv:2402.17660v3. [PMID: 38463504 PMCID: PMC10925388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Achieving a balance between computational speed, prediction accuracy, and universal applicability in molecular simulations has been a persistent challenge. This paper presents substantial advancements in the TorchMD-Net software, a pivotal step forward in the shift from conventional force fields to neural network-based potentials. The evolution of TorchMD-Net into a more comprehensive and versatile framework is highlighted, incorporating cutting-edge architectures such as TensorNet. This transformation is achieved through a modular design approach, encouraging customized applications within the scientific community. The most notable enhancement is a significant improvement in computational efficiency, achieving a very remarkable acceleration in the computation of energy and forces for TensorNet models, with performance gains ranging from 2x to 10x over previous, non-optimized, iterations. Other enhancements include highly optimized neighbor search algorithms that support periodic boundary conditions and smooth integration with existing molecular dynamics frameworks. Additionally, the updated version introduces the capability to integrate physical priors, further enriching its application spectrum and utility in research. The software is available at https://github.com/torchmd/torchmd-net.
Affiliation(s)
- Raul P Pelaez
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
- Guillem Simeon
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
- Raimondas Galvelis
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
  - Acellera Labs, C Dr Trueta 183, 08005, Barcelona, Spain
- Antonio Mirarchi
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
- Peter Eastman
  - Department of Chemistry, Stanford University, Stanford, CA 94305, USA
- Stefan Doerr
  - Acellera Labs, C Dr Trueta 183, 08005, Barcelona, Spain
- Thomas E Markland
  - Department of Chemistry, Stanford University, Stanford, CA 94305, USA
- Gianni De Fabritiis
  - Computational Science Laboratory, Universitat Pompeu Fabra, Barcelona Biomedical Research Park (PRBB), C Dr. Aiguader 88, 08003 Barcelona, Spain
  - Acellera Labs, C Dr Trueta 183, 08005, Barcelona, Spain
  - Institució Catalana de Recerca i Estudis Avançats (ICREA), Passeig Lluis Companys 23, 08010 Barcelona, Spain

16
Tkaczyk S, Karwounopoulos J, Schöller A, Woodcock HL, Langer T, Boresch S, Wieder M. Reweighting from Molecular Mechanics Force Fields to the ANI-2x Neural Network Potential. J Chem Theory Comput 2024; 20:2719-2728. [PMID: 38527958 DOI: 10.1021/acs.jctc.3c01274] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/27/2024]
Abstract
To achieve chemical accuracy in free energy calculations, it is necessary to accurately describe the system's potential energy surface and efficiently sample configurations from its Boltzmann distribution. While neural network potentials (NNPs) have shown significantly higher accuracy than classical molecular mechanics (MM) force fields, they have a limited range of applicability and are considerably slower than MM potentials, often by orders of magnitude. To address this challenge, Rufa et al. [Rufa et al. bioRxiv 2020, 10.1101/2020.07.29.227959.] suggested a two-stage approach that uses a fast and established MM alchemical energy protocol, followed by reweighting the results using NNPs, known as endstate correction or indirect free energy calculation. This study systematically investigates the accuracy and robustness of reweighting from an MM reference to a neural network target potential (ANI-2x) for an established data set in vacuum, using single-step free-energy perturbation (FEP) and nonequilibrium (NEQ) switching simulation. We assess the influence of longer switching lengths and the impact of slow degrees of freedom on outliers in the work distribution and compare the results to those of multistate equilibrium free energy simulations. Our results demonstrate that free energy calculations between NNPs and MM potentials should be preferably performed using NEQ switching simulations to obtain accurate free energy estimates. NEQ switching simulations between the MM potentials and NNPs are efficient, robust, and trivial to implement.
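The single-step FEP and NEQ switching estimates named in this abstract both rest on an exponential average: over energy differences in the Zwanzig relation, and over switching work in the Jarzynski equality. A minimal Python sketch of that average is given here for orientation only. The function name `exp_average_free_energy` and the sample values are hypothetical; nothing below is taken from the cited paper or its code.

```python
import math


def exp_average_free_energy(work_values, kT=1.0):
    """Return dF = -kT * ln(mean(exp(-W / kT))) for a list of work values.

    For single-step FEP, pass the energy differences U_target - U_reference
    evaluated on reference-ensemble samples; for NEQ switching, pass the
    accumulated work of each switching trajectory. Units follow kT.
    """
    if not work_values:
        raise ValueError("need at least one work value")
    # Shift by the largest exponent (log-sum-exp) to avoid overflow.
    scaled = [-w / kT for w in work_values]
    shift = max(scaled)
    mean_exp = sum(math.exp(s - shift) for s in scaled) / len(scaled)
    return -kT * (shift + math.log(mean_exp))


# Hypothetical work values in units of kT, for illustration only.
sample_work = [0.0, 1.0]
estimate = exp_average_free_energy(sample_work)
print(f"estimated free energy difference: {estimate:.4f} kT")
```

Because the average is dominated by the lowest work values, a few outliers in the work distribution can shift the estimate noticeably, which is why the abstract singles out outliers and switching length.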
Affiliation(s)
- Sara Tkaczyk
  - Department of Pharmaceutical Sciences, Pharmaceutical Chemistry Division, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
  - Vienna Doctoral School of Pharmaceutical, Nutritional and Sport Sciences (PhaNuSpo), University of Vienna, 1090 Vienna, Austria
- Johannes Karwounopoulos
  - Faculty of Chemistry, Institute of Computational Biological Chemistry, University of Vienna, Währingerstrasse 17, 1090 Vienna, Austria
  - Vienna Doctoral School of Chemistry (DoSChem), University of Vienna, Währingerstrasse 42, 1090 Vienna, Austria
- Andreas Schöller
  - Faculty of Chemistry, Institute of Computational Biological Chemistry, University of Vienna, Währingerstrasse 17, 1090 Vienna, Austria
  - Vienna Doctoral School of Chemistry (DoSChem), University of Vienna, Währingerstrasse 42, 1090 Vienna, Austria
- H Lee Woodcock
  - Department of Chemistry, University of South Florida, 4202 E. Fowler Ave., CHE205, Tampa, Florida 33620-5250, United States
- Thierry Langer
  - Department of Pharmaceutical Sciences, Pharmaceutical Chemistry Division, University of Vienna, Josef-Holaubek-Platz 2, 1090 Vienna, Austria
- Stefan Boresch
  - Faculty of Chemistry, Institute of Computational Biological Chemistry, University of Vienna, Währingerstrasse 17, 1090 Vienna, Austria
- Marcus Wieder
  - Faculty of Chemistry, Institute of Computational Biological Chemistry, University of Vienna, Währingerstrasse 17, 1090 Vienna, Austria

17
Gelman S, Johnson B, Freschlin C, D'Costa S, Gitter A, Romero PA. Biophysics-based protein language models for protein engineering. bioRxiv 2024; 2024.03.15.585128. [PMID: 38559182 PMCID: PMC10980077 DOI: 10.1101/2024.03.15.585128] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/04/2024]
Abstract
Protein language models trained on evolutionary data have emerged as powerful tools for predictive problems involving protein sequence, structure, and function. However, these models overlook decades of research into biophysical factors governing protein function. We propose Mutational Effect Transfer Learning (METL), a protein language model framework that unites advanced machine learning and biophysical modeling. Using the METL framework, we pretrain transformer-based neural networks on biophysical simulation data to capture fundamental relationships between protein sequence, structure, and energetics. We finetune METL on experimental sequence-function data to harness these biophysical signals and apply them when predicting protein properties like thermostability, catalytic activity, and fluorescence. METL excels in challenging protein engineering tasks like generalizing from small training sets and position extrapolation, although existing methods that train on evolutionary signals remain powerful for many types of experimental assays. We demonstrate METL's ability to design functional green fluorescent protein variants when trained on only 64 examples, showcasing the potential of biophysics-based protein language models for protein engineering.
Affiliation(s)
- Sam Gelman
  - Department of Computer Sciences, University of Wisconsin-Madison
  - Morgridge Institute for Research
- Bryce Johnson
  - Department of Computer Sciences, University of Wisconsin-Madison
  - Morgridge Institute for Research
- Sameer D'Costa
  - Department of Biochemistry, University of Wisconsin-Madison
- Anthony Gitter
  - Department of Computer Sciences, University of Wisconsin-Madison
  - Morgridge Institute for Research
  - Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison

18
Ding Y, Huang J. Implementation and Validation of an OpenMM Plugin for the Deep Potential Representation of Potential Energy. Int J Mol Sci 2024; 25:1448. [PMID: 38338727 PMCID: PMC10855459 DOI: 10.3390/ijms25031448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Revised: 01/08/2024] [Accepted: 01/11/2024] [Indexed: 02/12/2024] [Open Access]
Abstract
Machine learning potentials, particularly the deep potential (DP) model, have revolutionized molecular dynamics (MD) simulations, striking a balance between accuracy and computational efficiency. To facilitate the DP model's integration with the popular MD engine OpenMM, we have developed a versatile OpenMM plugin. This plugin supports a range of applications, from conventional MD simulations to alchemical free energy calculations and hybrid DP/MM simulations. Our extensive validation tests encompassed energy conservation in microcanonical ensemble simulations, fidelity in canonical ensemble generation, and the evaluation of the structural, transport, and thermodynamic properties of bulk water. The introduction of this plugin is expected to significantly expand the application scope of DP models within the MD simulation community, representing a major advancement in the field.
Affiliation(s)
- Ye Ding
  - College of Life Sciences, Zhejiang University, Hangzhou 310027, China
  - School of Life Sciences, Westlake University, Hangzhou 310024, China
  - Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China
- Jing Huang
  - School of Life Sciences, Westlake University, Hangzhou 310024, China
  - Westlake AI Therapeutics Lab, Westlake Laboratory of Life Sciences and Biomedicine, Hangzhou 310024, China