1
Sarkar D, Surpeta B, Brezovsky J. Incorporating Prior Knowledge in the Seeds of Adaptive Sampling Molecular Dynamics Simulations of Ligand Transport in Enzymes with Buried Active Sites. J Chem Theory Comput 2024; 20:5807-5819. [PMID: 38978395 PMCID: PMC11270739 DOI: 10.1021/acs.jctc.4c00452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Revised: 06/26/2024] [Accepted: 07/01/2024] [Indexed: 07/10/2024]
Abstract
Because most proteins have buried active sites, protein tunnels or channels play a crucial role in the transport of small molecules into buried cavities for enzymatic catalysis. Tunnels can critically modulate the biological process of protein-ligand recognition. Various molecular dynamics methods have been developed for exploring and exploiting the protein-ligand conformational space to extract high-resolution details of the binding processes, a recent example being energetically unbiased high-throughput adaptive sampling simulations. The current study systematically contrasted the role of integrating prior knowledge while generating useful initial protein-ligand configurations, called seeds, for these simulations. Using a nontrivial system of a haloalkane dehalogenase mutant with multiple transport tunnels leading to a deeply buried active site, simulations were employed to derive kinetic models describing the process of association and dissociation of the substrate molecule. The most knowledge-based seed generation enabled high-throughput simulations that could more consistently capture the entire transport process, explore the complex network of transport tunnels, and predict equilibrium dissociation constants, koff/kon, on the same order of magnitude as experimental measurements. Overall, the infusion of more knowledge into the initial seeds of adaptive sampling simulations could render analyses of transport mechanisms in enzymes more consistent even for very complex biomolecular systems, thereby promoting drug development efforts and the rational design of enzymes with buried active sites.
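The abstract above compares predicted equilibrium dissociation constants, koff/kon, with experiment at the order-of-magnitude level. A minimal Python sketch of that arithmetic follows. The function names are hypothetical, and the "within a factor of 10" reading of "same order of magnitude" is an assumption made here for illustration, not a criterion stated by the authors.

```python
import math


def dissociation_constant(k_off: float, k_on: float) -> float:
    """Return K_d = k_off / k_on.

    With k_off in 1/s and k_on in 1/(M*s), the result is in M.
    """
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    if k_off < 0:
        raise ValueError("k_off must be non-negative")
    return k_off / k_on


def same_order_of_magnitude(predicted: float, measured: float) -> bool:
    """Assumed criterion: the two positive values differ by less than a factor of 10."""
    if predicted <= 0 or measured <= 0:
        raise ValueError("both values must be positive")
    return abs(math.log10(predicted / measured)) < 1.0
```

With illustrative (not measured) rates, `dissociation_constant(2.0e3, 1.0e6)` gives a K_d of 2 mM, which `same_order_of_magnitude` would accept against a reference value of 5 mM and reject against 50 mM.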
Affiliation(s)
- Dheeraj Kumar Sarkar
  - Laboratory of Biomolecular Interactions and Transport, Department of Gene Expression, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznanskiego 6, Poznan 61-614, Poland
  - International Institute of Molecular and Cell Biology in Warsaw, Ks Trojdena 4, Warsaw 02-109, Poland
- Bartlomiej Surpeta
  - Laboratory of Biomolecular Interactions and Transport, Department of Gene Expression, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznanskiego 6, Poznan 61-614, Poland
  - International Institute of Molecular and Cell Biology in Warsaw, Ks Trojdena 4, Warsaw 02-109, Poland
- Jan Brezovsky
  - Laboratory of Biomolecular Interactions and Transport, Department of Gene Expression, Institute of Molecular Biology and Biotechnology, Faculty of Biology, Adam Mickiewicz University, Uniwersytetu Poznanskiego 6, Poznan 61-614, Poland
  - International Institute of Molecular and Cell Biology in Warsaw, Ks Trojdena 4, Warsaw 02-109, Poland
2
Sarkar D, Lee H, Vant JW, Turilli M, Vermaas JV, Jha S, Singharoy A. Adaptive Ensemble Refinement of Protein Structures in High Resolution Electron Microscopy Density Maps with Radical Augmented Molecular Dynamics Flexible Fitting. J Chem Inf Model 2023; 63:5834-5846. [PMID: 37661856 DOI: 10.1021/acs.jcim.3c00350] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2023]
Abstract
Recent advances in cryo-electron microscopy (cryo-EM) have enabled modeling macromolecular complexes that are essential components of the cellular machinery. The density maps derived from cryo-EM experiments are often integrated with manual, knowledge-driven or artificial intelligence-driven and physics-guided computational methods to build, fit, and refine molecular structures. Going beyond a single stationary-structure determination scheme, it is becoming more common to interpret the experimental data with an ensemble of models that contributes to an average observation. Hence, there is a need to decide on the quality of an ensemble of protein structures on-the-fly while refining them against the density maps. We introduce such an adaptive decision-making scheme during the molecular dynamics flexible fitting (MDFF) of biomolecules. Using RADICAL-Cybertools, the new RADICAL augmented MDFF implementation (R-MDFF) is examined in high-performance computing environments for refinement of two prototypical protein systems, adenylate kinase and carbon monoxide dehydrogenase. For these test cases, use of multiple replicas in flexible fitting with adaptive decision making in R-MDFF improves the overall correlation to the density by 40% relative to the refinements of the brute-force MDFF. The improvements are particularly significant at high, 2-3 Å map resolutions. More importantly, the ensemble model captures key features of biologically relevant molecular dynamics that are inaccessible to a single-model interpretation. Finally, the pipeline is applicable to systems of growing sizes, which is demonstrated using ensemble refinement of capsid proteins from the chimpanzee adenovirus. The overhead for decision making remains low and robust to computing environments. The software is publicly available on GitHub and includes a short user guide to install R-MDFF on different computing environments, from local Linux-based workstations to high-performance computing environments.
Affiliation(s)
- Daipayan Sarkar
  - MSU-DOE Plant Research Laboratory, East Lansing, Michigan 48824, United States
  - School of Molecular Sciences, Arizona State University, Tempe, Arizona 85281, United States
- Hyungro Lee
  - Pacific Northwest National Laboratory, Richland, Washington 99354, United States
  - Electrical & Computer Engineering, Rutgers University, New Brunswick, New Jersey 08854, United States
- John W Vant
  - School of Molecular Sciences, Arizona State University, Tempe, Arizona 85281, United States
- Matteo Turilli
  - Electrical & Computer Engineering, Rutgers University, New Brunswick, New Jersey 08854, United States
  - Computational Science Initiative, Brookhaven National Laboratory, Upton, New York 11973, United States
- Josh V Vermaas
  - MSU-DOE Plant Research Laboratory, East Lansing, Michigan 48824, United States
- Shantenu Jha
  - Electrical & Computer Engineering, Rutgers University, New Brunswick, New Jersey 08854, United States
  - Computational Science Initiative, Brookhaven National Laboratory, Upton, New York 11973, United States
- Abhishek Singharoy
  - School of Molecular Sciences, Arizona State University, Tempe, Arizona 85281, United States
3
Yasuda T, Morita R, Shigeta Y, Harada R. Protein Structure Validation Derives a Smart Conformational Search in a Physically Relevant Configurational Subspace. J Chem Inf Model 2022; 62:6217-6227. [PMID: 36449380 DOI: 10.1021/acs.jcim.2c01173] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/03/2022]
Abstract
Since proteins perform biological functions through their dynamic properties, molecular dynamics (MD) simulation is a sophisticated strategy for investigating their functions. Analyses of trajectories provide statistical information about a specific protein as a free-energy landscape (FEL). However, the timescale of normal MD is shorter than that of biological functions, resulting in statistically insufficient conformational sampling, finally leading to unreliable FEL calculation. To search for a broad configurational subspace, an external bias is imposed on a target protein as biased sampling. However, its regulation is challenging because the optimal strength of the perturbation is unknown. Furthermore, a physically irrelevant configurational subspace was searched when imposing an inappropriate external bias. To address this issue, we newly proposed an external biased regulation scheme known as the G-factor external bias limiter (GERBIL). In GERBIL, protein configurations generated by external bias are structurally validated by an indicator (G-factor), enabling the search for a physically relevant subspace. In addition to biased sampling, nonbiased sampling might search for a physically irrelevant configurational subspace because repeating multiple MD simulations from several initial structures tends to search for an overly broad configurational subspace. For this issue, the structural qualities of configurations generated by nonbiased sampling have not been investigated. Therefore, we confirmed whether the G-factor screened the collapsed (low-quality) configurations generated by nonbiased sampling. To address this issue, the outlier flooding method (OFLOOD) was adopted in GERBIL as a nonbiased sampling method, which is referred to as OFLOOD-GERBIL. OFLOOD rapidly expands a configurational subspace by resampling the rarely occurring states of a given protein and tends to search an overly broad subspace. 
Thus, we considered that GERBIL might improve the excessive conformational search of OFLOOD for a physically irrelevant configurational subspace. As a demonstration, OFLOOD and OFLOOD-GERBIL were applied to a globular protein (T4 lysozyme) and their conformational search qualities were assessed. Based on our assessment, normal OFLOOD without the outlier validation frequently sampled low-quality configurations, whereas OFLOOD-GERBIL with the outlier validation intensively sampled high-quality configurations. In conclusion, OFLOOD-GERBIL derives a smart conformational search in a physically relevant configurational subspace, indicating that protein structure validation works in both nonbiased and biased sampling methods.
Affiliation(s)
- Takunori Yasuda
  - College of Biological Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-0821, Japan
- Rikuri Morita
  - Center for Computational Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
- Yasuteru Shigeta
  - Center for Computational Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
- Ryuhei Harada
  - Center for Computational Sciences, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
4
Xi K, Zhu L. Automated Path Searching Reveals the Mechanism of Hydrolysis Enhancement by T4 Lysozyme Mutants. Int J Mol Sci 2022; 23:ijms232314628. [PMID: 36498954 PMCID: PMC9736071 DOI: 10.3390/ijms232314628] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/05/2022] [Revised: 11/16/2022] [Accepted: 11/19/2022] [Indexed: 11/25/2022]
Abstract
Bacteriophage T4 lysozyme (T4L) is a glycosidase that is widely applied as a natural antimicrobial agent in the food industry. Due to its wide applications and small size, T4L has been regarded as a model system for understanding protein dynamics and for large-scale protein engineering. Through structural insights from the single conformation of T4L, a series of mutations (L99A,G113A,R119P) have been introduced, which have successfully raised the fractional population of its only hydrolysis-competent excited state to 96%. However, the actual impact of these substitutions on its dynamics remains unclear, largely due to the lack of highly efficient sampling algorithms. Here, using our recently developed travelling-salesman-based automated path searching (TAPS), we located the minimum-free-energy path (MFEP) for the transition of three T4L mutants from their ground states to their excited states. All three mutants share a three-step transition: the flipping of F114, the rearrangement of α0/α1 helices, and final refinement. Remarkably, the MFEP revealed that the effects of the mutations are drastically beyond the expectations of their original design: (a) the G113A substitution not only enhances helicity but also fills the hydrophobic Cavity I and reduces the free energy barrier for flipping F114; (b) R119P barely changes the stability of the ground state but stabilizes the excited state through rarely reported polar contacts S117OG:N132ND2, E11OE1:R145NH1, and E11OE2:Q105NE2; (c) the residue W138 flips into Cavity I and further stabilizes the excited state for the triple mutant L99A,G113A,R119P. These novel insights that were unexpected in the original mutant design indicated the necessity of incorporating path searching into the workflow of rational protein engineering.
5
Kleiman DE, Shukla D. Multiagent Reinforcement Learning-Based Adaptive Sampling for Conformational Dynamics of Proteins. J Chem Theory Comput 2022; 18:5422-5434. [PMID: 36044642 DOI: 10.1021/acs.jctc.2c00683] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Indexed: 11/29/2022]
Abstract
Machine learning is increasingly applied to improve the efficiency and accuracy of molecular dynamics (MD) simulations. Although the growth of distributed computer clusters has allowed researchers to obtain higher amounts of data, unbiased MD simulations have difficulty sampling rare states, even under massively parallel adaptive sampling schemes. To address this issue, several algorithms inspired by reinforcement learning (RL) have arisen to promote exploration of the slow collective variables (CVs) of complex systems. Nonetheless, most of these algorithms are not well-suited to leverage the information gained by simultaneously sampling a system from different initial states (e.g., a protein in different conformations associated with distinct functional states). To fill this gap, we propose two algorithms inspired by multiagent RL that extend the functionality of closely related techniques (REAP and TSLC) to situations where the sampling can be accelerated by learning from different regions of the energy landscape through coordinated agents. Essentially, the algorithms work by remembering which agent discovered each conformation and sharing this information with others at the action-space discretization step. A stakes function is introduced to modulate how different agents sense rewards from discovered states of the system. The consequences are three-fold: (i) agents learn to prioritize CVs using only relevant data, (ii) redundant exploration is reduced, and (iii) agents that obtain higher stakes are assigned more actions. We compare our algorithm with other adaptive sampling techniques (least counts, REAP, TSLC, and AdaptiveBandit) to show and rationalize the gain in performance.
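Among the baselines named in the abstract above, least counts is the simplest to illustrate: after each round, restart simulations from the discrete states visited least often. The Python sketch below shows only that selection step. It assumes frames have already been assigned to discrete states by some clustering; the function name and the tie-breaking rule are hypothetical, and it does not implement the multiagent algorithms or the stakes function proposed in the paper.

```python
from collections import Counter
from typing import List, Sequence


def least_counts_seeds(labels: Sequence[int], n_seeds: int) -> List[int]:
    """Pick restart frames from the least-visited discrete states.

    labels[i] is the state (cluster) index assigned to frame i.
    Returns one frame index per selected state, least-visited state first.
    """
    if n_seeds < 1:
        raise ValueError("n_seeds must be at least 1")
    counts = Counter(labels)
    # Rank states by visit count; ties are broken by state index (an arbitrary,
    # deterministic choice made for this sketch).
    ranked = sorted(counts, key=lambda state: (counts[state], state))
    frames = list(labels)
    # Use the first frame observed in each selected state as its seed.
    return [frames.index(state) for state in ranked[:n_seeds]]
```

For the labels `[0, 0, 0, 1, 1, 2, 0, 3, 3, 3]`, state 2 was visited once and state 1 twice, so requesting two seeds returns the frame indices of their first occurrences.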
Affiliation(s)
- Diego E Kleiman
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
- Diwakar Shukla
  - Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
  - Department of Chemical and Biomolecular Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
  - Department of Bioengineering, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
  - Department of Plant Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, United States
6
Wieczór M, Genna V, Aranda J, Badia RM, Gelpí JL, Gapsys V, de Groot BL, Lindahl E, Municoy M, Hospital A, Orozco M. Pre-exascale HPC approaches for molecular dynamics simulations. Covid-19 research: A use case. Wiley Interdisciplinary Reviews. Computational Molecular Science 2022; 13:e1622. [PMID: 35935573 PMCID: PMC9347456 DOI: 10.1002/wcms.1622] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/02/2022] [Revised: 04/25/2022] [Accepted: 04/28/2022] [Indexed: 06/15/2023]
Abstract
Exascale computing has been a dream for ages and is close to becoming a reality that will impact how molecular simulations are being performed, as well as the quantity and quality of the information derived for them. We review how the biomolecular simulations field is anticipating these new architectures, making emphasis on recent work from groups in the BioExcel Center of Excellence for High Performance Computing. We exemplified the power of these simulation strategies with the work done by the HPC simulation community to fight Covid-19 pandemics. This article is categorized under: Data Science > Computer Algorithms and Programming; Data Science > Databases and Expert Systems; Molecular and Statistical Mechanics > Molecular Dynamics and Monte-Carlo Methods.
Affiliation(s)
- Miłosz Wieczór
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Department of Physical Chemistry, Gdansk University of Technology, Gdańsk, Poland
- Vito Genna
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Juan Aranda
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Josep Lluís Gelpí
  - Barcelona Supercomputing Center, Barcelona, Spain
  - Department of Biochemistry and Biomedicine, University of Barcelona, Barcelona, Spain
- Vytautas Gapsys
  - Max Planck Institute for Multidisciplinary Sciences, Computational Biomolecular Dynamics Group, Goettingen, Germany
- Bert L. de Groot
  - Max Planck Institute for Multidisciplinary Sciences, Computational Biomolecular Dynamics Group, Goettingen, Germany
- Erik Lindahl
  - Department of Applied Physics, Swedish e-Science Research Center, KTH Royal Institute of Technology, Stockholm, Sweden
  - Department of Biochemistry and Biophysics, Science for Life Laboratory, Stockholm University, Stockholm, Sweden
- Adam Hospital
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
- Modesto Orozco
  - Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain
  - Department of Biochemistry and Biomedicine, University of Barcelona, Barcelona, Spain
7
Ni D, Chai Z, Wang Y, Li M, Yu Z, Liu Y, Lu S, Zhang J. Along the allostery stream: Recent advances in computational methods for allosteric drug discovery. WIREs Computational Molecular Science 2021. [DOI: 10.1002/wcms.1585] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 12/11/2022]
Affiliation(s)
- Duan Ni
  - College of Pharmacy, Ningxia Medical University, Yinchuan, China
  - The Charles Perkins Centre, University of Sydney, Sydney, New South Wales, Australia
- Zongtao Chai
  - Department of Hepatic Surgery VI, Eastern Hepatobiliary Surgery Hospital, Second Military Medical University, Shanghai, China
- Ying Wang
  - State Key Laboratory of Oncogenes and Related Genes, Key Laboratory of Cell Differentiation and Apoptosis of Chinese Ministry of Education, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Mingyu Li
  - State Key Laboratory of Oncogenes and Related Genes, Key Laboratory of Cell Differentiation and Apoptosis of Chinese Ministry of Education, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Yaqin Liu
  - Medicinal Chemistry and Bioinformatics Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Shaoyong Lu
  - College of Pharmacy, Ningxia Medical University, Yinchuan, China
  - State Key Laboratory of Oncogenes and Related Genes, Key Laboratory of Cell Differentiation and Apoptosis of Chinese Ministry of Education, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Medicinal Chemistry and Bioinformatics Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
- Jian Zhang
  - College of Pharmacy, Ningxia Medical University, Yinchuan, China
  - State Key Laboratory of Oncogenes and Related Genes, Key Laboratory of Cell Differentiation and Apoptosis of Chinese Ministry of Education, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - Medicinal Chemistry and Bioinformatics Center, Shanghai Jiao Tong University School of Medicine, Shanghai, China
  - School of Pharmaceutical Sciences, Zhengzhou University, Zhengzhou, China
8
Chen M. Collective variable-based enhanced sampling and machine learning. The European Physical Journal B 2021; 94:211. [PMID: 34697536 PMCID: PMC8527828 DOI: 10.1140/epjb/s10051-021-00220-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/02/2021] [Accepted: 10/03/2021] [Indexed: 05/14/2023]
Abstract
Collective variable-based enhanced sampling methods have been widely used to study thermodynamic properties of complex systems. Efficiency and accuracy of these enhanced sampling methods are affected by two factors: constructing appropriate collective variables for enhanced sampling and generating accurate free energy surfaces. Recently, many machine learning techniques have been developed to improve the quality of collective variables and the accuracy of free energy surfaces. Although machine learning has achieved great successes in improving enhanced sampling methods, there are still many challenges and open questions. In this perspective, we shall review recent developments on integrating machine learning techniques and collective variable-based enhanced sampling approaches. We also discuss challenges and future research directions including generating kinetic information, exploring high-dimensional free energy surfaces, and efficiently sampling all-atom configurations.
Affiliation(s)
- Ming Chen
- Department of Chemistry, Purdue University, West Lafayette, IN 47907 USA
9
Torrillo PA, Bogetti AT, Chong LT. A Minimal, Adaptive Binning Scheme for Weighted Ensemble Simulations. J Phys Chem A 2021; 125:1642-1649. [PMID: 33577732 PMCID: PMC8091492 DOI: 10.1021/acs.jpca.0c10724] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Indexed: 01/08/2023]
Abstract
A promising approach for simulating rare events with rigorous kinetics is the weighted ensemble path sampling strategy. One challenge of this strategy is the division of configurational space into bins for sampling. Here we present a minimal adaptive binning (MAB) scheme for the automated, adaptive placement of bins along a progress coordinate within the framework of the weighted ensemble strategy. Results reveal that the MAB binning scheme, despite its simplicity, is more efficient than a manual, fixed binning scheme in generating transitions over large free energy barriers, generating a diversity of pathways, estimating rate constants, and sampling conformations. The scheme is general and extensible to any rare-events sampling strategy that employs progress coordinates.
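The general idea of placing bins adaptively along a progress coordinate can be sketched in a few lines of Python. The example below is a deliberately simplified illustration and not the MAB scheme itself, whose details the abstract does not give: it assumes a one-dimensional progress coordinate and simply re-spaces a fixed number of equal-width bins between the current trailing and leading walkers at each iteration. The function name is hypothetical.

```python
from typing import List, Sequence


def adaptive_bin_edges(coords: Sequence[float], n_bins: int) -> List[float]:
    """Return n_bins + 1 evenly spaced edges spanning the current walkers.

    coords holds the progress-coordinate value of each walker at this iteration.
    """
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    if not coords:
        raise ValueError("coords must not be empty")
    lo, hi = min(coords), max(coords)
    if lo == hi:
        # All walkers coincide: a single degenerate bin is all we can place.
        return [lo, hi]
    width = (hi - lo) / n_bins
    # Append hi explicitly so the last edge is exact despite rounding.
    return [lo + i * width for i in range(n_bins)] + [hi]
```

Because the edges are recomputed from the walkers' current extent, the bins follow the ensemble as it advances, in contrast to a manual scheme whose edges are fixed before the simulation starts.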
Affiliation(s)
- Paul A Torrillo
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Anthony T Bogetti
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Lillian T Chong
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States