1
Lee S, Wang D, Seeliger MA, Tiwary P. Calculating Protein-Ligand Residence Times Through State Predictive Information Bottleneck based Enhanced Sampling. bioRxiv 2024:2024.04.16.589710. [PMID: 38659748 PMCID: PMC11042289 DOI: 10.1101/2024.04.16.589710] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/26/2024]
Abstract
Understanding drug residence times in target proteins is key to improving drug efficacy and understanding target recognition in biochemistry. While drug residence time is just as important as binding affinity, atomic-level understanding of drug residence times through molecular dynamics (MD) simulations has been difficult, primarily due to the extremely long timescales involved. Recent advances in rare event sampling have allowed us to reach these timescales, yet predicting protein-ligand residence times remains a significant challenge. Here we present a semi-automated protocol to calculate ligand residence times across 12 orders of magnitude in timescale. In our proposed framework, we integrate a deep learning-based method, the state predictive information bottleneck (SPIB), to learn an approximate reaction coordinate (RC) and use it to guide the enhanced sampling method metadynamics. We demonstrate the performance of our algorithm by applying it to six different protein-ligand complexes with available benchmark residence times, including the dissociation of the widely studied anti-cancer drug Imatinib (Gleevec) from both wild-type Abl kinase and drug-resistant mutants. We show how our protocol can recover quantitatively accurate residence times, potentially opening avenues for deeper insights into drug development possibilities and ligand recognition mechanisms.
Affiliation(s)
- Suemin Lee
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
- Dedi Wang
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
- Markus A. Seeliger
  - Department of Pharmacological Sciences, Stony Brook University, Stony Brook, NY 11794-8651, USA
- Pratyush Tiwary
  - Biophysics Program and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
  - Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park 20742, USA
  - University of Maryland Institute for Health Computing, Rockville, United States
2
Li J, Wang L, Zhu Z, Song C. Exploring the Alternative Conformation of a Known Protein Structure Based on Contact Map Prediction. J Chem Inf Model 2024; 64:301-315. [PMID: 38117138 PMCID: PMC10777399 DOI: 10.1021/acs.jcim.3c01381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/29/2023] [Revised: 12/03/2023] [Accepted: 12/05/2023] [Indexed: 12/21/2023]
Abstract
The rapid development of deep learning-based methods has considerably advanced the field of protein structure prediction. The accuracy of predicting the 3D structures of simple proteins is comparable to that of experimentally determined structures, providing broad possibilities for structure-based biological studies. Another critical question is whether and how multistate structures can be predicted from a given protein sequence. In this study, analysis of tens of two-state proteins demonstrated that deep learning-based contact map predictions contain structural information on both states, which suggests that it is probably appropriate to change the target of deep learning-based protein structure prediction from one specific structure to multiple likely structures. Furthermore, by combining deep learning- and physics-based computational methods, we developed a protocol for exploring alternative conformations from a known structure of a given protein, by which we successfully approached the holo-state conformations of multiple representative proteins from their apo-state structures.
Affiliation(s)
- Jiaxuan Li
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Lei Wang
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Zefeng Zhu
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
- Chen Song
  - Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
  - Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
3
Zheng LE, Barethiya S, Nordquist E, Chen J. Machine Learning Generation of Dynamic Protein Conformational Ensembles. Molecules 2023; 28:4047. [PMID: 37241789 PMCID: PMC10220786 DOI: 10.3390/molecules28104047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/10/2023] [Revised: 05/04/2023] [Accepted: 05/09/2023] [Indexed: 05/28/2023] Open
Abstract
Machine learning has achieved remarkable success across a broad range of scientific and engineering disciplines, particularly its use for predicting native protein structures from sequence information alone. However, biomolecules are inherently dynamic, and there is a pressing need for accurate predictions of dynamic structural ensembles across multiple functional levels. These problems range from the relatively well-defined task of predicting conformational dynamics around the native state of a protein, which traditional molecular dynamics (MD) simulations are particularly adept at handling, to generating large-scale conformational transitions connecting distinct functional states of structured proteins or numerous marginally stable states within the dynamic ensembles of intrinsically disordered proteins. Machine learning has been increasingly applied to learn low-dimensional representations of protein conformational spaces, which can then be used to drive additional MD sampling or directly generate novel conformations. These methods promise to greatly reduce the computational cost of generating dynamic protein ensembles, compared to traditional MD simulations. In this review, we examine recent progress in machine learning approaches towards generative modeling of dynamic protein ensembles and emphasize the crucial importance of integrating advances in machine learning, structural data, and physical principles to achieve these ambitious goals.
Affiliation(s)
- Li-E Zheng
  - Department of Gynecology, The First Affiliated Hospital of Fujian Medical University, Fuzhou 350005, China
- Shrishti Barethiya
  - Department of Chemistry, University of Massachusetts Amherst, Amherst, MA 01003, USA
- Erik Nordquist
  - Department of Chemistry, University of Massachusetts Amherst, Amherst, MA 01003, USA
- Jianhan Chen
  - Department of Chemistry, University of Massachusetts Amherst, Amherst, MA 01003, USA
4
González-Paz L, Lossada C, Hurtado-León ML, Fernández-Materán FV, Paz JL, Parvizi S, Cardenas Castillo RE, Romero F, Alvarado YJ. Intrinsic Dynamics of the ClpXP Proteolytic Machine Using Elastic Network Models. ACS Omega 2023; 8:7302-7318. [PMID: 36873006 PMCID: PMC9979342 DOI: 10.1021/acsomega.2c04347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2022] [Accepted: 10/25/2022] [Indexed: 06/18/2023]
Abstract
The ClpXP complex is an ATP-dependent mitochondrial matrix protease that binds, unfolds, translocates, and subsequently degrades specific protein substrates. Its mechanisms of operation are still being debated, and several have been proposed, including the sequential translocation of two residues (SC/2R), six residues (SC/6R), and even long-pass probabilistic models. Therefore, it has been suggested to employ biophysical-computational approaches that can determine the kinetics and thermodynamics of the translocation. In this sense, and based on the apparent inconsistency between structural and functional studies, we propose to apply biophysical approaches based on elastic network models (ENM) to study the intrinsic dynamics of the theoretically most probable hydrolysis mechanism. The proposed ENM models suggest that the ClpP region is decisive for the stabilization of the ClpXP complex, contributing to the flexibility of the residues adjacent to the pore and favoring an increase in pore size and, therefore, in the energy of interaction of its residues with a larger portion of the substrate. It is predicted that the complex may undergo a stable configurational change once assembled and that the deformability of the assembled system is oriented to increase the rigidity of the domains of each region (ClpP and ClpX) and to gain flexibility of the pore. Under the conditions of this study, our predictions could suggest a mechanism of interaction in which the substrate passes through the unfolding of the pore in parallel with a folding of the bottleneck. The variations in the distance calculated by molecular dynamics could allow the passage of a substrate with a size equivalent to ∼3 residues. The theoretical behavior of the pore and the stability and energy of binding to the substrate based on ENM models suggest that in this system, there are thermodynamic, structural, and configurational conditions that allow a possible translocation mechanism that is not strictly sequential.
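As a rough illustration of the kind of calculation an elastic network model involves, the Python sketch below builds a Gaussian network model (GNM), one common ENM variant, for a toy chain of nodes and estimates relative residue fluctuations from the pseudo-inverse of its connectivity matrix. The abstract does not state which ENM variant, cutoff, or software the authors used, so the function names, the 7.0 Å cutoff, and the straight-chain coordinates are all assumptions made for illustration only.

```python
import numpy as np

def kirchhoff_matrix(coords, cutoff=7.0):
    """Build the GNM connectivity (Kirchhoff) matrix for an (N, 3) array.

    Off-diagonal entries are -1 for node pairs within `cutoff`, 0 otherwise;
    each diagonal entry is the number of contacts of that node.
    The 7.0 cutoff is an assumed, commonly quoted value, not from the paper.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    contact = (dist <= cutoff) & ~np.eye(len(coords), dtype=bool)
    gamma = -contact.astype(float)
    np.fill_diagonal(gamma, contact.sum(axis=1))
    return gamma

def gnm_fluctuations(coords, cutoff=7.0):
    """Relative mean-square fluctuations from the nonzero GNM modes.

    Assumes the contact network is connected, so exactly one eigenvalue
    is (numerically) zero and is skipped in the pseudo-inverse.
    """
    gamma = kirchhoff_matrix(coords, cutoff)
    evals, evecs = np.linalg.eigh(gamma)
    nonzero = evals > 1e-8
    pinv = (evecs[:, nonzero] / evals[nonzero]) @ evecs[:, nonzero].T
    return np.diag(pinv)

# Hypothetical toy system: 10 nodes on a straight line, 3.8 apart,
# so each node only contacts its immediate neighbours.
coords = np.column_stack([3.8 * np.arange(10), np.zeros(10), np.zeros(10)])
gamma = kirchhoff_matrix(coords)
msf = gnm_fluctuations(coords)
```

In this toy chain the terminal nodes have the fewest contacts and therefore the largest predicted fluctuations, which is the sort of flexibility-versus-rigidity pattern ENM analyses are typically used to read off.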
Affiliation(s)
- Lenin González-Paz
  - Facultad Experimental de Ciencias (FEC), Departamento de Biología, Laboratorio de Genética y Biología Molecular (LGBM), Universidad del Zulia (LUZ), 4001 Maracaibo, Zulia, República Bolivariana de Venezuela
  - Centro de Biomedicina Molecular (CBM). Laboratorio de Biocomputación (LB), Instituto Venezolano de Investigaciones Científicas (IVIC), 4001 Maracaibo, Zulia, República Bolivariana de Venezuela
- Carla Lossada
  - Centro de Biomedicina Molecular (CBM). Laboratorio de Biocomputación (LB), Instituto Venezolano de Investigaciones Científicas (IVIC), 4001 Maracaibo, Zulia, República Bolivariana de Venezuela
- Maria Laura Hurtado-León
  - Facultad Experimental de Ciencias (FEC), Departamento de Biología, Laboratorio de Genética y Biología Molecular (LGBM), Universidad del Zulia (LUZ), 4001 Maracaibo, Zulia, República Bolivariana de Venezuela
- Francelys V. Fernández-Materán
  - Centro de Biomedicina Molecular (CBM). Laboratorio de Biocomputación (LB), Instituto Venezolano de Investigaciones Científicas (IVIC), 4001 Maracaibo, Zulia, República Bolivariana de Venezuela
- José Luis Paz
  - Departamento Académico de Química Inorgánica, Facultad de Química e Ingeniería Química, Universidad Nacional Mayor de San Marcos, 15081 Lima, Perú
- Shayan Parvizi
  - Pulmonary, Critical Care and Sleep Medicine, Baylor College of Medicine, Houston, Texas 77030, United States
- Freddy Romero
  - Pulmonary, Critical Care and Sleep Medicine, Baylor College of Medicine, Houston, Texas 77030, United States
- Ysaias J. Alvarado
  - Centro de Biomedicina Molecular (CBM), Laboratorio de Química Biofísica Teórica y Experimental (LQBTE), Instituto Venezolano de Investigaciones Científicas (IVIC), 4001 Maracaibo, Zulia, República Bolivariana de Venezuela
5
Xi K, Zhu L. Automated Path Searching Reveals the Mechanism of Hydrolysis Enhancement by T4 Lysozyme Mutants. Int J Mol Sci 2022; 23:14628. [PMID: 36498954 PMCID: PMC9736071 DOI: 10.3390/ijms232314628] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2022] [Revised: 11/16/2022] [Accepted: 11/19/2022] [Indexed: 11/25/2022] Open
Abstract
Bacteriophage T4 lysozyme (T4L) is a glycosidase that is widely applied as a natural antimicrobial agent in the food industry. Due to its wide applications and small size, T4L has been regarded as a model system for understanding protein dynamics and for large-scale protein engineering. Through structural insights from the single conformation of T4L, a series of mutations (L99A,G113A,R119P) have been introduced, which have successfully raised the fractional population of its only hydrolysis-competent excited state to 96%. However, the actual impact of these substitutions on its dynamics remains unclear, largely due to the lack of highly efficient sampling algorithms. Here, using our recently developed travelling-salesman-based automated path searching (TAPS), we located the minimum-free-energy path (MFEP) for the transition of three T4L mutants from their ground states to their excited states. All three mutants share a three-step transition: the flipping of F114, the rearrangement of α0/α1 helices, and final refinement. Remarkably, the MFEP revealed that the effects of the mutations are drastically beyond the expectations of their original design: (a) the G113A substitution not only enhances helicity but also fills the hydrophobic Cavity I and reduces the free energy barrier for flipping F114; (b) R119P barely changes the stability of the ground state but stabilizes the excited state through rarely reported polar contacts S117OG:N132ND2, E11OE1:R145NH1, and E11OE2:Q105NE2; (c) the residue W138 flips into Cavity I and further stabilizes the excited state for the triple mutant L99A,G113A,R119P. These novel insights that were unexpected in the original mutant design indicated the necessity of incorporating path searching into the workflow of rational protein engineering.
6
Conformational Stability and Denaturation Processes of Proteins Investigated by Electrophoresis under Extreme Conditions. Molecules 2022; 27:6861. [PMID: 36296453 PMCID: PMC9610776 DOI: 10.3390/molecules27206861] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Revised: 10/10/2022] [Accepted: 10/10/2022] [Indexed: 11/17/2022] Open
Abstract
The functional structure of proteins results from marginally stable folded conformations. Reversible unfolding, irreversible denaturation, and deterioration can be caused by chemical and physical agents due to changes in the physicochemical conditions of pH, ionic strength, temperature, pressure, and electric field or due to the presence of a cosolvent that perturbs the delicate balance between stabilizing and destabilizing interactions and eventually induces chemical modifications. For most proteins, denaturation is a complex process involving transient intermediates in several reversible and eventually irreversible steps. Knowledge of protein stability and denaturation processes is mandatory for the development of enzymes as industrial catalysts, biopharmaceuticals, analytical and medical bioreagents, and safe industrial food. Electrophoresis techniques operating under extreme conditions are convenient tools for analyzing unfolding transitions, trapping transient intermediates, and gaining insight into the mechanisms of denaturation processes. Moreover, quantitative analysis of electrophoretic mobility transition curves allows the estimation of the conformational stability of proteins. These approaches include polyacrylamide gel electrophoresis and capillary zone electrophoresis under cold, heat, and hydrostatic pressure and in the presence of non-ionic denaturing agents or stabilizers such as polyols and heavy water. Lastly, after exposure to extremes of physical conditions, electrophoresis under standard conditions provides information on irreversible processes, slow conformational drifts, and slow renaturation processes. The impressive developments of enzyme technology with multiple applications in fine chemistry, biopharmaceutics, and nanomedicine prompted us to revisit the potentialities of these electrophoretic approaches. This feature review is illustrated with published and unpublished results obtained by the authors on cholinesterases and paraoxonase, two physiologically and toxicologically important enzymes.
7
Abstract
AlphaFold has burst into our lives. A powerful algorithm that underscores the strength of biological sequence data and artificial intelligence (AI). AlphaFold has appended projects and research directions. The database it has been creating promises an untold number of applications with vast potential impacts that are still difficult to surmise. AI approaches can revolutionize personalized treatments and usher in better-informed clinical trials. They promise to make giant leaps toward reshaping and revamping drug discovery strategies, selecting and prioritizing combinations of drug targets. Here, we briefly overview AI in structural biology, including in molecular dynamics simulations and prediction of microbiota–human protein–protein interactions. We highlight the advancements accomplished by the deep-learning-powered AlphaFold in protein structure prediction and their powerful impact on the life sciences. At the same time, AlphaFold does not resolve the decades-long protein folding challenge, nor does it identify the folding pathways. The models that AlphaFold provides do not capture conformational mechanisms like frustration and allostery, which are rooted in ensembles, and controlled by their dynamic distributions. Allostery and signaling are properties of populations. AlphaFold also does not generate ensembles of intrinsically disordered proteins and regions, instead describing them by their low structural probabilities. Since AlphaFold generates single ranked structures, rather than conformational ensembles, it cannot elucidate the mechanisms of allosteric activating driver hotspot mutations nor of allosteric drug resistance. However, by capturing key features, deep learning techniques can use the single predicted conformation as the basis for generating a diverse ensemble.
Affiliation(s)
- Ruth Nussinov
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States
  - Department of Human Molecular Genetics and Biochemistry, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Mingzhen Zhang
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States
- Yonglan Liu
  - Cancer Innovation Laboratory, National Cancer Institute, Frederick, Maryland 21702, United States
- Hyunbum Jang
  - Computational Structural Biology Section, Frederick National Laboratory for Cancer Research, Frederick, Maryland 21702, United States
8
9
Bogetti AT, Presti MF, Loh SN, Chong LT. The Next Frontier for Designing Switchable Proteins: Rational Enhancement of Kinetics. J Phys Chem B 2021; 125:9069-9077. [PMID: 34324338 DOI: 10.1021/acs.jpcb.1c04082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Designing proteins that can switch between active (ON) and inactive (OFF) conformations in response to signals such as ligand binding and incident light has been a tantalizing endeavor in protein engineering for over a decade. While such designs have yielded novel biosensors, therapeutic agents, and smart biomaterials, the response times (times for switching ON and OFF) of many switches have been too slow to be of practical use. Among the defining properties of such switches, the kinetics of switching has been the most challenging to optimize. This is largely due to the difficulty of characterizing the structures of transient states, which are required for manipulating the height of the effective free energy barrier between the ON and OFF states. We share our perspective of the most promising new experimental and computational strategies over the past several years for tackling this next frontier for designing switchable proteins.
Affiliation(s)
- Anthony T Bogetti
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
- Maria F Presti
  - Department of Biochemistry and Molecular Biology, State University of New York Upstate Medical University, Syracuse, New York 13210, United States
- Stewart N Loh
  - Department of Biochemistry and Molecular Biology, State University of New York Upstate Medical University, Syracuse, New York 13210, United States
- Lillian T Chong
  - Department of Chemistry, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, United States
10
Shen R, Crean RM, Johnson SJ, Kamerlin SCL, Hengge AC. Single Residue on the WPD-Loop Affects the pH Dependency of Catalysis in Protein Tyrosine Phosphatases. JACS Au 2021; 1:646-659. [PMID: 34308419 PMCID: PMC8297725 DOI: 10.1021/jacsau.1c00054] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/09/2021] [Indexed: 05/08/2023]
Abstract
Catalysis by protein tyrosine phosphatases (PTPs) relies on the motion of a flexible protein loop (the WPD-loop) that carries a residue acting as a general acid/base catalyst during the PTP-catalyzed reaction. The orthogonal substitutions of a noncatalytic residue in the WPD-loops of YopH and PTP1B result in shifted pH-rate profiles from an altered kinetic pKa of the nucleophilic cysteine. Compared to wild type, the G352T YopH variant has a broadened pH-rate profile, similar activity at optimal pH, but significantly higher activity at low pH. Changes in the corresponding PTP1B T177G variant are more modest and in the opposite direction, with a narrowed pH profile and less activity in the most acidic range. Crystal structures of the variants show no structural perturbations but suggest an increased preference for the WPD-loop-closed conformation. Computational analysis confirms a shift in loop conformational equilibrium in favor of the closed conformation, arising from a combination of increased stability of the closed state and destabilization of the loop-open state. Simulations identify the origins of this population shift, revealing differences in the flexibility of the WPD-loop and neighboring regions. Our results demonstrate that changes to the pH dependency of catalysis by PTPs can result from small changes in amino acid composition in their WPD-loops affecting only loop dynamics and conformational equilibrium. The perturbation of kinetic pKa values of catalytic residues by nonchemical processes affords a means for nature to alter an enzyme's pH dependency by a less disruptive path than altering electrostatic networks around catalytic residues themselves.
Affiliation(s)
- Ruidan Shen
  - Department of Chemistry and Biochemistry, Utah State University, Logan, Utah 84322-0300, United States
- Rory M. Crean
  - Science for Life Laboratory, Department of Chemistry − BMC, Uppsala University, Box 576, S-751 23 Uppsala, Sweden
- Sean J. Johnson
  - Department of Chemistry and Biochemistry, Utah State University, Logan, Utah 84322-0300, United States
- Shina C. L. Kamerlin
  - Science for Life Laboratory, Department of Chemistry − BMC, Uppsala University, Box 576, S-751 23 Uppsala, Sweden
- Alvan C. Hengge
  - Department of Chemistry and Biochemistry, Utah State University, Logan, Utah 84322-0300, United States
11
Ferguson AL, Hachmann J, Miller TF, Pfaendtner J. The Journal of Physical Chemistry A/B/C Virtual Special Issue on Machine Learning in Physical Chemistry. J Phys Chem A 2021; 124:9113-9118. [PMID: 33147969 DOI: 10.1021/acs.jpca.0c09205] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
12
Ferguson AL, Hachmann J, Miller TF, Pfaendtner J. The Journal of Physical Chemistry A/B/C Virtual Special Issue on Machine Learning in Physical Chemistry. J Phys Chem B 2021; 124:9767-9772. [PMID: 33147970 DOI: 10.1021/acs.jpcb.0c09206] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]
13
Ramanathan A, Ma H, Parvatikar A, Chennubhotla SC. Artificial intelligence techniques for integrative structural biology of intrinsically disordered proteins. Curr Opin Struct Biol 2021; 66:216-224. [PMID: 33421906 DOI: 10.1016/j.sbi.2020.12.001] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 09/30/2020] [Revised: 12/01/2020] [Accepted: 12/03/2020] [Indexed: 12/16/2022]
Abstract
We outline recent developments in artificial intelligence (AI) and machine learning (ML) techniques for integrative structural biology of intrinsically disordered protein (IDP) ensembles. IDPs challenge the traditional protein structure-function paradigm by adapting their conformations in response to specific binding partners, leading them to mediate diverse and often complex cellular functions such as biological signaling, self-organization, and compartmentalization. Obtaining mechanistic insights into their function can therefore be challenging for traditional structural determination techniques. Often, scientists have to rely on piecemeal evidence drawn from diverse experimental techniques to characterize their functional mechanisms. Multiscale simulations can help bridge critical knowledge gaps about IDP structure-function relationships; however, these techniques also face challenges in resolving emergent phenomena within IDP conformational ensembles. We posit that scalable statistical inference techniques can effectively integrate information gleaned from multiple experimental techniques as well as from simulations, thus providing access to atomistic details of these emergent phenomena.
Affiliation(s)
- Arvind Ramanathan
  - Data Science & Learning Division, Argonne National Laboratory, Lemont, IL 60439, United States
  - Consortium for Advanced Science and Engineering (CASE), University of Chicago, Hyde Park, IL, United States
- Heng Ma
  - Data Science & Learning Division, Argonne National Laboratory, Lemont, IL 60439, United States
- Akash Parvatikar
  - Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15260, United States
- S Chakra Chennubhotla
  - Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15260, United States
14
Pant S, Smith Z, Wang Y, Tajkhorshid E, Tiwary P. Confronting pitfalls of AI-augmented molecular dynamics using statistical physics. J Chem Phys 2020; 153:234118. [PMID: 33353347 PMCID: PMC7863682 DOI: 10.1063/5.0030931] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/25/2020] [Accepted: 11/29/2020] [Indexed: 12/31/2022] Open
Abstract
Artificial intelligence (AI)-based approaches have had indubitable impact across the sciences through the ability to extract relevant information from raw data. Recently, AI has also found use in enhancing the efficiency of molecular simulations, wherein AI derived slow modes are used to accelerate the simulation in targeted ways. However, while typical fields where AI is used are characterized by a plethora of data, molecular simulations, per construction, suffer from limited sampling and thus limited data. As such, the use of AI in molecular simulations can suffer from a dangerous situation where the AI-optimization could get stuck in spurious regimes, leading to incorrect characterization of the reaction coordinate (RC) for the problem at hand. When such an incorrect RC is then used to perform additional simulations, one could start to deviate progressively from the ground truth. To deal with this problem of spurious AI-solutions, here, we report a novel and automated algorithm using ideas from statistical mechanics. It is based on the notion that a more reliable AI-solution will be one that maximizes the timescale separation between slow and fast processes. To learn this timescale separation even from limited data, we use a maximum caliber-based framework. We show the applicability of this automatic protocol for three classic benchmark problems, namely, the conformational dynamics of a model peptide, ligand-unbinding from a protein, and folding/unfolding energy landscape of the C-terminal domain of protein G. We believe that our work will lead to increased and robust use of trustworthy AI in molecular simulations of complex systems.
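To make the notion of timescale separation concrete, the Python sketch below computes the gaps between consecutive eigenvalues of a small transition matrix and reports where the largest gap falls. It is only an illustration of the quantity being maximized: the abstract says the authors estimate this separation from limited data with a maximum caliber-based framework, which is not reproduced here, and the 4-state matrix, function names, and the choice of scanning every possible split are assumptions for this example.

```python
import numpy as np

def sorted_eigenvalues(transition_matrix):
    """Eigenvalues of a row-stochastic matrix, sorted from largest to smallest."""
    evals = np.linalg.eigvals(transition_matrix)
    return np.sort(evals.real)[::-1]

def spectral_gap(transition_matrix, n_slow):
    """Gap between the n_slow-th and (n_slow + 1)-th largest eigenvalues."""
    evals = sorted_eigenvalues(transition_matrix)
    return evals[n_slow - 1] - evals[n_slow]

# Hypothetical 4-state model: states 0-1 and 2-3 form two wells that
# interconvert quickly inside each well and rarely across the barrier.
T = np.array([
    [0.90, 0.10, 0.00, 0.00],
    [0.10, 0.89, 0.01, 0.00],
    [0.00, 0.01, 0.89, 0.10],
    [0.00, 0.00, 0.10, 0.90],
])

# Scan every split of the spectrum into "slow" and "fast" eigenvalues.
gaps = [spectral_gap(T, k) for k in (1, 2, 3)]
best_k = 1 + int(np.argmax(gaps))
```

For this toy matrix the largest gap sits after the second eigenvalue, that is, after the stationary mode and the single slow barrier-crossing mode, which matches the two-well picture built into the example.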
Affiliation(s)
- Shashank Pant
  - NIH Center for Macromolecular Modeling and Bioinformatics, Beckman Institute for Advanced Science and Technology, Department of Biochemistry, Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA
- Emad Tajkhorshid
  - NIH Center for Macromolecular Modeling and Bioinformatics, Beckman Institute for Advanced Science and Technology, Department of Biochemistry, Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, Illinois 61801, USA