1
Saharkhiz S, Mostafavi M, Birashk A, Karimian S, Khalilollah S, Jaferian S, Yazdani Y, Alipourfard I, Huh YS, Farani MR, Akhavan-Sigari R. The State-of-the-Art Overview to Application of Deep Learning in Accurate Protein Design and Structure Prediction. Top Curr Chem (Cham) 2024; 382:23. [PMID: 38965117 PMCID: PMC11224075 DOI: 10.1007/s41061-024-00469-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/04/2024] [Accepted: 06/09/2024] [Indexed: 07/06/2024]
Abstract
In recent years, there has been a notable increase in the scientific community's interest in rational protein design. The prospect of designing an amino acid sequence that can reliably fold into a desired three-dimensional structure and exhibit the intended function is captivating. However, a major challenge in this endeavor lies in accurately predicting the resulting protein structure. The exponential growth of protein databases has fueled the advancement of the field, while newly developed algorithms have pushed the boundaries of what was previously achievable in structure prediction. In particular, using deep learning methods instead of brute force approaches has emerged as a faster and more accurate strategy. These deep-learning techniques leverage the vast amount of data available in protein databases to extract meaningful patterns and predict protein structures with improved precision. In this article, we explore the recent developments in the field of protein structure prediction. We delve into the newly developed methods that leverage deep learning approaches, highlighting their significance and potential for advancing our understanding of protein design.
Affiliation(s)
- Saber Saharkhiz
  - Division of Neuroscience, Department of Cellular and Molecular Medicine, Faculty of Medicine, University of Ottawa, Ottawa, ON, Canada
- Mehrnaz Mostafavi
  - Faculty of Allied Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
- Amin Birashk
  - Department of Computer Science, The University of Texas at Dallas, Richardson, TX, USA
- Shiva Karimian
  - Electrical and Computer Research Center, Sanandaj Azad University, Sanandaj, Iran
- Shayan Khalilollah
  - Department of Neurosurgery, Faculty of Medicine, Tehran Medical Sciences, Islamic Azad University, Tehran, Iran
- Sohrab Jaferian
  - Goergen Institute for Data Science, University of Rochester, Rochester, NY, USA
- Yalda Yazdani
  - Immunology Research Center, Tabriz University of Medical Sciences, Tabriz, Iran
- Iraj Alipourfard
  - Institute of Physical Chemistry, Polish Academy of Sciences, Marcina Kasprzaka 44/52, 01-224, Warsaw, Poland
- Yun Suk Huh
  - Department of Biological Engineering, Inha University, Incheon, Republic of Korea

2
Turzo SMBA, Seffernick JT, Lyskov S, Lindert S. Predicting ion mobility collision cross sections using projection approximation with ROSIE-PARCS webserver. Brief Bioinform 2023; 24:bbad308. [PMID: 37609950 PMCID: PMC10516336 DOI: 10.1093/bib/bbad308] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/20/2023] [Revised: 07/03/2023] [Accepted: 08/08/2023] [Indexed: 08/24/2023] Open
Abstract
Ion mobility coupled to mass spectrometry informs on the shape and size of protein structures in the form of a collision cross section (CCS_IM). Although there are several computational methods for predicting CCS_IM based on protein structures, including our previously developed projection approximation using rough circular shapes (PARCS), the process usually requires prior experience with the command-line interface. To overcome this challenge, here we present a web application on the Rosetta Online Server that Includes Everyone (ROSIE) webserver to predict CCS_IM from protein structure using projection approximation with PARCS. In this web interface, the user is only required to provide one or more PDB files as input. Results from our case studies suggest that CCS_IM predictions (with ROSIE-PARCS) are highly accurate with an average error of 6.12%. Furthermore, the absolute difference between CCS_IM and CCS_PARCS can help in distinguishing accurate from inaccurate AlphaFold2 protein structure predictions. ROSIE-PARCS is designed with a user-friendly interface, is available publicly and is free to use. The ROSIE-PARCS web interface is supported by all major web browsers and can be accessed via this link (https://rosie.graylab.jhu.edu).
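The accuracy figure quoted in this abstract can be made concrete with a small calculation. The Python sketch below assumes that the error is the absolute difference between the experimental and the predicted collision cross section, expressed as a percentage of the experimental value. That definition, the function names and the numbers in the usage note are illustrative assumptions; none of this is code or data from ROSIE-PARCS.

```python
def ccs_abs_difference(ccs_experimental, ccs_predicted):
    """Absolute difference between two CCS values given in the same units."""
    return abs(ccs_experimental - ccs_predicted)


def ccs_percent_error(ccs_experimental, ccs_predicted):
    """Absolute difference as a percentage of the experimental value.

    This definition of "error" is an assumption made for illustration.
    """
    if ccs_experimental <= 0:
        raise ValueError("experimental CCS must be positive")
    diff = ccs_abs_difference(ccs_experimental, ccs_predicted)
    return 100.0 * diff / ccs_experimental


def mean_percent_error(pairs):
    """Average percent error over (experimental, predicted) pairs."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("at least one (experimental, predicted) pair is required")
    return sum(ccs_percent_error(e, p) for e, p in pairs) / len(pairs)
```

For example, with made-up values, `mean_percent_error([(1000.0, 940.0), (2000.0, 2100.0)])` averages a 6% and a 5% error. The same absolute difference could be computed for each candidate model when ranking predicted structures against a measured value.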
Affiliation(s)
- S M Bargeen Alam Turzo
  - Department of Chemistry and Biochemistry and Resource for Native Mass Spectrometry Guided Structural Biology, Ohio State University, Columbus, OH 43210, USA
- Justin T Seffernick
  - Department of Chemistry and Biochemistry and Resource for Native Mass Spectrometry Guided Structural Biology, Ohio State University, Columbus, OH 43210, USA
- Sergey Lyskov
  - Department of Chemical and Biomolecular Engineering, Johns Hopkins University, Baltimore, MD 21218, USA
- Steffen Lindert
  - Department of Chemistry and Biochemistry and Resource for Native Mass Spectrometry Guided Structural Biology, Ohio State University, Columbus, OH 43210, USA

3
Gadanecz M, Fazekas Z, Pálfy G, Karancsiné Menyhárd D, Perczel A. NMR-Chemical-Shift-Driven Protocol Reveals the Cofactor-Bound, Complete Structure of Dynamic Intermediates of the Catalytic Cycle of Oncogenic KRAS G12C Protein and the Significance of the Mg2+ Ion. Int J Mol Sci 2023; 24:12101. [PMID: 37569478 PMCID: PMC10418480 DOI: 10.3390/ijms241512101] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/30/2023] [Revised: 07/22/2023] [Accepted: 07/25/2023] [Indexed: 08/13/2023] Open
Abstract
In this work, catalytically significant states of the oncogenic G12C variant of KRAS, those of Mg2+-free and Mg2+-bound GDP-loaded forms, have been determined using CS-Rosetta software and NMR-data-driven molecular dynamics simulations. There are several Mg2+-bound G12C KRAS/GDP structures deposited in the Protein Data Bank (PDB), so this system was used as a reference, while the structure of the Mg2+-free but GDP-bound state of the RAS cycle has not been determined previously. Due to the high flexibility of the Switch-I and Switch-II regions, which also happen to be the catalytically most significant segments, only chemical shift information could be collected for the most important regions of both systems. CS-Rosetta was used to derive an "NMR ensemble" based on the measured chemical shifts, which, however, did not contain the nonprotein components of the complex. We developed a torsional restraint set for backbone torsions based on the CS-Rosetta ensembles for MD simulations, overriding the force-field-based parametrization in the presence of the reinserted cofactors. This protocol (csdMD) resulted in complete models for both systems that also retained the structural features and heterogeneity defined by the measured chemical shifts and allowed a detailed comparison of the Mg2+-bound and Mg2+-free states of G12C KRAS/GDP.
Affiliation(s)
- Márton Gadanecz
  - Laboratory of Structural Chemistry and Biology, Institute of Chemistry, Eötvös Loránd University, Pázmány Péter stny. 1/A, H-1117 Budapest, Hungary
  - Hevesy György PhD School of Chemistry, Eötvös Loránd University, Pázmány Péter stny. 1/A, H-1117 Budapest, Hungary
- Zsolt Fazekas
  - Laboratory of Structural Chemistry and Biology, Institute of Chemistry, Eötvös Loránd University, Pázmány Péter stny. 1/A, H-1117 Budapest, Hungary
  - Hevesy György PhD School of Chemistry, Eötvös Loránd University, Pázmány Péter stny. 1/A, H-1117 Budapest, Hungary
- Gyula Pálfy
  - Laboratory of Structural Chemistry and Biology, Institute of Chemistry, Eötvös Loránd University, Pázmány Péter stny. 1/A, H-1117 Budapest, Hungary
  - ELKH-ELTE Protein Modeling Research Group, Eötvös Loránd Research Network (ELKH), Pázmány Péter stny. 1/A, H-1117 Budapest, Hungary
  - Department of Biology, Institute of Biochemistry, ETH Zürich, 8093 Zürich, Switzerland
- Dóra Karancsiné Menyhárd
  - Laboratory of Structural Chemistry and Biology, Institute of Chemistry, Eötvös Loránd University, Pázmány Péter stny. 1/A, H-1117 Budapest, Hungary
  - ELKH-ELTE Protein Modeling Research Group, Eötvös Loránd Research Network (ELKH), Pázmány Péter stny. 1/A, H-1117 Budapest, Hungary
- András Perczel
  - Laboratory of Structural Chemistry and Biology, Institute of Chemistry, Eötvös Loránd University, Pázmány Péter stny. 1/A, H-1117 Budapest, Hungary
  - ELKH-ELTE Protein Modeling Research Group, Eötvös Loránd Research Network (ELKH), Pázmány Péter stny. 1/A, H-1117 Budapest, Hungary

4
Chang L, Mondal A, MacCallum JL, Perez A. CryoFold 2.0: Cryo-EM Structure Determination with MELD. J Phys Chem A 2023; 127:3906-3913. [PMID: 37084537 DOI: 10.1021/acs.jpca.3c01731] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 04/23/2023]
Abstract
Cryo-electron microscopy data are becoming more prevalent and accessible at higher resolution levels, leading to the development of new computational tools to determine the atomic structure of macromolecules. However, while existing tools adapted from X-ray crystallography are suitable for the highest-resolution maps, new tools are needed for lower-resolution levels and to account for map heterogeneity. In this article, we introduce CryoFold 2.0, an integrative physics-based approach that combines Bayesian inference and the ability to handle multiple data sources with the molecular dynamics flexible fitting (MDFF) approach to determine the structures of macromolecules by using cryo-EM data. CryoFold 2.0 is incorporated into the MELD (modeling employing limited data) plugin, resulting in a pipeline that is more computationally efficient and accurate than running MELD or MDFF alone. The approach requires fewer computational resources and shorter simulation times than the original CryoFold, and it minimizes manual intervention. We demonstrate the effectiveness of the approach on eight different systems, highlighting its various benefits.
Affiliation(s)
- Liwei Chang
  - Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
- Arup Mondal
  - Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States
- Justin L MacCallum
  - Department of Chemistry, University of Calgary, Calgary, AB T2N 1N4, Canada
- Alberto Perez
  - Department of Chemistry and Quantum Theory Project, University of Florida, Gainesville, Florida 32611, United States

5
Wells NGM, Smith CA. Predicting binding affinity changes from long-distance mutations using molecular dynamics simulations and Rosetta. Proteins 2023. [PMID: 36757060 DOI: 10.1002/prot.26477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2022] [Revised: 01/20/2023] [Accepted: 02/07/2023] [Indexed: 02/10/2023]
Abstract
Computationally modeling how mutations affect protein-protein binding not only helps uncover the biophysics of protein interfaces, but also enables the redesign and optimization of protein interactions. Traditional high-throughput methods for estimating binding free energy changes are currently limited to mutations directly at the interface due to difficulties in accurately modeling how long-distance mutations propagate their effects through the protein structure. However, the modeling and design of such mutations is of substantial interest as it allows for greater control and flexibility in protein design applications. We have developed a method that combines high-throughput Rosetta-based side-chain optimization with conformational sampling using classical molecular dynamics simulations, finding significant improvements in our ability to accurately predict long-distance mutational perturbations to protein binding. Our approach uses an analytical framework grounded in alchemical free energy calculations while enabling exploration of a vastly larger sequence space. When comparing to experimental data, we find that our method can predict internal long-distance mutational perturbations with a level of accuracy similar to that of traditional methods in predicting the effects of mutations at the protein-protein interface. This work represents a new and generalizable approach to optimize protein free energy landscapes for desired biological functions.
Affiliation(s)
- Nicholas G M Wells
  - Department of Chemistry, Wesleyan University, Middletown, Connecticut, USA
- Colin A Smith
  - Department of Chemistry, Wesleyan University, Middletown, Connecticut, USA

6
Drake ZC, Seffernick JT, Lindert S. Protein complex prediction using Rosetta, AlphaFold, and mass spectrometry covalent labeling. Nat Commun 2022; 13:7846. [PMID: 36543826 PMCID: PMC9772387 DOI: 10.1038/s41467-022-35593-8] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 07/11/2022] [Accepted: 12/09/2022] [Indexed: 12/24/2022] Open
Abstract
Covalent labeling (CL) in combination with mass spectrometry can be used as an analytical tool to study and determine structural properties of protein-protein complexes. However, data from these experiments is sparse and does not unambiguously elucidate protein structure. Thus, computational algorithms are needed to deduce structure from the CL data. In this work, we present a hybrid method that combines models of protein complex subunits generated with AlphaFold with differential CL data via a CL-guided protein-protein docking in Rosetta. In a benchmark set, the RMSD (root-mean-square deviation) of the best-scoring models was below 3.6 Å for 5/5 complexes with inclusion of CL data, whereas the same quality was only achieved for 1/5 complexes without CL data. This study suggests that our integrated approach can successfully use data obtained from CL experiments to distinguish between nativelike and non-nativelike models.
Affiliation(s)
- Zachary C Drake
  - Department of Chemistry and Biochemistry, Ohio State University, Columbus, OH, 43210, US
- Justin T Seffernick
  - Department of Chemistry and Biochemistry, Ohio State University, Columbus, OH, 43210, US
- Steffen Lindert
  - Department of Chemistry and Biochemistry, Ohio State University, Columbus, OH, 43210, US

7
Turzo SMBA, Seffernick JT, Rolland AD, Donor MT, Heinze S, Prell JS, Wysocki VH, Lindert S. Protein shape sampled by ion mobility mass spectrometry consistently improves protein structure prediction. Nat Commun 2022; 13:4377. [PMID: 35902583 PMCID: PMC9334640 DOI: 10.1038/s41467-022-32075-9] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 07/27/2021] [Accepted: 07/14/2022] [Indexed: 11/09/2022] Open
Abstract
Ion mobility (IM) mass spectrometry provides structural information about protein shape and size in the form of an orientationally-averaged collision cross-section (CCS_IM). While IM data have been used with various computational methods, they have not yet been utilized to predict monomeric protein structure from sequence. Here, we show that IM data can significantly improve protein structure determination using the modelling suite Rosetta. We develop the Rosetta Projection Approximation using Rough Circular Shapes (PARCS) algorithm that allows for fast and accurate prediction of CCS_IM from structure. Following successful testing of the PARCS algorithm, we use an integrative modelling approach to utilize IM data for protein structure prediction. Additionally, we propose a confidence metric that identifies near native models in the absence of a known structure. The results of this study demonstrate the ability of IM data to consistently improve protein structure prediction.
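The idea of an orientationally averaged projected area can be illustrated with a generic Monte Carlo projection approximation. The Python sketch below is not the Rosetta PARCS algorithm: it treats every atom as a sphere, ignores the buffer-gas probe radius, and estimates the area of the union of projected circles by random sampling. All function names, default sample counts and radii are assumptions chosen for illustration.

```python
import math
import random


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _random_plane(rng):
    """Return two orthonormal vectors spanning a plane with a uniformly random normal."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    s = math.sqrt(max(0.0, 1.0 - z * z))
    n = (s * math.cos(phi), s * math.sin(phi), z)
    # Pick a helper axis that is not nearly parallel to the normal.
    axis = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = _cross(n, axis)
    norm = math.sqrt(_dot(u, u))
    u = (u[0] / norm, u[1] / norm, u[2] / norm)
    v = _cross(n, u)
    return u, v


def projected_area(coords, radii, u, v, rng, n_points):
    """Monte Carlo estimate of the area covered by the circles projected onto (u, v)."""
    circles = [(_dot(p, u), _dot(p, v), r) for p, r in zip(coords, radii)]
    xmin = min(x - r for x, _, r in circles)
    xmax = max(x + r for x, _, r in circles)
    ymin = min(y - r for _, y, r in circles)
    ymax = max(y + r for _, y, r in circles)
    hits = 0
    for _ in range(n_points):
        x = rng.uniform(xmin, xmax)
        y = rng.uniform(ymin, ymax)
        if any((x - cx) ** 2 + (y - cy) ** 2 <= r * r for cx, cy, r in circles):
            hits += 1
    return (xmax - xmin) * (ymax - ymin) * hits / n_points


def projection_ccs(coords, radii, n_orientations=50, n_points=2000, seed=0):
    """Average the projected area over random orientations."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_orientations):
        u, v = _random_plane(rng)
        total += projected_area(coords, radii, u, v, rng, n_points)
    return total / n_orientations
```

A quick sanity check is a single sphere of radius r, whose projected area is πr² in every orientation, so `projection_ccs([(0.0, 0.0, 0.0)], [2.0])` should land close to 4π. For a real structure the coordinates would come from a PDB file and the radii from an atom-type table, both outside the scope of this sketch.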
Affiliation(s)
- S M Bargeen Alam Turzo
  - Department of Chemistry and Biochemistry and Resource for Native Mass Spectrometry Guided Structural Biology, Ohio State University, Columbus, OH, 43210, USA
- Justin T Seffernick
  - Department of Chemistry and Biochemistry and Resource for Native Mass Spectrometry Guided Structural Biology, Ohio State University, Columbus, OH, 43210, USA
- Amber D Rolland
  - Department of Chemistry and Biochemistry and Materials Science Institute, University of Oregon, Eugene, OR, 97403, USA
- Micah T Donor
  - Department of Chemistry and Biochemistry and Materials Science Institute, University of Oregon, Eugene, OR, 97403, USA
- Sten Heinze
  - Department of Chemistry and Biochemistry and Resource for Native Mass Spectrometry Guided Structural Biology, Ohio State University, Columbus, OH, 43210, USA
- James S Prell
  - Department of Chemistry and Biochemistry and Materials Science Institute, University of Oregon, Eugene, OR, 97403, USA
- Vicki H Wysocki
  - Department of Chemistry and Biochemistry and Resource for Native Mass Spectrometry Guided Structural Biology, Ohio State University, Columbus, OH, 43210, USA
- Steffen Lindert
  - Department of Chemistry and Biochemistry and Resource for Native Mass Spectrometry Guided Structural Biology, Ohio State University, Columbus, OH, 43210, USA

8
Shekhar M, Terashi G, Gupta C, Sarkar D, Debussche G, Sisco NJ, Nguyen J, Mondal A, Vant J, Fromme P, Van Horn WD, Tajkhorshid E, Kihara D, Dill K, Perez A, Singharoy A. CryoFold: determining protein structures and data-guided ensembles from cryo-EM density maps. MATTER 2021; 4:3195-3216. [PMID: 35874311 PMCID: PMC9302471 DOI: 10.1016/j.matt.2021.09.004] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Indexed: 05/05/2023]
Abstract
Cryo-electron microscopy (EM) requires molecular modeling to refine structural details from data. Ensemble models arrive at low free-energy molecular structures, but are computationally expensive and limited to resolving only small proteins that cannot be resolved by cryo-EM. Here, we introduce CryoFold, a pipeline of molecular dynamics simulations that determines ensembles of protein structures directly from sequence by integrating density data of varying sparsity at 3-5 Å resolution with coarse-grained topological knowledge of the protein folds. We present six examples showing its broad applicability for folding proteins of 72 to 2000 residues, including large membrane and multi-domain systems, and results from two EMDB competitions. Driven by data from a single state, CryoFold discovers ensembles of common low-energy models together with rare low-probability structures that capture the equilibrium distribution of proteins constrained by the density maps. Many of these conformations, unseen by traditional methods, are experimentally validated and functionally relevant. We arrive at a set of best practices for data-guided protein folding that are controlled using a Python GUI.
Affiliation(s)
- Mrinal Shekhar
  - Center for Biophysics and Quantitative Biology, Department of Biochemistry, NIH Center for Macromolecular Modeling and Bioinformatics, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Chitrak Gupta
  - The School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
  - The Biodesign Institute Center for Structural Discovery, Arizona State University, Tempe, AZ 85281, USA
- Daipayan Sarkar
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
  - The School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
- Gaspard Debussche
  - Department of Mathematics and Computer Sciences, Grenoble INP, 38000 Grenoble, France
- Nicholas J Sisco
  - The School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
  - The Biodesign Institute Virginia G. Piper Center for Personalized Diagnostics, Arizona State University, Tempe, AZ 85281, USA
- Jonathan Nguyen
  - The School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
  - The Biodesign Institute Center for Structural Discovery, Arizona State University, Tempe, AZ 85281, USA
- Arup Mondal
  - Chemistry Department, Quantum Theory Project, University of Florida, Gainesville, Florida, 32611, USA
- John Vant
  - The School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
  - The Biodesign Institute Center for Structural Discovery, Arizona State University, Tempe, AZ 85281, USA
- Petra Fromme
  - The School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
  - The Biodesign Institute Center for Structural Discovery, Arizona State University, Tempe, AZ 85281, USA
- Wade D Van Horn
  - The School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
  - The Biodesign Institute Virginia G. Piper Center for Personalized Diagnostics, Arizona State University, Tempe, AZ 85281, USA
- Emad Tajkhorshid
  - Center for Biophysics and Quantitative Biology, Department of Biochemistry, NIH Center for Macromolecular Modeling and Bioinformatics, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, Illinois, 61801, USA
- Daisuke Kihara
  - Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
  - Department of Computer Science, Purdue University, West Lafayette, IN 47907, USA
- Ken Dill
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York 11794, United States
- Alberto Perez
  - Chemistry Department, Quantum Theory Project, University of Florida, Gainesville, Florida, 32611, USA
- Abhishek Singharoy
  - The School of Molecular Sciences, Arizona State University, Tempe, AZ 85287, USA
  - The Biodesign Institute Center for Structural Discovery, Arizona State University, Tempe, AZ 85281, USA

9
Palermo G, Sugita Y, Wriggers W, Amaro RE. Faces of Contemporary CryoEM Information and Modeling. J Chem Inf Model 2021; 60:2407-2409. [PMID: 32452204 DOI: 10.1021/acs.jcim.0c00481] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Affiliation(s)
- Giulia Palermo
  - Department of Bioengineering, University of California Riverside, Riverside, California 92521, United States
- Yuji Sugita
  - Theoretical Molecular Science Laboratory, RIKEN Cluster for Pioneering Research, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
  - Computational Biophysics Research Team, RIKEN Center for Computational Science, 7-1-26 Minatojima-Minamimachi, Chuo-ku, Kobe, Hyogo 650-0047, Japan
  - Laboratory for Biomolecular Function Simulation, RIKEN Center for Biosystems Dynamics Research, 1-6-5 Minatojima-Minamimachi, Chuo-ku, Kobe, Hyogo 650-0047, Japan
- Willy Wriggers
  - Department of Mechanical and Aerospace Engineering, Old Dominion University, Norfolk, Virginia 23529, United States
- Rommie E Amaro
  - Department of Chemistry and Biochemistry, University of California San Diego, San Diego, California 92093-0340, United States

10
Biehn SE, Limpikirati P, Vachet RW, Lindert S. Utilization of Hydrophobic Microenvironment Sensitivity in Diethylpyrocarbonate Labeling for Protein Structure Prediction. Anal Chem 2021; 93:8188-8195. [PMID: 34061512 DOI: 10.1021/acs.analchem.1c00395] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 12/19/2022]
Abstract
Diethylpyrocarbonate (DEPC) labeling analyzed with mass spectrometry can provide important insights into higher order protein structures. It has been previously shown that neighboring hydrophobic residues promote a local increase in DEPC concentration such that serine, threonine, and tyrosine residues are more likely to be labeled despite low solvent exposure. In this work, we developed a Rosetta algorithm that used the knowledge of labeled and unlabeled serine, threonine, and tyrosine residues and assessed their local hydrophobic environment to improve protein structure prediction. Additionally, DEPC-labeled histidine and lysine residues with higher relative solvent accessible surface area values (i.e., more exposed) were scored favorably. Application of our score term led to reductions of the root-mean-square deviations (RMSDs) of the lowest scoring models. Additionally, models that scored well tended to have lower RMSDs. A detailed tutorial describing our protocol and required command lines is included. Our work demonstrated the considerable potential of DEPC covalent labeling data to be used for accurate higher order structure determination.
Affiliation(s)
- Sarah E Biehn
  - Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, United States
- Patanachai Limpikirati
  - Department of Food and Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Chulalongkorn University, Bangkok 10330, Thailand
- Richard W Vachet
  - Department of Chemistry, University of Massachusetts, Amherst, Massachusetts 01003, United States
- Steffen Lindert
  - Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, United States

11
Marzolf DR, Seffernick JT, Lindert S. Protein Structure Prediction from NMR Hydrogen-Deuterium Exchange Data. J Chem Theory Comput 2021; 17:2619-2629. [PMID: 33780620 DOI: 10.1021/acs.jctc.1c00077] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Indexed: 12/17/2022]
Abstract
Amide hydrogen-deuterium exchange (HDX) has long been used to determine regional flexibility and binding sites in proteins; however, the data are too sparse for full structural characterization. Experiments that measure HDX rates, such as HDX-NMR, have far higher throughput compared to structure determination via X-ray crystallography, cryo-EM, or a full suite of NMR experiments. Data from HDX-NMR experiments encode information on the protein structure, making HDX a prime candidate to be supplemented by computational algorithms for protein structure prediction. We have developed a methodology to incorporate HDX-NMR data into ab initio protein structure prediction using the Rosetta software framework to predict structures based on experimental agreement. To demonstrate the efficacy of our algorithm, we examined 38 proteins with HDX-NMR data available, comparing the predicted model with and without the incorporation of HDX data into scoring. The root-mean-square deviation (rmsd, a measure of the average atomic distance between superimposed models) of the predicted model improved by 1.42 Å on average after incorporating the HDX-NMR data into scoring. The average rmsd improvement for the proteins where the selected model rmsd changed after incorporating HDX data was 3.63 Å, including one improvement of more than 11 Å and seven proteins improving by greater than 4 Å, with 12/15 proteins improving overall. Additionally, for independent verification, two proteins that were not part of the original benchmark were scored including HDX data, with a dramatic improvement of the selected model rmsd of nearly 9 Å for one of the proteins. Moreover, we have developed a confidence metric allowing us to successfully identify near-native models in the absence of a native structure. Improvement in model selection with a strong confidence measure demonstrates that protein structure prediction with HDX-NMR is a powerful tool which can be performed with minimal additional computational strain and expense.
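The rmsd used as the quality measure in this abstract can be written down in a few lines. The Python sketch below assumes the two coordinate sets are already superimposed and list the same atoms in the same order; the optimal superposition step is deliberately left out. The function name and the example coordinates are illustrative assumptions.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two pre-superimposed coordinate sets.

    Each argument is a sequence of (x, y, z) tuples in the same units
    (for example Å), with atoms listed in matching order.
    """
    if len(coords_a) != len(coords_b) or len(coords_a) == 0:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))
```

As a usage example with made-up coordinates, two atoms where only the second is displaced by 2 Å give `rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 0, 2)])`, which equals the square root of 2. An rmsd improvement of the kind reported above would then be the rmsd of the model selected without HDX data minus that of the model selected with it, both measured against the native structure.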
Affiliation(s)
- Daniel R Marzolf
  - Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, United States
- Justin T Seffernick
  - Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, United States
- Steffen Lindert
  - Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, United States

12
Zhang Y, Krieger J, Mikulska-Ruminska K, Kaynak B, Sorzano COS, Carazo JM, Xing J, Bahar I. State-dependent sequential allostery exhibited by chaperonin TRiC/CCT revealed by network analysis of Cryo-EM maps. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2021; 160:104-120. [PMID: 32866476 PMCID: PMC7914283 DOI: 10.1016/j.pbiomolbio.2020.08.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 12/17/2019] [Revised: 06/25/2020] [Accepted: 08/16/2020] [Indexed: 12/17/2022]
Abstract
The eukaryotic chaperonin TRiC/CCT plays a major role in assisting the folding of many proteins through an ATP-driven allosteric cycle. Recent structures elucidated by cryo-electron microscopy provide a broad view of the conformations visited at various stages of the chaperonin cycle, including a sequential activation of its subunits in response to nucleotide binding. But we lack a thorough mechanistic understanding of the structure-based dynamics and communication properties that underlie the TRiC/CCT machinery. In this study, we present a computational methodology based on elastic network models adapted to cryo-EM density maps to gain a deeper understanding of the structure-encoded allosteric dynamics of this hexadecameric machine. We have analysed several structures of the chaperonin resolved in different states toward mapping its conformational landscape. Our study indicates that the overall architecture intrinsically favours cooperative movements that comply with the structural variabilities observed in experiments. Furthermore, the individual subunits CCT1-CCT8 exhibit state-dependent sequential events at different states of the allosteric cycle. For example, in the ATP-bound state, subunits CCT5 and CCT4 selectively initiate the lid closure motions favoured by the overall architecture; whereas in the apo form of the heteromer, the subunit CCT7 exhibits the highest predisposition to structural change. The changes then propagate through parallel fluxes of allosteric signals to neighbours on both rings. The predicted state-dependent mechanisms of sequential activation provide new insights into TRiC/CCT intra- and inter-ring signal transduction events.
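Elastic network models of the kind mentioned in this abstract represent a structure as nodes joined by springs. As a hedged illustration, and not the authors' method (which adapts the model to cryo-EM density maps), the Python sketch below builds the connectivity (Kirchhoff) matrix of a Gaussian network model from node coordinates. The 7.0 Å cutoff, the function name and the toy coordinates are assumptions.

```python
import math


def kirchhoff_matrix(nodes, cutoff=7.0):
    """Build the Kirchhoff (connectivity) matrix of a Gaussian network model.

    nodes  : sequence of (x, y, z) node positions, e.g. pseudo-atoms
    cutoff : nodes closer than this distance are joined by a spring (assumed value)

    Off-diagonal entries are -1 for connected pairs and 0 otherwise;
    each diagonal entry is the number of contacts of that node.
    """
    n = len(nodes)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(nodes[i], nodes[j]) <= cutoff:
                gamma[i][j] = -1
                gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma
```

For three nodes on a line spaced 5 Å apart, only neighbours are connected, so the middle node has two contacts and the end nodes one each. In a full analysis the low-frequency eigenvectors of this matrix would describe the collective motions; that step needs a linear algebra library and is omitted here.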
Affiliation(s)
- Yan Zhang
  - Department of Computational and Systems Biology, University of Pittsburgh, 800 Murdoch Building, 3420 Forbes Avenue, Pittsburgh, PA, 15261, USA
- James Krieger
  - Department of Computational and Systems Biology, University of Pittsburgh, 800 Murdoch Building, 3420 Forbes Avenue, Pittsburgh, PA, 15261, USA
- Karolina Mikulska-Ruminska
  - Department of Computational and Systems Biology, University of Pittsburgh, 800 Murdoch Building, 3420 Forbes Avenue, Pittsburgh, PA, 15261, USA
- Burak Kaynak
  - Department of Computational and Systems Biology, University of Pittsburgh, 800 Murdoch Building, 3420 Forbes Avenue, Pittsburgh, PA, 15261, USA
- José-María Carazo
  - Centro Nacional de Biotecnología (CSIC), Darwin, 3, 28049, Madrid, Spain
- Jianhua Xing
  - Department of Computational and Systems Biology, University of Pittsburgh, 800 Murdoch Building, 3420 Forbes Avenue, Pittsburgh, PA, 15261, USA
- Ivet Bahar
  - Department of Computational and Systems Biology, University of Pittsburgh, 800 Murdoch Building, 3420 Forbes Avenue, Pittsburgh, PA, 15261, USA
| |
|
13
|
Antila HS, Ferreira TM, Ollila OHS, Miettinen MS. Using Open Data to Rapidly Benchmark Biomolecular Simulations: Phospholipid Conformational Dynamics. J Chem Inf Model 2021; 61:938-949. [PMID: 33496579 PMCID: PMC7903423 DOI: 10.1021/acs.jcim.0c01299] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 11/09/2020] [Indexed: 01/08/2023]
Abstract
Molecular dynamics (MD) simulations are widely used to monitor time-resolved motions of biomacromolecules, although it often remains unknown how closely the conformational dynamics correspond to those occurring in real life. Here, we used a large set of open-access MD trajectories of phosphatidylcholine (PC) lipid bilayers to benchmark the conformational dynamics in several contemporary MD models (force fields) against nuclear magnetic resonance (NMR) data available in the literature: effective correlation times and spin-lattice relaxation rates. We found none of the tested MD models to fully reproduce the conformational dynamics. That said, the dynamics in CHARMM36 and Slipids are more realistic than in the Amber Lipid14, OPLS-based MacRog, and GROMOS-based Berger force fields, whose sampling of the glycerol backbone conformations is too slow. The performance of CHARMM36 persists when cholesterol is added to the bilayer, and when the hydration level is reduced. However, for conformational dynamics of the PC headgroup, both with and without cholesterol, Slipids provides the most realistic description because CHARMM36 overestimates the relative weight of ∼1 ns processes in the headgroup dynamics. We stress that not a single new simulation was run for the present work. This demonstrates the worth of open-access MD trajectory databanks for the indispensable step of any serious MD study: benchmarking the available force fields. We believe this proof of principle will inspire other novel applications of MD trajectory databanks and thus aid in developing biomolecular MD simulations into a true computational microscope, not only for lipid membranes but for all biomacromolecular systems.
Affiliation(s)
- Hanne S. Antila
- Department of Theory and Bio-Systems, Max Planck Institute of Colloids and Interfaces, 14424 Potsdam, Germany
| | - Tiago M. Ferreira
- NMR Group-Institute for Physics, Martin-Luther University Halle-Wittenberg, 06120 Halle (Saale), Germany
| | | | - Markus S. Miettinen
- Department of Theory and Bio-Systems, Max Planck Institute of Colloids and Interfaces, 14424 Potsdam, Germany
| |
|
14
|
Pesek M, Juvan A, Jakoš J, Košmrlj J, Marolt M, Gazvoda M. Database Independent Automated Structure Elucidation of Organic Molecules Based on IR, 1H NMR, 13C NMR, and MS Data. J Chem Inf Model 2021; 61:756-763. [PMID: 33378192 PMCID: PMC7903418 DOI: 10.1021/acs.jcim.0c01332] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 11/16/2020] [Indexed: 01/21/2023]
Abstract
Herein, we report a computational algorithm that follows a spectroscopist-driven elucidation process of the structure of an organic molecule based on IR, 1H and 13C NMR, and MS tabular data. The algorithm is independent from database searching and is based on a bottom-up approach, building the molecular structure from small structural fragments visible in spectra. It employs an analytical combinatorial approach with a graph search technique to determine the connectivity of structural fragments that is based on the analysis of the NMR spectra, to connect the identified structural fragments into a molecular structure. After the process is completed, the interface lists the compound candidates, which are visualized by the WolframAlpha computational knowledge engine within the interface. The candidates are ranked according to the predefined rules for analyzing the spectral data. The developed elucidator has a user-friendly web interface and is publicly available (http://schmarnica.si).
Affiliation(s)
- Matevž Pesek
- Faculty of Computer and Information Science, University of Ljubljana, Večna Pot 113, SI-1000 Ljubljana, Slovenia
| | - Andraž Juvan
- Faculty of Computer and Information Science, University of Ljubljana, Večna Pot 113, SI-1000 Ljubljana, Slovenia
| | - Jure Jakoš
- Faculty of Chemistry and Chemical Technology, University of Ljubljana, Večna Pot 113, SI-1000 Ljubljana, Slovenia
| | - Janez Košmrlj
- Faculty of Chemistry and Chemical Technology, University of Ljubljana, Večna Pot 113, SI-1000 Ljubljana, Slovenia
| | - Matija Marolt
- Faculty of Computer and Information Science, University of Ljubljana, Večna Pot 113, SI-1000 Ljubljana, Slovenia
| | - Martin Gazvoda
- Faculty of Chemistry and Chemical Technology, University of Ljubljana, Večna Pot 113, SI-1000 Ljubljana, Slovenia
| |
|
15
|
Biehn SE, Lindert S. Accurate protein structure prediction with hydroxyl radical protein footprinting data. Nat Commun 2021; 12:341. [PMID: 33436604 PMCID: PMC7804018 DOI: 10.1038/s41467-020-20549-7] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 07/27/2020] [Accepted: 12/08/2020] [Indexed: 01/10/2023] Open
Abstract
Hydroxyl radical protein footprinting (HRPF) in combination with mass spectrometry reveals the relative solvent exposure of labeled residues within a protein, thereby providing insight into protein tertiary structure. HRPF labels nineteen residues with varying degrees of reliability and reactivity. Here, we present a dynamics-driven HRPF-guided algorithm for protein structure prediction. In a benchmark test of our algorithm, use of the dynamics data in a score term resulted in notable improvement of the root-mean-square deviations of the lowest-scoring ab initio models and improved the funnel-like metric Pnear for all benchmark proteins. We identified models with accurate atomic detail for three of the four benchmark proteins. This work suggests that HRPF data along with side chain dynamics sampled by a Rosetta mover ensemble can be used to accurately predict protein structure.
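The abstract does not define the funnel-like metric Pnear. The sketch below assumes one commonly used definition, in which each model's Boltzmann weight is multiplied by a Gaussian "nearness" to the reference structure: Pnear = Σ exp(−rmsd²/λ²) exp(−E/kT) / Σ exp(−E/kT). The function name `pnear` and the default values of `lam` and `kt` are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def pnear(rmsd, energy, lam=2.0, kt=1.0):
    """Funnel quality of a set of models, between 0 (no funnel) and 1.

    rmsd:   RMSD of each model to the reference structure.
    energy: score of each model (lower is better).
    lam:    assumed RMSD scale below which a model counts as near-native.
    kt:     assumed energy scale controlling how sharply low scores dominate.
    """
    rmsd = np.asarray(rmsd, dtype=float)
    energy = np.asarray(energy, dtype=float)
    # Shift energies for numerical stability; the shift cancels in the ratio.
    boltzmann = np.exp(-(energy - energy.min()) / kt)
    nearness = np.exp(-(rmsd ** 2) / lam ** 2)
    return float(np.sum(nearness * boltzmann) / np.sum(boltzmann))

# Hypothetical scores: the lowest-energy model is near-native in the first
# set and far from native in the second.
funnel = pnear(rmsd=[0.5, 10.0], energy=[-10.0, 0.0])
no_funnel = pnear(rmsd=[10.0, 0.5], energy=[-10.0, 0.0])
```

Under this definition a value close to 1 means the lowest-scoring models cluster near the reference structure, so `funnel` comes out much larger than `no_funnel`.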
Affiliation(s)
- Sarah E Biehn
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, OH, 43210, USA
| | - Steffen Lindert
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, OH, 43210, USA.
| |
|
16
|
Seffernick JT, Lindert S. Hybrid methods for combined experimental and computational determination of protein structure. J Chem Phys 2020; 153:240901. [PMID: 33380110 PMCID: PMC7773420 DOI: 10.1063/5.0026025] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 08/20/2020] [Accepted: 11/10/2020] [Indexed: 02/04/2023] Open
Abstract
Knowledge of protein structure is paramount to the understanding of biological function, developing new therapeutics, and making detailed mechanistic hypotheses. Therefore, methods to accurately elucidate three-dimensional structures of proteins are in high demand. While a few experimental techniques, such as x-ray crystallography, nuclear magnetic resonance (NMR), and cryo-EM, can routinely provide high-resolution protein structures, each has shortcomings and thus cannot be used in all cases. In addition, a large number of experimental techniques have been developed that provide some structural information, but not enough to assign atomic positions with high certainty. These methods offer sparse experimental data, which can also be noisy and inaccurate in some instances. In cases where it is not possible to determine the structure of a protein experimentally, computational structure prediction methods can be used as an alternative. Although computational methods can be performed without any experimental data in a large number of studies, inclusion of sparse experimental data into these prediction methods has yielded significant improvement. In this Perspective, we cover many of the successes of integrative modeling, computational modeling with experimental data, specifically for protein folding, protein-protein docking, and molecular dynamics simulations. We describe methods that incorporate sparse data from cryo-EM, NMR, mass spectrometry, electron paramagnetic resonance, small-angle x-ray scattering, Förster resonance energy transfer, and genetic sequence covariation. Finally, we highlight some of the major challenges in the field as well as possible future directions.
Affiliation(s)
- Justin T. Seffernick
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, USA
| | - Steffen Lindert
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, Ohio 43210, USA
| |
|
17
|
Kim DN, Gront D, Sanbonmatsu KY. Practical Considerations for Atomistic Structure Modeling with Cryo-EM Maps. J Chem Inf Model 2020; 60:2436-2442. [PMID: 32422044 PMCID: PMC7891309 DOI: 10.1021/acs.jcim.0c00090] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 11/30/2022]
Abstract
We describe common approaches to atomistic structure modeling with single particle analysis derived cryo-EM maps. Several strategies for atomistic model building and atomistic model fitting methods are discussed, including selection criteria and implementation procedures. In covering basic concepts and caveats, this short perspective aims to help facilitate active discussion between scientists at different levels with diverse backgrounds.
Affiliation(s)
- Doo Nam Kim
- Computational Biology Team, Biological Science Division, Pacific Northwest National Laboratory, Richland, Washington, 99354, United States
| | - Dominik Gront
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
| | - Karissa Y. Sanbonmatsu
- Theoretical Biology and Biophysics Group, Los Alamos National Laboratory, Los Alamos, New Mexico, 87545, United States
- New Mexico Consortium, Los Alamos, New Mexico, 87544, United States
| |
|