1
Omar H, Hein A, Cole CA, Valafar H. Concurrent Identification and Characterization of Protein Structure and Continuous Internal Dynamics with REDCRAFT. Front Mol Biosci 2022; 9:806584. [PMID: 35187082 PMCID: PMC8856112 DOI: 10.3389/fmolb.2022.806584] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/01/2021] [Accepted: 01/10/2022] [Indexed: 11/13/2022] Open
Abstract
Internal dynamics can play a critical role in the biological function of some proteins. Several well-documented instances have been reported, such as MBP, DHFR, hTS, DGCR8, and NSP1 of the SARS-CoV family of viruses. Despite the importance of internal dynamics, there are currently very few approaches that allow meaningful separation of internal dynamics from structural aspects using experimental data. Here we present a computational approach named REDCRAFT that allows for concurrent characterization of protein structure and dynamics. We subjected DHFR (PDB-ID 1RX2), a 159-residue protein, to a fictitious, mixed-mode model of internal dynamics. In this simulation, DHFR was segmented into 7 regions: 4 of the fragments were fixed with respect to each other, two regions underwent rigid-body dynamics, and one region experienced an uncorrelated melting event. The two segments undergoing rigid-body dynamics experienced average orientational modifications of 7° and 12°, respectively. Observable RDC data for backbone C′-N, N-HN, and C′-HN were generated from 102 uniformly sampled frames that described the molecular trajectory. The structure calculation of DHFR with REDCRAFT using traditional Ramachandran restraints produced a structure with 29 Å of structural difference measured over the backbone atoms (bb-rmsd) over the entire length of the protein and an average bb-rmsd of more than 4.7 Å over each of the dynamical fragments. The same exercise repeated with context-specific dihedral restraints generated by PDBMine produced a structure with a bb-rmsd of 21 Å over the entire length of the protein but with a bb-rmsd of less than 3 Å over each of the fragments. Finally, utilization of the Dynamic Profile generated by REDCRAFT allowed for the identification of different dynamical regions of the protein and the recovery of individual fragments with a bb-rmsd of less than 1 Å.
Following the recovery of the fragments, our assembly procedure of domains (larger segments consisting of multiple fragments with a common dynamical profile) correctly assembled the four fragments that are rigid with respect to each other, categorized the two domains that underwent rigid-body dynamics, and identified one dynamical region for which no conserved structure could be defined. In conclusion, our approach was successful in identifying the dynamical domains, recovering structure where it is meaningful, and assembling the domains relative to each other when possible.
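The bb-rmsd values quoted in this abstract compare backbone coordinates after superposition. The following is a minimal sketch of that kind of calculation, assuming two matched arrays of backbone atom coordinates and using the standard Kabsch superposition; the function name `kabsch_rmsd` is hypothetical, and the abstract does not state which superposition procedure the authors used.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two matched (N, 3) coordinate sets after optimal superposition.

    Illustrative only: assumes rows of P and Q refer to the same backbone atoms.
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape or P.ndim != 2 or P.shape[1] != 3:
        raise ValueError("P and Q must both have shape (N, 3)")

    # Remove the translational component by centering both sets.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)

    # Kabsch: SVD of the covariance matrix gives the optimal rotation.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)

    # Flip the last axis if needed so the result is a proper rotation (det = +1).
    d = np.sign(np.linalg.det(U @ Vt))
    R = U @ np.diag([1.0, 1.0, d]) @ Vt

    diff = Pc @ R - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))
```

A rotated and translated copy of a structure gives an RMSD of essentially zero, while a mirror image does not, because only proper rotations are allowed.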
2
REDCRAFT: A computational platform using residual dipolar coupling NMR data for determining structures of perdeuterated proteins in solution. PLoS Comput Biol 2021; 17:e1008060. [PMID: 33524015 PMCID: PMC7877757 DOI: 10.1371/journal.pcbi.1008060] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 06/11/2020] [Revised: 02/11/2021] [Accepted: 01/05/2021] [Indexed: 01/10/2023] Open
Abstract
Nuclear Magnetic Resonance (NMR) spectroscopy is one of the three primary experimental means of characterizing macromolecular structures, including protein structures. Structure determination by solution NMR spectroscopy has traditionally relied heavily on distance restraints derived from nuclear Overhauser effect (NOE) measurements. While structure determination of proteins from NOE-based restraints is well understood and broadly used, structure determination from Residual Dipolar Couplings (RDCs) is relatively less well developed. Here, we describe the new features of the protein structure modeling program REDCRAFT and focus on the new Adaptive Decimation (AD) feature. The AD plays a critical role in improving the robustness of REDCRAFT to missing or noisy data, while allowing structure determination of larger proteins from less data. In this report we demonstrate the successful application of REDCRAFT in structure determination of proteins ranging in size from 50 to 145 residues using experimentally collected data, and of larger proteins (145 to 573 residues) using simulated RDC data. In both cases, REDCRAFT uses only RDC data that can be collected from perdeuterated proteins. Finally, we compare the accuracy of structure determination from RDCs alone with traditional NOE-based methods for the structurally novel PF.2048.1 protein. The RDC-based structure of PF.2048.1 exhibited 1.0 Å BB-RMSD with respect to a high-quality NOE-based structure. Although optimal strategies would include using RDC data together with chemical shift, NOE, and other NMR data, these studies provide proof-of-principle for robust structure determination of largely-perdeuterated proteins from RDC data alone using REDCRAFT. Residual Dipolar Couplings have the potential to improve the accuracy and reduce the time needed to characterize protein structures. 
In addition, RDC data have been demonstrated to concurrently elucidate the structure of proteins, provide assignment of resonances, and characterize the internal dynamics of proteins. Despite these advantages, statistics provided by the Protein Data Bank (PDB) show that, surprisingly, only 124 proteins (out of nearly 150,000) have utilized RDCs as part of their structure determination. An even smaller subset of these proteins (approximately 7) has utilized RDCs as the primary source of data for structure determination. One key factor limiting the use of RDCs is the challenging computational and analytical aspects of this source of data. In this report, we demonstrate the success of the REDCRAFT software package in structure determination of proteins using RDC data that can be collected from small and large proteins in a routine fashion. REDCRAFT accomplishes the challenging task of structure determination from RDCs by introducing a unique search and optimization technique that is both robust and computationally tractable. Structure determination from routinely collectable RDC data using REDCRAFT can complement existing methods to provide faster and more accurate studies of larger and more complex protein structures by solution-state NMR spectroscopy.
3
Cole C, Parks C, Rachele J, Valafar H. Increased usability, algorithmic improvements and incorporation of data mining for structure calculation of proteins with REDCRAFT software package. BMC Bioinformatics 2020; 21:204. [PMID: 33272215 PMCID: PMC7712608 DOI: 10.1186/s12859-020-3522-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 04/16/2020] [Accepted: 04/29/2020] [Indexed: 02/08/2023] Open
Abstract
Background: Traditional approaches to elucidation of protein structures by Nuclear Magnetic Resonance spectroscopy (NMR) rely on distance restraints derived from Nuclear Overhauser effects (NOEs). The use of NOEs as the primary source of structure determination by NMR spectroscopy is time consuming and expensive. Residual Dipolar Couplings (RDCs) have become an alternate approach for structure calculation by NMR spectroscopy. In previous works, the software package REDCRAFT has been presented as a means of harnessing the information contained in RDCs for structure calculation of proteins. However, to meet its full potential, several improvements to REDCRAFT must be made.
Results: In this work, we present improvements to REDCRAFT that include increased usability, better interoperability, and a more robust core algorithm. We have demonstrated the impact of the improved core algorithm in the successful folding of the protein 1A1Z with as much as ±4 Hz of added error. The REDCRAFT-computed structure from the highly corrupted data was within 1.0 Å of the X-ray structure. We have also demonstrated the interoperability of REDCRAFT in a few instances, including with PDBMine, to reduce the amount of data required for successful folding of proteins to unprecedented levels. Here we have demonstrated the successful folding of the protein 1D3Z (to within 2.4 Å of the X-ray structure) using only N-H RDCs from one alignment medium.
Conclusions: The additional GUI features of REDCRAFT combined with the NEF compliance have significantly increased the flexibility and usability of this software package. The improvements to the core algorithm have substantially improved the robustness of REDCRAFT when working with experimental data of lower quality and quantity.
Affiliation(s)
- Casey Cole
- Department of Computer Science and Engineering, University of South Carolina, M. Bert Storey Engineering and Innovation Center, 550 Assembly St, Columbia, SC, 29201, USA
- Caleb Parks
- Department of Computer Science and Engineering, University of South Carolina, M. Bert Storey Engineering and Innovation Center, 550 Assembly St, Columbia, SC, 29201, USA
- Julian Rachele
- Department of Computer Science and Engineering, University of South Carolina, M. Bert Storey Engineering and Innovation Center, 550 Assembly St, Columbia, SC, 29201, USA
- Homayoun Valafar
- Department of Computer Science and Engineering, University of South Carolina, M. Bert Storey Engineering and Innovation Center, 550 Assembly St, Columbia, SC, 29201, USA.
4
Sala D, Huang YJ, Cole CA, Snyder DA, Liu G, Ishida Y, Swapna GVT, Brock KP, Sander C, Fidelis K, Kryshtafovych A, Inouye M, Tejero R, Valafar H, Rosato A, Montelione GT. Protein structure prediction assisted with sparse NMR data in CASP13. Proteins 2019; 87:1315-1332. [PMID: 31603581 DOI: 10.1002/prot.25837] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 09/09/2019] [Revised: 09/26/2019] [Accepted: 09/27/2019] [Indexed: 01/05/2023]
Abstract
CASP13 has investigated the impact of sparse NMR data on the accuracy of protein structure prediction. NOESY and 15N-1H residual dipolar coupling data, typical of that obtained for 15N,13C-enriched, perdeuterated proteins up to about 40 kDa, were simulated for 11 CASP13 targets ranging in size from 80 to 326 residues. For several targets, two prediction groups generated models that are more accurate than those produced using baseline methods. Real NMR data collected for a de novo designed protein were also provided to predictors, including one data set in which only backbone resonance assignments were available. Some NMR-assisted prediction groups also did very well with these data. CASP13 also assessed whether incorporation of sparse NMR data improves the accuracy of protein structure prediction relative to nonassisted regular methods. In most cases, incorporation of sparse, noisy NMR data results in models with higher accuracy. The best NMR-assisted models were also compared with the best regular predictions of any CASP13 group for the same target. For six of 13 targets, the most accurate model provided by any NMR-assisted prediction group was more accurate than the most accurate model provided by any regular prediction group; however, for the remaining seven targets, one or more regular prediction methods provided a more accurate model than even the best NMR-assisted model. These results suggest a novel approach for protein structure determination, in which advanced prediction methods are first used to generate structural models, and sparse NMR data are then used to validate and/or refine these models.
Affiliation(s)
- Davide Sala
- Magnetic Resonance Center, University of Florence, Sesto Fiorentino, Italy; Department of Chemistry, University of Florence, Sesto Fiorentino, Italy
- Yuanpeng Janet Huang
- Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey; Department of Chemistry and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York
- Casey A Cole
- Department of Computer Science & Engineering, University of South Carolina, Columbia, South Carolina
- David A Snyder
- Department of Chemistry, College of Science and Health, William Paterson University, Wayne, New Jersey
- Gaohua Liu
- Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey; Nexomics Biosciences, Bordentown, New Jersey
- Yojiro Ishida
- Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey; Department of Biochemistry and Molecular Biology, The Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, New Jersey
- G V T Swapna
- Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey
- Kelly P Brock
- Department of Systems Biology, Harvard Medical School, Boston, Massachusetts
- Chris Sander
- Department of Cell Biology, Harvard Medical School, Boston, Massachusetts; cBio Center, Dana-Farber Cancer Institute, Boston, Massachusetts
- Masayori Inouye
- Department of Biochemistry and Molecular Biology, The Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, New Jersey
- Roberto Tejero
- Departamento de Quimica Fisica, Universidad de Valencia, Valencia, Spain
- Homayoun Valafar
- Department of Computer Science & Engineering, University of South Carolina, Columbia, South Carolina
- Antonio Rosato
- Magnetic Resonance Center, University of Florence, Sesto Fiorentino, Italy; Department of Chemistry, University of Florence, Sesto Fiorentino, Italy
- Gaetano T Montelione
- Center for Advanced Biotechnology and Medicine, and Department of Molecular Biology and Biochemistry, Rutgers, The State University of New Jersey, Piscataway, New Jersey; Department of Chemistry and Center for Biotechnology and Interdisciplinary Studies, Rensselaer Polytechnic Institute, Troy, New York; Department of Biochemistry and Molecular Biology, The Robert Wood Johnson Medical School, Rutgers, The State University of New Jersey, Piscataway, New Jersey
5
Cole CA, Mukhopadhyay R, Omar H, Hennig M, Valafar H. Structure Calculation and Reconstruction of Discrete-State Dynamics from Residual Dipolar Couplings. J Chem Theory Comput 2016; 12:1408-22. [PMID: 26984680 DOI: 10.1021/acs.jctc.5b01091] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/16/2022]
Abstract
Residual dipolar couplings (RDCs) acquired by nuclear magnetic resonance (NMR) spectroscopy are an indispensable source of information in investigation of molecular structures and dynamics. Here, we present a comprehensive strategy for structure calculation and reconstruction of discrete-state dynamics from RDC data that is based on the singular value decomposition (SVD) method of order tensor estimation. In addition to structure determination, we provide a mechanism of producing an ensemble of conformations for the dynamical regions of a protein from RDC data. The developed methodology has been tested on simulated RDC data with ±1 Hz of error from an 83-residue α protein (PDB ID 1A1Z) and a 213-residue α/β protein DGCR8 (PDB ID 2YT4). In nearly all instances, our method reproduced the structure of the protein including the conformational ensemble to within less than 2 Å. On the basis of our investigations, arc motions with more than 30° of rotation are identified as internal dynamics and are reconstructed with sufficient accuracy. Furthermore, states with relative occupancies above 20% are consistently recognized and reconstructed successfully. Arc motions with a magnitude of 15° or relative occupancy of less than 10% are consistently unrecognizable as dynamical regions within the context of ±1 Hz of error.
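The SVD method of order tensor estimation mentioned above can be illustrated with a short sketch. It assumes the common linear model in which each RDC equals a constant `dmax` times v^T S v, where v is the unit internuclear vector and S is a symmetric, traceless 3x3 order tensor with five independent elements. The function names and the single shared `dmax` are assumptions made for illustration and are not taken from the paper or from REDCRAFT.

```python
import numpy as np


def _unit(vectors):
    """Normalize an (N, 3) array of internuclear vectors to unit length."""
    v = np.asarray(vectors, dtype=float)
    return v / np.linalg.norm(v, axis=1, keepdims=True)


def design_matrix(vectors):
    """Rows relate (Syy, Szz, Sxy, Sxz, Syz) to v^T S v, using Sxx = -Syy - Szz."""
    v = _unit(vectors)
    x, y, z = v[:, 0], v[:, 1], v[:, 2]
    return np.column_stack([y**2 - x**2, z**2 - x**2, 2*x*y, 2*x*z, 2*y*z])


def estimate_order_tensor(vectors, rdcs, dmax=1.0):
    """Least-squares order tensor from RDCs via the SVD-based pseudo-inverse."""
    A = design_matrix(vectors)
    s = np.linalg.pinv(A) @ (np.asarray(rdcs, dtype=float) / dmax)
    syy, szz, sxy, sxz, syz = s
    return np.array([[-syy - szz, sxy, sxz],
                     [sxy, syy, syz],
                     [sxz, syz, szz]])


def back_calculate(vectors, S, dmax=1.0):
    """Predict RDCs from a structure (its vectors) and an order tensor."""
    v = _unit(vectors)
    return dmax * np.einsum("ij,jk,ik->i", v, S, v)
```

Comparing `back_calculate` output with the measured RDCs gives a fitness score for a candidate structure; with noise-free data and at least five well-distributed vectors, the estimated tensor matches the one used to generate the data.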
Affiliation(s)
- Casey A Cole
- Department of Computer Science & Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
- Rishi Mukhopadhyay
- Department of Computer Science & Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
- Hanin Omar
- Department of Computer Science & Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
- Mirko Hennig
- Nutrition Research Institute, University of North Carolina at Chapel Hill, Kannapolis, North Carolina 27514, United States
- Homayoun Valafar
- Department of Computer Science & Engineering, University of South Carolina, Columbia, South Carolina 29208, United States
6
Andrałojć W, Berlin K, Fushman D, Luchinat C, Parigi G, Ravera E, Sgheri L. Information content of long-range NMR data for the characterization of conformational heterogeneity. J Biomol NMR 2015; 62:353-71. [PMID: 26044033 PMCID: PMC4782772 DOI: 10.1007/s10858-015-9951-6] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 03/03/2015] [Accepted: 05/25/2015] [Indexed: 05/16/2023]
Abstract
Long-range NMR data, namely residual dipolar couplings (RDCs) from external alignment and paramagnetic data, are becoming increasingly popular for the characterization of conformational heterogeneity of multidomain biomacromolecules and protein complexes. The question addressed here is how much information is contained in these averaged data. We have analyzed and compared the information content of conformationally averaged RDCs caused by steric alignment and of both RDCs and pseudocontact shifts caused by paramagnetic alignment, and found that, despite the substantial differences, they contain a similar amount of information. Furthermore, using several synthetic tests we find that both sets of data are equally good towards recovering the major state(s) in conformational distributions.
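To make "conformationally averaged" concrete: under fast exchange, each observed RDC is usually modeled as the population-weighted mean of the values that each conformer would give on its own. The sketch below assumes that model; the function name `averaged_rdcs` and the example numbers are hypothetical and do not come from the paper.

```python
import numpy as np


def averaged_rdcs(per_state_rdcs, weights):
    """Population-weighted average of per-conformer RDCs.

    per_state_rdcs: (K, M) array, one row of M couplings per conformer.
    weights: K non-negative populations that sum to 1.
    """
    D = np.asarray(per_state_rdcs, dtype=float)
    w = np.asarray(weights, dtype=float)
    if D.ndim != 2 or w.shape != (D.shape[0],):
        raise ValueError("need one weight per conformer (row)")
    if np.any(w < 0) or not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must be non-negative and sum to 1")
    return w @ D


# Two conformers, two couplings each (values in Hz, made up for illustration).
observed = averaged_rdcs([[10.0, -4.0], [2.0, 4.0]], [0.75, 0.25])
```

Many different ensembles can produce the same averages, which is why the amount of information carried by such averaged data is the question the abstract addresses.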
Affiliation(s)
- Witold Andrałojć
- Center for Magnetic Resonance (CERM), University of Florence, Via L. Sacconi 6, 50019, Sesto Fiorentino, Italy
- Konstantin Berlin
- Department of Chemistry and Biochemistry, Center for Biomolecular Structure and Organization, University of Maryland, College Park, MD 20742, USA
- David Fushman
- Department of Chemistry and Biochemistry, Center for Biomolecular Structure and Organization, University of Maryland, College Park, MD 20742, USA
- Corresponding authors: David Fushman, Claudio Luchinat
- Claudio Luchinat
- Center for Magnetic Resonance (CERM), University of Florence, Via L. Sacconi 6, 50019, Sesto Fiorentino, Italy
- Department of Chemistry "Ugo Schiff", University of Florence, Via della Lastruccia 3, 50019, Sesto Fiorentino, Italy
- Corresponding authors: David Fushman, Claudio Luchinat
- Giacomo Parigi
- Center for Magnetic Resonance (CERM), University of Florence, Via L. Sacconi 6, 50019, Sesto Fiorentino, Italy
- Department of Chemistry "Ugo Schiff", University of Florence, Via della Lastruccia 3, 50019, Sesto Fiorentino, Italy
- Enrico Ravera
- Center for Magnetic Resonance (CERM), University of Florence, Via L. Sacconi 6, 50019, Sesto Fiorentino, Italy
- Department of Chemistry "Ugo Schiff", University of Florence, Via della Lastruccia 3, 50019, Sesto Fiorentino, Italy
- Luca Sgheri
- Istituto per le Applicazioni del Calcolo, Sezione di Firenze, CNR, Via Madonna del Piano 10, 50019 Sesto Fiorentino, Italy