1
Fowler NJ, Sljoka A, Williamson MP. A method for validating the accuracy of NMR protein structures. Nat Commun 2020; 11:6321. [PMID: 33339822 PMCID: PMC7749147 DOI: 10.1038/s41467-020-20177-1] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 09/25/2020] [Accepted: 11/13/2020] [Indexed: 01/13/2023]
Abstract
We present a method that measures the accuracy of NMR protein structures. It compares random coil index [RCI] against local rigidity predicted by mathematical rigidity theory, calculated from NMR structures [FIRST], using a correlation score (which assesses secondary structure), and an RMSD score (which measures overall rigidity). We test its performance using: structures refined in explicit solvent, which are much better than unrefined structures; decoy structures generated for 89 NMR structures; and conventional predictors of accuracy such as number of restraints per residue, restraint violations, energy of structure, ensemble RMSD, Ramachandran distribution, and clashscore. Restraint violations and RMSD are poor measures of accuracy. Comparisons of NMR to crystal structures show that secondary structure is equally accurate, but crystal structures are typically too rigid in loops, whereas NMR structures are typically too floppy overall. We show that the method is a useful addition to existing measures of accuracy.
Affiliation(s)
- Nicholas J Fowler
- Dept of Molecular Biology and Biotechnology, University of Sheffield, Sheffield, UK
- Adnan Sljoka
- RIKEN Center for Advanced Intelligence Project, RIKEN, 1-4-1 Nihombashi, Chuo-ku, Tokyo, 103-0027, Japan.
- Dept of Chemistry, University of Toronto, UTM, 3359 Mississauga Road North, Mississauga, ON, L5L 1C6, Canada.
- Mike P Williamson
- Dept of Molecular Biology and Biotechnology, University of Sheffield, Sheffield, UK.
2
De Sanctis S, Wenzler M, Kröger N, Malloni WM, Sumper M, Deutzmann R, Zadravec P, Brunner E, Kremer W, Kalbitzer HR. PSCD Domains of Pleuralin-1 from the Diatom Cylindrotheca fusiformis: NMR Structures and Interactions with Other Biosilica-Associated Proteins. Structure 2016; 24:1178-91. [PMID: 27320836 DOI: 10.1016/j.str.2016.04.021] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 12/20/2015] [Revised: 04/27/2016] [Accepted: 04/27/2016] [Indexed: 10/21/2022]
Abstract
Diatoms are eukaryotic unicellular algae characterized by silica cell walls and associated with three unique protein families, the pleuralins, frustulins, and silaffins. The NMR structure of the PSCD4 domain of pleuralin-1 from Cylindrotheca fusiformis contains only three short helical elements and is stabilized by five unique disulfide bridges. PSCD4 contains two binding sites for Ca²⁺ ions with millimolar affinity. NMR-based interaction studies show an interaction of the domain with native silaffin-1A as well as with α-frustulins. The interaction sites of the two proteins mapped on the PSCD4 structure are contiguous and show only a small overlap. A plausible functional role of pleuralin could be to bind simultaneously silaffin-1A located inside the cell wall and α-frustulin coating the cell wall, thus connecting the interfaces between hypotheca and epitheca at the girdle bands. Restrained molecular dynamics calculations suggest a bead-chain-like structure of the central part of pleuralin-1.
Affiliation(s)
- Silvia De Sanctis
- Institute of Biophysics and Physical Biochemistry, Centre of Magnetic Resonance in Chemistry and Biomedicine, University of Regensburg, 93040 Regensburg, Germany
- Michael Wenzler
- Institute of Biophysics and Physical Biochemistry, Centre of Magnetic Resonance in Chemistry and Biomedicine, University of Regensburg, 93040 Regensburg, Germany; Bruker BioSpin AG, 8117 Fällanden, Switzerland
- Nils Kröger
- Institute of Biochemistry, Microbiology and Genetics, University of Regensburg, 93040 Regensburg, Germany; Department of Chemistry and Food Chemistry, B CUBE Center for Molecular Bioengineering, TU Dresden, 01307 Dresden, Germany
- Wilhelm M Malloni
- Institute of Biophysics and Physical Biochemistry, Centre of Magnetic Resonance in Chemistry and Biomedicine, University of Regensburg, 93040 Regensburg, Germany
- Manfred Sumper
- Institute of Biochemistry, Microbiology and Genetics, University of Regensburg, 93040 Regensburg, Germany
- Rainer Deutzmann
- Institute of Biochemistry, Microbiology and Genetics, University of Regensburg, 93040 Regensburg, Germany
- Patrick Zadravec
- Institute of Biophysics and Physical Biochemistry, Centre of Magnetic Resonance in Chemistry and Biomedicine, University of Regensburg, 93040 Regensburg, Germany
- Eike Brunner
- Institute of Biophysics and Physical Biochemistry, Centre of Magnetic Resonance in Chemistry and Biomedicine, University of Regensburg, 93040 Regensburg, Germany; Bioanalytical Chemistry, Department of Chemistry and Food Chemistry, TU Dresden, 01062 Dresden, Germany
- Werner Kremer
- Institute of Biophysics and Physical Biochemistry, Centre of Magnetic Resonance in Chemistry and Biomedicine, University of Regensburg, 93040 Regensburg, Germany
- Hans Robert Kalbitzer
- Institute of Biophysics and Physical Biochemistry, Centre of Magnetic Resonance in Chemistry and Biomedicine, University of Regensburg, 93040 Regensburg, Germany.
3
Vranken WF. NMR structure validation in relation to dynamics and structure determination. Prog Nucl Magn Reson Spectrosc 2014; 82:27-38. [PMID: 25444697 DOI: 10.1016/j.pnmrs.2014.08.001] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 07/01/2014] [Revised: 08/14/2014] [Accepted: 08/14/2014] [Indexed: 06/04/2023]
Abstract
NMR spectroscopy is a key technique for understanding the behaviour of proteins, especially highly dynamic proteins that adopt multiple conformations in solution. Overall, protein structures determined from NMR spectroscopy data constitute just over 10% of the Protein Data Bank archive. This review covers the validation of these NMR protein structures, but rather than describing currently available methodology, it focuses on concepts that are important for understanding where and how validation is most relevant. First, the inherent characteristics of the protein under study have an influence on quality and quantity of the distinct types of data that can be acquired from NMR experiments. Second, these NMR data are necessarily transformed into a model for use in a structure calculation protocol, and the protein structures that result from this reflect the types of NMR data used as well as the protein characteristics. The validation of NMR protein structures should therefore take account, wherever possible, of the inherent behavioural characteristics of the protein, the types of available NMR data, and the calculation protocol. These concepts are discussed in the context of 'knowledge based' and 'model versus data' validation, with suggestions for questions to ask and different validation categories to consider. The principal aim of this review is to stimulate discussion and to help the reader understand the relationships between the above elements in order to make informed decisions on which validation approaches are the most relevant in particular cases.
Affiliation(s)
- Wim F Vranken
- Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium; Department of Structural Biology, VIB, 1050 Brussels, Belgium; Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, La Plaine Campus, Triomflaan, BC Building, 6th Floor, CP 263, 1050 Brussels, Belgium.
4
Rosato A, Tejero R, Montelione GT. Quality assessment of protein NMR structures. Curr Opin Struct Biol 2013; 23:715-24. [PMID: 24060334 DOI: 10.1016/j.sbi.2013.08.005] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.4] [Received: 08/13/2013] [Accepted: 08/14/2013] [Indexed: 10/26/2022]
Abstract
Biomolecular NMR structures are now routinely used in biology, chemistry, and bioinformatics. Methods and metrics for assessing the accuracy and precision of protein NMR structures are beginning to be standardized across the biological NMR community. These include both knowledge-based assessment metrics, parameterized from the database of protein structures, and model versus data assessment metrics. Online servers are available that provide comprehensive protein structure quality assessment reports, and efforts are in progress by the world-wide Protein Data Bank (wwPDB) to develop a biomolecular NMR structure quality assessment pipeline as part of the structure deposition process. These quality assessment metrics and standards will aid NMR spectroscopists in determining more accurate structures, and increase the value and utility of these structures for the broad scientific community.
Affiliation(s)
- Antonio Rosato
- Magnetic Resonance Center and Department of Chemistry, University of Florence, 50019 Sesto Fiorentino, Italy
5
Berjanskii M, Zhou J, Liang Y, Lin G, Wishart DS. Resolution-by-proxy: a simple measure for assessing and comparing the overall quality of NMR protein structures. J Biomol NMR 2012; 53:167-180. [PMID: 22678091 DOI: 10.1007/s10858-012-9637-2] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Received: 11/02/2011] [Accepted: 02/15/2012] [Indexed: 06/01/2023]
Abstract
In protein X-ray crystallography, resolution is often used as a good indicator of structural quality. Diffraction resolution of protein crystals correlates well with the number of X-ray observables that are used in structure generation and, therefore, with protein coordinate errors. In protein NMR, there is no parameter identical to X-ray resolution. Instead, resolution is often used as a synonym of NMR model quality. Resolution of NMR structures is often deduced from ensemble precision, torsion angle normality and number of distance restraints per residue. The lack of common techniques to assess the resolution of X-ray and NMR structures complicates the comparison of structures solved by these two methods. This problem is sometimes approached by calculating "equivalent resolution" from structure quality metrics. However, existing protocols do not offer a comprehensive assessment of protein structure as they calculate equivalent resolution from a relatively small number (<5) of protein parameters. Here, we report a development of a protocol that calculates equivalent resolution from 25 measurable protein features. This new method offers better performance (correlation coefficient of 0.92, mean absolute error of 0.28 Å) than existing predictors of equivalent resolution. Because the method uses coordinate data as a proxy for X-ray diffraction data, we call this measure "Resolution-by-Proxy" or ResProx. We demonstrate that ResProx can be used to identify under-restrained, poorly refined or inaccurate NMR structures, and can discover structural defects that the other equivalent resolution methods cannot detect. The ResProx web server is available at http://www.resprox.ca.
Affiliation(s)
- Mark Berjanskii
- Department of Computing Science, University of Alberta, Edmonton, AB, Canada
6
Abstract
Around half of all protein structures now solved by solution-state nuclear magnetic resonance (NMR) spectroscopy are obtained with automated data analysis. The pervasiveness of computational approaches in general hides, however, a more nuanced view in which the full variety and richness of the field appears. This review is structured around a comparison of methods associated with three NMR observables: classical nuclear Overhauser effect (NOE) constraint gathering in contrast with more recent chemical shift and residual dipolar coupling (RDC) based protocols. In each case, the emphasis is placed on the latest research, covering mainly the past 5 years. By describing both general concepts and representative programs, the objective is to map out a field in which, through the very profusion of approaches, it is all too easy to lose one's bearings.
7
Abstract
The main drawback of protein NMR spectroscopy today is still the extensive amount of time required for solving a single structure. The main bottleneck in this respect is the manual evaluation of the experimental spectra. A clear solution to this challenge is the development of automated methods for this purpose. At the current stage of development, this goal has been almost or in a few cases fully reached for favorable cases such as well-behaved, stably folding smaller proteins below the 25 kDa range. For larger and/or more difficult molecules, the input of a human expert is still required. However, even here, automated routines will substantially speed up the structure determination process. In this report, we will summarize recent developments in this field and especially emphasize practical aspects important for a successful automated protein structure determination in solution. An important aspect closely related to structure determination is structure validation. Therefore, we devote a section to automated approaches for this topic.
Affiliation(s)
- Wolfram Gronwald
- Institute for Biophysics and Physical Biochemistry, University of Regensburg, Regensburg, Germany
8
Cano C, Brunner K, Baskaran K, Elsner R, Munte CE, Kalbitzer HR. Protein structure calculation with data imputation: the use of substitute restraints. J Biomol NMR 2009; 45:397-411. [PMID: 19838807 DOI: 10.1007/s10858-009-9379-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 08/09/2009] [Accepted: 09/17/2009] [Indexed: 05/28/2023]
Abstract
The number of experimental restraints (e.g., NOEs) is often too small for calculating high-quality three-dimensional structures by restrained molecular dynamics. Considering this as a typical missing value problem, we propose here a model-based data imputation technique that should lead to an improved estimation of the correct structure. The novel automated method implemented in AUREMOL makes more efficient use of the experimental information to obtain NMR structures with higher accuracy. It creates a large set of substitute restraints that are used either alone or together with the experimental restraints. The new approach was successfully tested on three examples: the Ras-binding domain of Byr2 from Schizosaccharomyces pombe, the mutant HPr (H15A) from Staphylococcus aureus, and an X-ray structure of human ubiquitin. In all three examples, the quality of the resulting final bundles was improved considerably by the use of additional substitute restraints, as assessed quantitatively by the calculation of RMSD values to the "true" structure and NMR R-factors directly calculated from the original NOESY spectra or the published diffraction data.
Affiliation(s)
- Carolina Cano
- Institut für Biophysik und physikalische Biochemie, University of Regensburg, Universitätstr. Regensburg, Germany
9
Schneider M, Fu X, Keating AE. X-ray vs. NMR structures as templates for computational protein design. Proteins 2009; 77:97-110. [PMID: 19422060 DOI: 10.1002/prot.22421] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.6] [Indexed: 11/07/2022]
Abstract
Certain protein-design calculations involve using an experimentally determined high-resolution structure as a template to identify new sequences that can adopt the same fold. This approach has led to the successful design of many novel, well-folded, native-like proteins. Although any atomic-resolution structure can serve as a template in such calculations, most successful designs have used high-resolution crystal structures. Because there are many proteins for which crystal structures are not available, it is of interest whether nuclear magnetic resonance (NMR) templates are also appropriate. We have analyzed differences between using X-ray and NMR templates in side-chain repacking and design calculations. We assembled a database of 29 proteins for which both a high-resolution X-ray structure and an ensemble of NMR structures are available. Using these pairs, we compared the rotamericity, χ₁-angle recovery, and native-sequence recovery of X-ray and NMR templates. We carried out design using RosettaDesign on both types of templates, and compared the energies and packing qualities of the resulting structures. Overall, the X-ray structures were better templates for use with Rosetta. However, for approximately 20% of proteins, a member of the reported NMR ensemble gave rise to designs with similar properties. Re-evaluating RosettaDesign structures with other energy functions indicated much smaller differences between the two types of templates. Ultimately, experiments are required to confirm the utility of particular X-ray and NMR templates. But our data suggest that the lack of a high-resolution X-ray structure should not preclude attempts at computational design if an NMR ensemble is available.
10
Baskaran K, Kirchhöfer R, Huber F, Trenner J, Brunner K, Gronwald W, Neidig KP, Kalbitzer HR. Chemical shift optimization in multidimensional NMR spectra by AUREMOL-SHIFTOPT. J Biomol NMR 2009; 43:197-210. [PMID: 19234673 DOI: 10.1007/s10858-009-9304-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/08/2008] [Accepted: 12/18/2008] [Indexed: 05/27/2023]
Abstract
A problem often encountered in multidimensional NMR-spectroscopy is that an existing chemical shift list of a protein has to be used to assign an experimental spectrum but does not fit sufficiently well for a safe assignment. A similar problem occurs when temperature or pressure series of n-dimensional spectra are to be evaluated automatically. We have developed two different algorithms, AUREMOL-SHIFTOPT1 and AUREMOL-SHIFTOPT2 that fulfill this task. In the present contribution their performance is analyzed employing a set of simulated and experimental two-dimensional and three-dimensional spectra obtained from three different proteins. A new z-score based on atom and amino acid specific chemical shift distributions is introduced to weight the chemical shift contributions in different dimensions properly.
Affiliation(s)
- Kumaran Baskaran
- Department of Biophysics and Physical Biochemistry, University of Regensburg, Postfach, 93040, Regensburg, Federal Republic of Germany
11
Gronwald W, Bomke J, Maurer T, Domogalla B, Huber F, Schumann F, Kremer W, Fink F, Rysiok T, Frech M, Kalbitzer HR. Structure of the Leech Protein Saratin and Characterization of Its Binding to Collagen. J Mol Biol 2008; 381:913-27. [DOI: 10.1016/j.jmb.2008.06.034] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Received: 03/13/2008] [Revised: 06/09/2008] [Accepted: 06/11/2008] [Indexed: 11/25/2022]
12
Cui F, Jernigan R, Wu Z. Knowledge-based versus experimentally acquired distance and angle constraints for NMR structure refinement. J Bioinform Comput Biol 2008; 6:283-300. [PMID: 18464323 DOI: 10.1142/s0219720008003448] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/16/2007] [Revised: 10/30/2007] [Accepted: 11/17/2007] [Indexed: 11/18/2022]
Abstract
Nuclear Overhauser effect (NOE) distance constraints and torsion angle constraints are major conformational constraints for nuclear magnetic resonance (NMR) structure refinement. In particular, the number of NOE constraints has been considered an important determinant of the quality of NMR structures. Of course, the availability of torsion angle constraints is also critical for the formation of correct local conformations. In our recent work, we have shown how a set of knowledge-based short-range distance constraints can also be utilized for NMR structure refinement, as a complementary set of conformational constraints to the NOE and torsion angle constraints. In this paper, we show the results from a series of structure refinement experiments using different types of conformational constraints (NOE, torsion angle, or knowledge-based constraints) or their combinations, and make a quantitative assessment of how the experimentally acquired constraints contribute to the quality of structural models and whether or not they can be combined with or substituted by the knowledge-based constraints. We have carried out the experiments on a small set of NMR structures. Our preliminary calculations have revealed that the torsion angle constraints contribute substantially to the quality of the structures, but need to be combined with the NOE constraints to be fully effective. The knowledge-based constraints can be functionally as crucial as the torsion angle constraints, although they are statistical constraints after all and are not meant to be able to replace the latter.
Affiliation(s)
- Feng Cui
- Laboratory of Cell Biology, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA
13
Andrec M, Snyder DA, Zhou Z, Young J, Montelione GT, Levy RM. A large data set comparison of protein structures determined by crystallography and NMR: statistical test for structural differences and the effect of crystal packing. Proteins 2007; 69:449-65. [PMID: 17623851 DOI: 10.1002/prot.21507] [Citation(s) in RCA: 93] [Impact Index Per Article: 5.5] [Indexed: 10/23/2022]
Abstract
The existence of a large number of proteins for which both nuclear magnetic resonance (NMR) and X-ray crystallographic coordinates have been deposited into the Protein Data Bank (PDB) makes the statistical comparison of the corresponding crystal and NMR structural models over a large data set possible, and facilitates the study of the effect of the crystal environment and other factors on structure. We present an approach for detecting statistically significant structural differences between crystal and NMR structural models which is based on structural superposition and the analysis of the distributions of atomic positions relative to a mean structure. We apply this to a set of 148 protein structure pairs (crystal vs NMR), and analyze the results in terms of methodological and physical sources of structural difference. For every one of the 148 structure pairs, the backbone root-mean-square distance (RMSD) over core atoms of the crystal structure to the mean NMR structure is larger than the average RMSD of the members of the NMR ensemble to the mean, with 76% of the structure pairs having an RMSD of the crystal structure to the mean more than a factor of two larger than the average RMSD of the NMR ensemble. On average, the backbone RMSD over core atoms of crystal structure to the mean NMR is approximately 1 Å. If non-core atoms are included, this increases to 1.4 Å due to the presence of variability in loops and similar regions of the protein. The observed structural differences are only weakly correlated with the age and quality of the structural model and differences in conditions under which the models were determined. We examine steric clashes when a putative crystalline lattice is constructed using a representative NMR structure, and find that repulsive crystal packing plays a minor role in the observed differences between crystal and NMR structures. The observed structural differences likely have a combination of physical and methodological causes.
Stabilizing attractive interactions arising from intermolecular crystal contacts which shift the equilibrium of the crystal structure relative to the NMR structure is a likely physical source which can account for some of the observed differences. Methodological sources of apparent structural difference include insufficient sampling or other issues which could give rise to errors in the estimates of the precision and/or accuracy.
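The comparison described in this abstract (crystal-to-mean RMSD versus the average member-to-mean RMSD of the NMR ensemble) can be sketched in a few lines. This is only an illustration, not the authors' procedure: it assumes all structures are already superposed and share the same atom ordering, the function names are hypothetical, and the two-atom coordinates are made up.

```python
import math


def mean_structure(ensemble):
    """Average the coordinates atom by atom over all models."""
    n_models = len(ensemble)
    return [
        tuple(sum(model[i][k] for model in ensemble) / n_models for k in range(3))
        for i in range(len(ensemble[0]))
    ]


def rmsd(coords_a, coords_b):
    """Root-mean-square distance between two equally sized coordinate lists."""
    squared = sum(
        (a[k] - b[k]) ** 2 for a, b in zip(coords_a, coords_b) for k in range(3)
    )
    return math.sqrt(squared / len(coords_a))


def crystal_to_ensemble_ratio(nmr_ensemble, crystal):
    """Crystal-to-mean RMSD divided by the average member-to-mean RMSD."""
    mean = mean_structure(nmr_ensemble)
    ensemble_rmsd = sum(rmsd(m, mean) for m in nmr_ensemble) / len(nmr_ensemble)
    return rmsd(crystal, mean) / ensemble_rmsd


# Toy, pre-superposed two-atom models (coordinates in Å, invented for illustration).
nmr_models = [
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)],
    [(0.2, 0.0, 0.0), (3.8, 0.2, 0.0)],
]
crystal_model = [(0.1, 0.5, 0.0), (3.8, 0.6, 0.0)]
ratio = crystal_to_ensemble_ratio(nmr_models, crystal_model)
print(f"crystal-to-mean RMSD is {ratio:.1f}x the ensemble RMSD")
```

A ratio above 2 corresponds to the "more than a factor of two" case reported for 76% of the structure pairs; a real analysis would first superpose the structures and restrict the sum to core backbone atoms.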
Affiliation(s)
- Michael Andrec
- BioMaPS Institute for Quantitative Biology, Northeast Structural Genomics Consortium and Department of Chemistry and Chemical Biology, The State University of New Jersey, Piscataway, New Jersey 08854, USA
14
Bhattacharya A, Tejero R, Montelione GT. Evaluating protein structures determined by structural genomics consortia. Proteins 2007; 66:778-95. [PMID: 17186527 DOI: 10.1002/prot.21165] [Citation(s) in RCA: 580] [Impact Index Per Article: 34.1] [Indexed: 11/10/2022]
Abstract
Structural genomics projects are providing large quantities of new 3D structural data for proteins. To monitor the quality of these data, we have developed the protein structure validation software suite (PSVS), for assessment of protein structures generated by NMR or X-ray crystallographic methods. PSVS is broadly applicable for structure quality assessment in structural biology projects. The software integrates under a single interface analyses from several widely-used structure quality evaluation tools, including PROCHECK (Laskowski et al., J Appl Crystallog 1993;26:283-291), MolProbity (Lovell et al., Proteins 2003;50:437-450), Verify3D (Luthy et al., Nature 1992;356:83-85), ProsaII (Sippl, Proteins 1993;17: 355-362), the PDB validation software, and various structure-validation tools developed in our own laboratory. PSVS provides standard constraint analyses, statistics on goodness-of-fit between structures and experimental data, and knowledge-based structure quality scores in standardized format suitable for database integration. The analysis provides both global and site-specific measures of protein structure quality. Global quality measures are reported as Z scores, based on calibration with a set of high-resolution X-ray crystal structures. PSVS is particularly useful in assessing protein structures determined by NMR methods, but is also valuable for assessing X-ray crystal structures or homology models. Using these tools, we assessed protein structures generated by the Northeast Structural Genomics Consortium and other international structural genomics projects, over a 5-year period. Protein structures produced from structural genomics projects exhibit quality score distributions similar to those of structures produced in traditional structural biology projects during the same time period. 
However, while some NMR structures have structure quality scores similar to those seen in higher-resolution X-ray crystal structures, the majority of NMR structures have lower scores. Potential reasons for this "structure quality score gap" between NMR and X-ray crystal structures are discussed.
Affiliation(s)
- Aneerban Bhattacharya
- Center for Advanced Biotechnology and Medicine, Northeast Structural Genomics Consortium, Rutgers University and Robert Wood Johnson Medical School, Piscataway, New Jersey 08854, USA
15
Gronwald W, Brunner K, Kirchhöfer R, Trenner J, Neidig KP, Kalbitzer HR. AUREMOL-RFAC-3D, combination of R-factors and their use for automated quality assessment of protein solution structures. J Biomol NMR 2007; 37:15-30. [PMID: 17136423 DOI: 10.1007/s10858-006-9096-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 05/18/2006] [Accepted: 09/20/2006] [Indexed: 05/12/2023]
Abstract
We present here the computer program AUREMOL-RFAC-3D, a generalization of the previously published program RFAC for the fully automated estimation of residual indices (R-factors) from 2D NOESY spectra. It is part of the larger AUREMOL software package (www.auremol.de). RFAC-3D calculates R-factors directly from two-dimensional homonuclear NOESY spectra as well as from three-dimensional ¹⁵N- or ¹³C-edited NOESY-HSQC spectra and thus extends the application range to larger proteins. The fully automated method includes automated peak picking and integration, a Bayesian noise and artifact recognition and the use of the complete relaxation matrix formalism. To enhance the reliability of the calculated R-factors, the method is also generalized to calculate combined R-factors from a set of 2D and 3D spectra. For an optimal combination of the information derived from different sources a plausible formalism had to be derived. In addition, we present a novel, direct R-factor-based measure that correlates an R-factor as defined in this paper with the root mean square deviation of the actual structure from the optimal structure. The new program has been successfully tested on the histidine containing phosphocarrier protein (HPr) from Staphylococcus carnosus and on the Ras-binding domain (RBD) of the Ral guanine-nucleotide dissociation stimulation factor (RalGDS).
Affiliation(s)
- Wolfram Gronwald
- Department of Biophysics and Physical Biochemistry, University of Regensburg, Universitätsstr.31, D-93040, Regensburg, Federal Republic of Germany
16
Brunner K, Gronwald W, Trenner JM, Neidig KP, Kalbitzer HR. A general method for the unbiased improvement of solution NMR structures by the use of related X-ray data, the AUREMOL-ISIC algorithm. BMC Struct Biol 2006; 6:14. [PMID: 16800891 PMCID: PMC1559696 DOI: 10.1186/1472-6807-6-14] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 03/11/2006] [Accepted: 06/26/2006] [Indexed: 11/11/2022]
Abstract
Background: Rapid and accurate three-dimensional structure determination of biological macromolecules is mandatory to keep up with the vast progress made in the identification of primary sequence information. During the last few years the amount of data deposited in the protein data bank has substantially increased providing additional information for novel structure determination projects. The key question is how to combine the available database information with the experimental data of the current project ensuring that only relevant information is used and a correct structural bias is produced. For this purpose a novel fully automated algorithm based on Bayesian reasoning has been developed. It allows the combination of structural information from different sources in a consistent way to obtain high quality structures with a limited set of experimental data. The new ISIC (Intelligent Structural Information Combination) algorithm is part of the larger AUREMOL software package. Results: Our new approach was successfully tested on the improvement of the solution NMR structures of the Ras-binding domain of Byr2 from Schizosaccharomyces pombe, the Ras-binding domain of RalGDS from human calculated from a limited set of NMR data, and the immunoglobulin binding domain from protein G from Streptococcus by their corresponding X-ray structures. In all test cases clearly improved structures were obtained. The largest danger in using data from other sources is a possible bias towards the added structure. In the worst case instead of a refined target structure the structure from the additional source is essentially reproduced. We could clearly show that the ISIC algorithm treats these difficulties properly. Conclusion: In summary, we present a novel fully automated method to combine strongly coupled knowledge from different sources.
The combination with validation tools such as the calculation of NMR R-factors strengthens the impact of the method considerably since the improvement of the structures can be assessed quantitatively. The ISIC method can be applied to a large number of similar problems where the quality of the obtained three-dimensional structures is limited by the available experimental data like the improvement of large NMR structures calculated from sparse experimental data or the refinement of low resolution X-ray structures. Also structures may be refined using other available structural information such as homology models.
Affiliation(s)
- Konrad Brunner
- Department of Biophysics and Physical Biochemistry, University of Regensburg, Postfach, D-93040 Regensburg, Federal Republic of Germany
- Wolfram Gronwald
- Department of Biophysics and Physical Biochemistry, University of Regensburg, Postfach, D-93040 Regensburg, Federal Republic of Germany
- Jochen M Trenner
- Department of Biophysics and Physical Biochemistry, University of Regensburg, Postfach, D-93040 Regensburg, Federal Republic of Germany
- Klaus-Peter Neidig
- Bruker BioSpin GmbH, Software Department, Silberstreifen 4, D-76287 Rheinstetten, Federal Republic of Germany
- Hans Robert Kalbitzer
- Department of Biophysics and Physical Biochemistry, University of Regensburg, Postfach, D-93040 Regensburg, Federal Republic of Germany
|
17
|
Zagrovic B, van Gunsteren WF. Comparing atomistic simulation data with the NMR experiment: how much can NOEs actually tell us? Proteins 2006; 63:210-8. [PMID: 16425239 DOI: 10.1002/prot.20872] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.4] [Indexed: 11/10/2022]
Abstract
Simulated molecular dynamics trajectories of proteins and nucleic acids are often compared with nuclear magnetic resonance (NMR) data for the purposes of assessing the quality of the force field used or, equally important, trying to interpret ambiguous experimental data. In particular, nuclear Overhauser enhancement (NOE) intensities or atom-atom distances derived from them are frequently calculated from the simulated ensembles because the distance restraints derived from NOEs are the key ingredient in NMR-based protein structure determination. In this study, we ask how diverse and nonnative-like an ensemble of structures can be and still match the experimental NOE distance upper bounds well. We present two examples in which simulated ensembles of highly nonnative polypeptide structures (an unfolded state ensemble of the villin headpiece and a high-temperature denatured ensemble of lysozyme) are shown to match fairly well the experimental NOE distance upper bounds from which the corresponding native structures were derived. For example, the unfolded ensemble of villin headpiece, which is on average 0.90 ± 0.13 nm root-mean-square deviation away from the native NMR structure, deviates from the experimental restraints by only 0.027 nm on average. However, this artificially good agreement is largely a consequence of (1) the highly nonlinear effects of r^-6 (or r^-3) averaging and (2) focusing only on the experimentally observed set of NOE bounds. Namely, in addition to the experimentally observed NOEs, both simulated ensembles (especially the villin ensemble) also predict a large number of NOEs, which are not seen in the experiment. If these are taken into account, the agreement between simulation and experiment gets markedly worse, as it should, given the nonnative nature of the underlying simulated ensembles. In light of the examples given, we conclude that comparing experimental NOE distance restraints with large simulated ensembles provides just by itself only limited information about the quality of simulation.
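The nonlinear averaging effect mentioned in this abstract can be illustrated with a small sketch. It is not taken from the paper: the function names, the ten-frame ensemble, and the distances are hypothetical, and the averaging form used, the ensemble mean of r^-6 raised to the power -1/6, is the commonly used definition rather than the authors' exact protocol.

```python
def effective_distance(distances_nm, power=6):
    """Return <r^-p>^(-1/p) over an ensemble of distances for one atom pair."""
    if not distances_nm:
        raise ValueError("need at least one distance")
    mean_inverse_power = sum(r ** -power for r in distances_nm) / len(distances_nm)
    return mean_inverse_power ** (-1.0 / power)


def upper_bound_violation(distances_nm, upper_bound_nm, power=6):
    """Violation of an NOE upper bound by the averaged distance (0 if satisfied)."""
    return max(0.0, effective_distance(distances_nm, power) - upper_bound_nm)


# Hypothetical ensemble: the pair is close (0.3 nm) in 1 frame out of 10
# and far apart (1.5 nm) in the other 9.
ensemble = [0.3] + [1.5] * 9
arithmetic_mean = sum(ensemble) / len(ensemble)
r6_mean = effective_distance(ensemble)
print(f"arithmetic mean: {arithmetic_mean:.2f} nm, r^-6 average: {r6_mean:.2f} nm")
```

In this toy case the arithmetic mean distance is 1.38 nm, yet the r^-6 average is about 0.44 nm, so a 0.5 nm upper bound counts as satisfied even though the pair is far apart in nine frames out of ten.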
Affiliation(s)
- Bojan Zagrovic
- Department of Chemistry and Applied Biosciences, ETH Hönggerberg, Zürich, Switzerland
|
18
|
Nabuurs SB, Spronk CAEM, Vuister GW, Vriend G. Traditional biomolecular structure determination by NMR spectroscopy allows for major errors. PLoS Comput Biol 2006; 2:e9. [PMID: 16462939 PMCID: PMC1359070 DOI: 10.1371/journal.pcbi.0020009] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.9] [Received: 09/28/2005] [Accepted: 12/29/2005] [Indexed: 12/04/2022] Open
Abstract
One of the major goals of structural genomics projects is to determine the three-dimensional structure of representative members of as many different fold families as possible. Comparative modeling is expected to fill the remaining gaps by providing structural models of homologs of the experimentally determined proteins. However, for such an approach to be successful it is essential that the quality of the experimentally determined structures is adequate. In an attempt to build a homology model for the protein dynein light chain 2A (DLC2A) we found two potential templates, both experimentally determined nuclear magnetic resonance (NMR) structures originating from structural genomics efforts. Despite their high sequence identity (96%), the folds of the two structures are markedly different. This urged us to perform in-depth analyses of both structure ensembles and the deposited experimental data, the results of which clearly identify one of the two models as largely incorrect. Next, we analyzed the quality of a large set of recent NMR-derived structure ensembles originating from both structural genomics projects and individual structure determination groups. Unfortunately, a visual inspection of structures exhibiting lower quality scores than DLC2A reveals that the seriously flawed DLC2A structure is not an isolated incident. Overall, our results illustrate that the quality of NMR structures cannot be reliably evaluated using only traditional experimental input data and overall quality indicators as a reference and clearly demonstrate the urgent need for a tight integration of more sophisticated structure validation tools in NMR structure determination projects. In contrast to common methodologies where structures are typically evaluated as a whole, such tools should preferentially operate on a per-residue basis.
Three-dimensional biomolecular structures provide an invaluable source of biologically relevant information. To be able to learn the most of the wealth of information that these structures can provide us, it is of great importance that the quality and accuracy of the protein structure models deposited in the Protein Data Bank are as high as possible. In this work, the authors describe an analysis that illustrates that this is unfortunately not the case for many protein structures solved using nuclear magnetic resonance spectroscopy. They present an example in which two strikingly different models describing the same protein are analyzed using commonly available structure validation tools, and the results of this analysis show one of the two models to be incorrect. Subsequently, using a large set of recently determined structures, the authors demonstrate that unfortunately this example does not stand on its own. The analyses and examples clearly illustrate that relying solely on the experimental data to evaluate structural quality can provide a false sense of correctness and the combination of multiple sophisticated structure validation tools is required to detect the presence of errors in protein nuclear magnetic resonance structures.
Affiliation(s)
- Sander B Nabuurs
- Center for Molecular and Biomolecular Informatics, Nijmegen Center for Molecular Life Sciences, Radboud University, Nijmegen, Netherlands
- Chris A. E. M Spronk
- Center for Molecular and Biomolecular Informatics, Nijmegen Center for Molecular Life Sciences, Radboud University, Nijmegen, Netherlands
- Geerten W Vuister
- Department of Biophysical Chemistry, Institute for Molecules and Materials, Radboud University, Nijmegen, Netherlands
- * To whom correspondence should be addressed. E-mail: (GWV); (GV)
- Gert Vriend
- Center for Molecular and Biomolecular Informatics, Nijmegen Center for Molecular Life Sciences, Radboud University, Nijmegen, Netherlands
- * To whom correspondence should be addressed. E-mail: (GWV); (GV)
|
19
|
Habeck M, Rieping W, Nilges M. Weighting of experimental evidence in macromolecular structure determination. Proc Natl Acad Sci U S A 2006; 103:1756-61. [PMID: 16446450 PMCID: PMC1413624 DOI: 10.1073/pnas.0506412103] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.7] [Indexed: 11/18/2022] Open
Abstract
The determination of macromolecular structures requires weighting of experimental evidence relative to prior physical information. Although it can critically affect the quality of the calculated structures, experimental data are routinely weighted on an empirical basis. At present, cross-validation is the most rigorous method to determine the best weight. We describe a general method to adaptively weight experimental data in the course of structure calculation. It is further shown that the necessity to define weights for the data can be completely alleviated. We demonstrate the method on a structure calculation from NMR data and find that the resulting structures are optimal in terms of accuracy and structural quality. Our method is devoid of the bias imposed by an empirical choice of the weight and has some advantages over estimating the weight by cross-validation.
Affiliation(s)
- Michael Habeck
- Unité de Bioinformatique Structurale, Institut Pasteur, Centre National de la Recherche Scientifique Unité de Recherche Associée 2185, 25-28, Rue du Dr Roux, 75724 Paris Cedex 15, France
- Wolfgang Rieping
- Unité de Bioinformatique Structurale, Institut Pasteur, Centre National de la Recherche Scientifique Unité de Recherche Associée 2185, 25-28, Rue du Dr Roux, 75724 Paris Cedex 15, France
- Michael Nilges
- Unité de Bioinformatique Structurale, Institut Pasteur, Centre National de la Recherche Scientifique Unité de Recherche Associée 2185, 25-28, Rue du Dr Roux, 75724 Paris Cedex 15, France
- To whom correspondence should be addressed. E-mail:
|
20
|
Geyer JP, Döker R, Kremer W, Zhao X, Kuhlmann J, Kalbitzer HR. Solution structure of the Ran-binding domain 2 of RanBP2 and its interaction with the C terminus of Ran. J Mol Biol 2005; 348:711-25. [PMID: 15826666 DOI: 10.1016/j.jmb.2005.02.033] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 10/26/2004] [Revised: 02/15/2005] [Accepted: 02/16/2005] [Indexed: 10/25/2022]
Abstract
The termination of export processes from the nucleus to the cytoplasm in higher eukaryotes is mediated by binding of the small GTPase Ran as part of the export complexes to the Ran-binding domains (RanBD) of Ran-binding protein 2 (RanBP2) of the nuclear pore complex. So far, the structures of the first RanBD of RanBP2 and of RanBP1 in complexes with Ran have been known from X-ray crystallographic studies. Here we report the NMR solution structure of the uncomplexed second RanBD of RanBP2. The structure shows a pleckstrin homology (PH) fold featuring two almost orthogonal beta-sheets consisting of three and four strands and an alpha-helix sitting on top. This is in contrast to the RanBD in the crystal structure complexes in which one beta-strand is missing. That is probably due to the binding of the C-terminal alpha-helix of Ran to the RanBD in these complexes. To analyze the interaction between RanBD2 and the C terminus of Ran, NMR-titration studies with peptides comprising the six or 28 C-terminal residues of Ran were performed. While the six-residue peptide alone does not bind to RanBD2 in a specific manner, the 28-residue peptide, including the entire C-terminal helix of Ran, binds to RanBD2 in a manner analogous to the crystal structures. By solving the solution structure of the 28mer peptide alone, we confirmed that it adopts a stable alpha-helical structure like in native Ran and therefore serves as a valid model of the Ran C terminus. These results support current models that assume recognition of the transport complexes by the RanBDs through the Ran C terminus that is exposed in these complexes.
Affiliation(s)
- J Peter Geyer
- Institut für Biophysik und physikalische Biochemie, Universität Regensburg, Universitätsstrasse 31, D-93053 Regensburg, Germany
|
21
|
Möglich A, Weinfurtner D, Maurer T, Gronwald W, Kalbitzer HR. A restraint molecular dynamics and simulated annealing approach for protein homology modeling utilizing mean angles. BMC Bioinformatics 2005; 6:91. [PMID: 15819976 PMCID: PMC1127110 DOI: 10.1186/1471-2105-6-91] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 09/01/2004] [Accepted: 04/08/2005] [Indexed: 11/10/2022] Open
Abstract
Background: We have developed the program PERMOL for semi-automated homology modeling of proteins. It is based on restrained molecular dynamics using a simulated annealing protocol in torsion angle space. As main restraints defining the optimal local geometry of the structure, weighted mean dihedral angles and their standard deviations are used, which are calculated with an algorithm described earlier by Döker et al. (1999, BBRC, 257, 348-350). The overall long-range contacts are established via a small number of distance restraints between atoms involved in hydrogen bonds and backbone atoms of conserved residues. Employing the restraints generated by PERMOL, three-dimensional structures are obtained using standard molecular dynamics programs such as DYANA or CNS.
Results: To test this modeling approach, it has been used for predicting the structure of the histidine-containing phosphocarrier protein HPr from E. coli and the structure of the human peroxisome proliferator activated receptor gamma (Ppar gamma). The divergence between the modeled HPr and the previously determined X-ray structure was comparable to the divergence between the X-ray structure and the published NMR structure. The modeled structure of Ppar gamma was also very close to the previously solved X-ray structure, with an RMSD of 0.262 nm for the backbone atoms.
Conclusion: In summary, we present a new method for homology modeling capable of producing high-quality structure models. An advantage of the method is that it can be used in combination with incomplete NMR data to obtain reasonable structure models in accordance with the experimental data.
Affiliation(s)
- Andreas Möglich
- Institut für Biophysik und physikalische Biochemie, Universität Regensburg, Universitätsstr. 31, D-93053 Regensburg, Germany
- Department of Biophysical Chemistry, Biozentrum, University of Basel, Klingelbergstr. 70, CH-4056 Basel, Switzerland
- Daniel Weinfurtner
- Institut für Biophysik und physikalische Biochemie, Universität Regensburg, Universitätsstr. 31, D-93053 Regensburg, Germany
- Institut für Organische Chemie und Biochemie, Technische Universität München, Lichtenbergstr. 4, D-85747 Garching, Germany
- Till Maurer
- Institut für Biophysik und physikalische Biochemie, Universität Regensburg, Universitätsstr. 31, D-93053 Regensburg, Germany
- Department of Lead Discovery, Boehringer Ingelheim Pharma GmbH, Birkendorfer Str. 65, D-88397 Biberach, Germany
- Wolfram Gronwald
- Institut für Biophysik und physikalische Biochemie, Universität Regensburg, Universitätsstr. 31, D-93053 Regensburg, Germany
- Hans Robert Kalbitzer
- Institut für Biophysik und physikalische Biochemie, Universität Regensburg, Universitätsstr. 31, D-93053 Regensburg, Germany
|
22
|
Snyder DA, Bhattacharya A, Huang YJ, Montelione GT. Assessing precision and accuracy of protein structures derived from NMR data. Proteins 2005; 59:655-61. [PMID: 15822105 DOI: 10.1002/prot.20499] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.4] [Indexed: 11/11/2022]
|
23
|
Huang YJ, Powers R, Montelione GT. Protein NMR recall, precision, and F-measure scores (RPF scores): structure quality assessment measures based on information retrieval statistics. J Am Chem Soc 2005; 127:1665-74. [PMID: 15701001 DOI: 10.1021/ja047109h] [Citation(s) in RCA: 195] [Impact Index Per Article: 10.3] [Indexed: 11/30/2022]
Abstract
One of the most important challenges in modern protein NMR is the development of fast and sensitive structure quality assessment measures that can be used to evaluate the "goodness-of-fit" of the 3D structure with NOESY data, to indicate the correctness of the fold and accuracy of the resulting structure. Quality assessment is especially critical for automated NOESY interpretation and structure determination approaches. This paper describes new NMR quality assessment scores, including Recall, Precision, and F-measure scores (referred to here as "NMR RPF" scores), which quickly provide global measures of the goodness-of-fit of the 3D structures with NOESY peak lists using methods from information retrieval statistics. The sensitivity of the F-measure is improved using a scaled Fold Discriminating Power (DP) score. These statistical RPF scores are quite rapid to compute since NOE assignments and complete relaxation matrix calculations are not required. A graphical method for site-specific assessment of structure quality based on the Precision statistic is also described. These statistical measures are demonstrated to be valuable for assessing protein NMR structure accuracy. Their relationships to other proposed NMR "R-factors" and structure quality assessment scores are also discussed.
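The recall, precision, and F-measure named in this abstract follow the standard information retrieval definitions, which can be sketched as below. This is a simplified illustration and not the published RPF procedure: representing the NOESY evidence and the structure-derived contacts as plain sets of proton-pair identifiers is an assumption, the function name and example pairs are hypothetical, and the DP score is not included.

```python
def rpf_scores(observed_pairs, expected_pairs):
    """Recall, precision and F-measure between two sets of unordered atom pairs.

    observed_pairs: pairs supported by the experimental peak list (assumed input).
    expected_pairs: close pairs predicted from the 3D structure (assumed input).
    """
    observed = {tuple(sorted(p)) for p in observed_pairs}
    expected = {tuple(sorted(p)) for p in expected_pairs}
    true_positives = len(observed & expected)
    # Recall: fraction of observed pairs that the structure explains.
    recall = true_positives / len(observed) if observed else 0.0
    # Precision: fraction of structure-predicted pairs that were observed.
    precision = true_positives / len(expected) if expected else 0.0
    total = recall + precision
    f_measure = 2 * recall * precision / total if total else 0.0
    return {"recall": recall, "precision": precision, "f_measure": f_measure}


# Hypothetical proton-pair identifiers, for illustration only.
observed = [(1, 5), (2, 8), (3, 9), (4, 7)]
expected = [(5, 1), (2, 8), (6, 10)]
print(rpf_scores(observed, expected))
```

Here two of the four observed pairs are explained by the structure (recall 0.5) and two of the three predicted pairs are observed (precision 2/3), giving an F-measure of 4/7.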
Affiliation(s)
- Yuanpeng J Huang
- Center for Advanced Biotechnology and Medicine and Department of Molecular Biology and Biochemistry, Rutgers University, Northeast Structural Genomics Consortium, and Robert Wood Johnson Medical School, Piscataway, New Jersey 08854-5368, USA
|
24
|
Huang YJ, Moseley HNB, Baran MC, Arrowsmith C, Powers R, Tejero R, Szyperski T, Montelione GT. An integrated platform for automated analysis of protein NMR structures. Methods Enzymol 2005; 394:111-41. [PMID: 15808219 DOI: 10.1016/s0076-6879(05)94005-6] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.2] [Indexed: 12/03/2022]
Abstract
Recent developments provide automated analysis of NMR assignments and three-dimensional (3D) structures of proteins. These approaches are generally applicable to proteins ranging from about 50 to 150 amino acids. In this chapter, we summarize progress by the Northeast Structural Genomics Consortium in standardizing the NMR data collection process for protein structure determination and in building an integrated platform for automated protein NMR structure analysis. Our integrated platform includes the following principal steps: (1) standardized NMR data collection, (2) standardized data processing (including spectral referencing and Fourier transformation), (3) automated peak picking and peak list editing, (4) automated analysis of resonance assignments, (5) automated analysis of NOESY data together with 3D structure determination, and (6) methods for protein structure validation. In particular, the software AutoStructure for automated NOESY data analysis is described in this chapter, together with a discussion of practical considerations for its use in high-throughput structure production efforts. The critical area of data quality assessment has evolved significantly over the past few years and involves evaluation of both intermediate and final peak lists, resonance assignments, and structural information derived from the NMR data. Methods for quality control of each of the major automated analysis steps in our platform are also discussed. Despite significant remaining challenges, when good quality data are available, automated analysis of protein NMR assignments and structures with this platform is both fast and reliable.
Affiliation(s)
- Yuanpeng Janet Huang
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers University, Piscataway, New Jersey 08854, USA
|
25
|
Baran MC, Huang YJ, Moseley HNB, Montelione GT. Automated analysis of protein NMR assignments and structures. Chem Rev 2004; 104:3541-56. [PMID: 15303826 DOI: 10.1021/cr030408p] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.2] [Indexed: 11/28/2022]
Affiliation(s)
- Michael C Baran
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, and Northeast Structural Genomics Consortium, Rutgers University, 679 Hoes Lane, Piscataway, NJ 08854, USA
|
26
|
Jung A, Bamann C, Kremer W, Kalbitzer HR, Brunner E. High-temperature solution NMR structure of TmCsp. Protein Sci 2004; 13:342-50. [PMID: 14739320 PMCID: PMC2286716 DOI: 10.1110/ps.03281604] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 10/26/2022]
Abstract
Cold shock proteins (Csps) are assumed to play a central role in the regulation of gene expression under cold shock conditions. Acting as single-stranded nucleic acid-binding proteins, they trigger the translation process and are therefore involved in the compensation of the influence of low temperatures (cold shock) upon the cell metabolism. However, it is unknown so far how Csps are switched on and off as a function of temperature. The aim of the present study is to investigate possible structural changes responsible for this switching process. 1H-15N HSQC spectra recorded at different temperatures and chemical-shift analysis have indicated subtle conformational changes for the cold-shock protein from the hyperthermophilic bacterium Thermotoga maritima (TmCsp) when the temperature is elevated from 303 K to its physiological temperature (343 K). The three-dimensional structure of TmCsp was determined by nuclear magnetic resonance (NMR) spectroscopy at 343 K to obtain quantitative information concerning these structural changes. By use of residual dipolar couplings, the loss of NOE information at high temperature could be compensated successfully. The most pronounced conformational changes compared with room-temperature conditions are observed for amino acid residues close to two characteristic beta-bulges and a well-defined loop region of the protein. Because the residues shown to be responsible for the interaction of TmCsp with single-stranded nucleic acids can almost exclusively be found within these regions, nucleic acid-binding activity might be down-regulated with increasing temperature by the described conformational changes.
Affiliation(s)
- Astrid Jung
- University of Regensburg, Institute of Biophysics and Physical Biochemistry, D-93040 Regensburg, Germany
|
27
|
Maurer T, Meier S, Kachel N, Munte CE, Hasenbein S, Koch B, Hengstenberg W, Kalbitzer HR. High-resolution structure of the histidine-containing phosphocarrier protein (HPr) from Staphylococcus aureus and characterization of its interaction with the bifunctional HPr kinase/phosphorylase. J Bacteriol 2004; 186:5906-18. [PMID: 15317796 PMCID: PMC516805 DOI: 10.1128/jb.186.17.5906-5918.2004] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Received: 01/05/2004] [Accepted: 05/17/2004] [Indexed: 11/20/2022] Open
Abstract
A high-resolution structure of the histidine-containing phosphocarrier protein (HPr) from Staphylococcus aureus was obtained by heteronuclear multidimensional nuclear magnetic resonance (NMR) spectroscopy on the basis of 1,766 structural restraints. Twenty-three hydrogen bonds in HPr could be directly detected by polarization transfer from the amide nitrogen to the carbonyl carbon involved in the hydrogen bond. Differential line broadening was used to characterize the interaction of HPr with the HPr kinase/phosphorylase (HPrK/P) of Staphylococcus xylosus, which is responsible for phosphorylation-dephosphorylation of the hydroxyl group of the regulatory serine residue at position 46. The dissociation constant Kd was determined to be 0.10 ± 0.02 mM at 303 K from the NMR data, assuming independent binding. The data are consistent with a stoichiometry of 1 HPr molecule per HPrK/P monomer in solution. Using transversal relaxation optimized spectroscopy-heteronuclear single quantum correlation, we mapped the interaction site of the two proteins in the 330-kDa complex. As expected, it covers the region around Ser46 and the small helix b following this residue. In addition, HPrK/P also binds to the second phosphorylation site of HPr at position 15. This interaction may be essential for the recognition of the phosphorylation state of His15 and the phosphorylation-dependent regulation of the kinase/phosphorylase activity. In accordance with this observation, the recently published X-ray structure of the HPr/HPrK core protein complex from Lactobacillus casei shows interactions with the two phosphorylation sites. However, the NMR data also suggest differences for the full-length protein from S. xylosus: there are no indications for an interaction with the residues preceding the regulatory Ser46 residue (Thr41 to Lys45) in the protein of S. xylosus. In contrast, it seems to interact with the C-terminal helix of HPr in solution, an interaction which is not observed for the complex of HPr with the core of HPrK/P of L. casei in crystals.
Affiliation(s)
- Till Maurer
- Institut für Biophysik und Physikalische Biochemie, Universität Regensburg, Regensburg, Germany
|
28
|
Meiler J, Baker D. Rapid protein fold determination using unassigned NMR data. Proc Natl Acad Sci U S A 2003; 100:15404-9. [PMID: 14668443 PMCID: PMC307580 DOI: 10.1073/pnas.2434121100] [Citation(s) in RCA: 99] [Impact Index Per Article: 4.7] [Received: 07/02/2003] [Indexed: 11/18/2022] Open
Abstract
Experimental structure determination by X-ray crystallography and NMR spectroscopy is slow and time-consuming compared with the rate at which new protein sequences are being identified. NMR spectroscopy has the advantage of rapidly providing the structurally relevant information in the form of unassigned chemical shifts (CSs), intensities of NOESY crosspeaks [nuclear Overhauser effects (NOEs)], and residual dipolar couplings (RDCs), but use of these data is limited by the time and effort needed to assign individual resonances to specific atoms. Here, we develop a method for generating low-resolution protein structures by using unassigned NMR data that relies on the de novo protein structure prediction algorithm, Rosetta [Simons, K. T., Kooperberg, C., Huang, E. & Baker, D. (1997) J. Mol. Biol. 268, 209-225] and a Monte Carlo procedure that searches for the assignment of resonances to atoms that produces the best fit of the experimental NMR data to a candidate 3D structure. A large ensemble of models is generated from sequence information alone by using Rosetta, an optimal assignment is identified for each model, and the models are then ranked based on their fit with the NMR data assuming the identified assignments. The method was tested on nine protein sequences between 56 and 140 amino acids and published CS, NOE, and RDC data. The procedure yielded models with rms deviations between 3 and 6 Å, and, in four of the nine cases, the partial assignments obtained by the method could be used to refine the structures to high resolution (0.6-1.8 Å) by repeated cycles of structure generation guided by the partial assignments, followed by reassignment using the newly generated models.
Affiliation(s)
- Jens Meiler
- Department of Biochemistry and Howard Hughes Medical Institute, University of Washington, PO Box 357350, Seattle, WA 98195-7350, USA
|
29
|
Kachel N, Erdmann KS, Kremer W, Wolff P, Gronwald W, Heumann R, Kalbitzer HR. Structure Determination and Ligand Interactions of the PDZ2b Domain of PTP-Bas (hPTP1E): Splicing-induced Modulation of Ligand Specificity. J Mol Biol 2003; 334:143-55. [PMID: 14596806 DOI: 10.1016/j.jmb.2003.09.026] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.1] [Indexed: 11/25/2022]
Abstract
Two versions of the PDZ2 domain of the protein tyrosine phosphatase PTP-Bas/human PTP-BL are generated by alternative splicing. The domains differ by the insertion of five amino acid residues and their affinity to the tumour suppressor protein APC. Whereas PDZ2a is able to bind APC in the nanomolar range, PDZ2b shows no apparent interaction with APC. Here the solution structure of the splicing variant of PDZ2 with the insertion has been determined using 2D and 3D heteronuclear NMR experiments. The structural reason for the changed binding specificity is the reorientation of the loop with the five extra amino acid residues, which folds back onto beta-strands two and three. In addition the side-chain of Lys32 closes the binding site of the APC binding protein and the two helices, especially alpha-helix 2, change their relative position to the protein core. Consequently, the binding site is sterically no longer fully accessible. From the NMR-titration studies with a C-terminal APC-peptide the affinity of the peptide with the protein can be estimated as 540 (±40) µM. The binding site encompasses part of the analogous binding site of PDZ2a as already described previously, yet specific interaction sites are abolished by the insertion of amino acids in PDZ2b. As shown by high-affinity chromatography, GST-PDZ2b and GST-PDZ2a bind to phosphatidylinositol 4,5-bisphosphate (PIP2) micelles with a dissociation constant KD of 21 µM and 55 µM, respectively. In line with these data PDZ2b binds isolated, dissolved PIP2 and PIP3 (phosphatidylinositol 3,4,5-trisphosphate) molecules specifically with a lower KD of 230 (±20) µM as detected by NMR spectroscopy. The binding site could be located by our studies and involves the residues Ile24, Val26, Val70, Asn71, Gly77, Ala78, Glu85, Arg88, Gly91 and Gln92. PIP2 and PIP3 binding takes place in the groove of the PDZ domain that is normally part of the APC binding site.
Affiliation(s)
- Norman Kachel
- Institut für Biophysik und Physikalische Biochemie, Universität Regensburg, Universitätsstr. 31, D-93053, Regensburg, Germany
|
30
|
Yee A, Chang X, Pineda-Lucena A, Wu B, Semesi A, Le B, Ramelot T, Lee GM, Bhattacharyya S, Gutierrez P, Denisov A, Lee CH, Cort JR, Kozlov G, Liao J, Finak G, Chen L, Wishart D, Lee W, McIntosh LP, Gehring K, Kennedy MA, Edwards AM, Arrowsmith CH. An NMR approach to structural proteomics. Proc Natl Acad Sci U S A 2002; 99:1825-30. [PMID: 11854485 PMCID: PMC122278 DOI: 10.1073/pnas.042684599] [Citation(s) in RCA: 166] [Impact Index Per Article: 7.5] [Indexed: 11/18/2022] Open
Abstract
The influx of genomic sequence information has led to the concept of structural proteomics, the determination of protein structures on a genome-wide scale. Here we describe an approach to structural proteomics of small proteins using NMR spectroscopy. Over 500 small proteins from several organisms were cloned, expressed, purified, and evaluated by NMR. Although there was variability among proteomes, overall 20% of these proteins were found to be readily amenable to NMR structure determination. NMR sample preparation was centralized in one facility, and a distributive approach was used for NMR data collection and analysis. Twelve structures are reported here as part of this approach, which allowed us to infer putative functions for several conserved hypothetical proteins.
Affiliation(s)
- Adelinda Yee
- Ontario Cancer Institute and Department of Medical Biophysics, University of Toronto, 101 College Street, Toronto, ON, Canada M5G 1L7
|
31
|
Zabell APR, Post CB. Docking multiple conformations of a flexible ligand into a protein binding site using NMR restraints. Proteins 2002; 46:295-307. [PMID: 11835505 DOI: 10.1002/prot.10017] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.0] [Indexed: 11/08/2022]
Abstract
A method is described for docking a large, flexible ligand using intra-ligand conformational restraints from exchange-transferred NOE (etNOE) data. Numerous conformations of the ligand are generated in isolation, and a subset of representative conformations is selected. A crude model of the protein-ligand complex is used as a template for overlaying the selected ligand structures, and each complex is conformationally relaxed by molecular mechanics to optimize the interaction. Finally, the complexes are assessed for structural quality. Alternative approaches are described for the three steps of the method: generation of the initial docking template; selection of a subset of ligand conformations; and conformational sampling of the complex. The template is generated either by manual docking using interactive graphics or by a computational grid-based search of the binding site. A subset of conformations from the total number of peptides calculated in isolation is selected based on either low energy and satisfaction of the etNOE restraints, or a cluster analysis of the full set. To optimize the interactions in the complex, either a restrained Monte Carlo-energy minimization (MCM) protocol or a restrained simulated annealing (SA) protocol was used. This work produced 53 initial complexes, of which 8 were assessed in detail. With the etNOE conformational restraints, all of the approaches provide reasonable models. The grid-based approach to generating an initial docking template allows a large volume to be sampled, and as a result, two distinct binding modes were identified for a fifteen-residue peptide binding to an enzyme active site.
Affiliation(s)
- Adam P R Zabell
- Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, Indiana 47907-1333, USA
|
32
|
Gronwald W, Huber F, Grünewald P, Spörner M, Wohlgemuth S, Herrmann C, Kalbitzer HR. Solution structure of the Ras binding domain of the protein kinase Byr2 from Schizosaccharomyces pombe. Structure 2001; 9:1029-41. [PMID: 11709167 DOI: 10.1016/s0969-2126(01)00671-2] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.1] [Indexed: 11/22/2022]
Abstract
BACKGROUND: After activation, small GTPases such as Ras transfer the incoming signal to effectors by specifically interacting with the binding domain of these proteins. Structural details of the binding domain of different effectors determine which pathway is predominantly activated. Byr2 from fission yeast is a functional homolog of Raf, which is the direct downstream target of Ras in mammals that initiates a protein kinase cascade. The amino acid sequence of Byr2's Ras binding domain is only weakly related to that of Raf, and Byr2's three-dimensional structure is unknown. RESULTS: We have solved the 3D structure of the Ras binding domain of Byr2 (Byr2RBD) from Schizosaccharomyces pombe in solution. The structure consists of three alpha helices and a mixed five-stranded beta pleated sheet arranged in the topology ββαββαβα, with the first seven canonical secondary structure elements forming a ubiquitin superfold. ¹⁵N-¹H TROSY-HSQC spectroscopy of the complex of Byr2RBD with Ras·Mg²⁺·GppNHp reveals that the first and second beta strands and the first alpha helix of Byr2 are mainly involved in the protein-protein interaction, as observed in other Ras binding domains. Although the putative interaction sites of human H-Ras and Ras1 from S. pombe are identical in sequence, binding to Byr2 leads to small but significant differences in the NMR spectra, indicating a slightly different binding mode. CONCLUSIONS: The ubiquitin superfold appears to be the general structural motif for Ras binding domains even in cases with vanishing sequence identity. However, details of the 3D structure and the interacting interface are different, thereby determining the specificity of the recognition of Ras and Ras-related proteins.
Affiliation(s)
- W Gronwald
- Institut für Biophysik und physikalische Biochemie, Universität Regensburg, Postfach, D-93040, Regensburg, Germany
|
33
|
Heinemann U, Illing G, Oschkinat H. High-throughput three-dimensional protein structure determination. Curr Opin Biotechnol 2001; 12:348-54. [PMID: 11551462 DOI: 10.1016/s0958-1669(00)00226-3] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.7] [Indexed: 11/30/2022]
Abstract
In the wake of finished genomic sequencing projects, high-throughput analysis techniques are being developed in various fields of functional genomics. Of special interest in this regard is the three-dimensional structure analysis of proteins by X-ray crystallography and NMR spectroscopy, which has been characterized by distinctly low throughput in the past. A number of recent advances in instrumentation and software promise to radically change this situation, leaving the production of suitable protein samples as the sole rate-limiting step in structural analyses.
Affiliation(s)
- U Heinemann
- Forschungsgruppe Kristallographie, Max-Delbrück-Centrum für Molekulare Medizin, Robert-Rössle-Strasse 10, D-13125 Berlin, Germany.
|
34
|
Prestegard JH, Valafar H, Glushka J, Tian F. Nuclear magnetic resonance in the era of structural genomics. Biochemistry 2001; 40:8677-85. [PMID: 11467927 DOI: 10.1021/bi0102095] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.3] [Indexed: 11/28/2022]
Abstract
Current interest in structural genomics, and the associated need for high-throughput structure determination methods, offers an opportunity to examine new nuclear magnetic resonance (NMR) methodology and the impact this methodology can have on structure determination of proteins. The time required for structure determination by traditional NMR methods is currently long, but improved hardware, automation of analysis, and new sources of data such as residual dipolar couplings promise to change this. Greatly improved efficiency, coupled with an ability to characterize proteins that may not produce crystals suitable for investigation by X-ray diffraction, suggests that NMR will play an important role in structural genomics programs.
Affiliation(s)
- J H Prestegard
- Complex Carbohydrate Research Center, University of Georgia, Athens, Georgia 30602, USA
|