1
Buchner L, Güntert P. Systematic evaluation of combined automated NOE assignment and structure calculation with CYANA. J Biomol NMR 2015; 62:81-95. [PMID: 25796507 DOI: 10.1007/s10858-015-9921-z] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 01/23/2015] [Accepted: 03/16/2015] [Indexed: 05/07/2023]
Abstract
The automated assignment of NOESY cross peaks has become a fundamental technique for NMR protein structure analysis. A widely used algorithm for this purpose is implemented in the program CYANA. It has been used for a large number of structure determinations of proteins in solution but a systematic evaluation of its performance has not yet been reported. In this paper we systematically analyze the reliability of combined automated NOESY assignment and structure calculation with CYANA under a variety of conditions on the basis of the experimental NMR data sets of ten proteins. To evaluate the robustness of the algorithm, the original high-quality experimental data sets were modified in different ways to simulate the effect of data imperfections, i.e. incomplete or erroneous chemical shift assignments, missing NOESY cross peaks, inaccurate peak positions, inaccurate peak intensities, lower dimensionality NOESY spectra, and higher tolerances for the matching of chemical shifts and peak positions. The results show that the algorithm is remarkably robust with regard to imperfections of the NOESY peak lists and the chemical shift tolerances but susceptible to lacking or erroneous resonance assignments, in particular for nuclei that are involved in many NOESY cross peaks.
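The role of the chemical shift tolerance mentioned above can be illustrated with a small sketch. It is not CYANA's algorithm; it only shows, under simplifying assumptions, how a wider tolerance enlarges the set of candidate assignments for a single 2D NOESY peak. The atom labels, shift values, and the function name `candidate_assignments` are hypothetical.

```python
from itertools import product


def candidate_assignments(peak, shifts, tolerance):
    """List atom pairs whose chemical shifts match a 2D peak within a tolerance.

    peak      -- (position_dim1, position_dim2) in ppm
    shifts    -- dict mapping a hypothetical atom label to its shift in ppm
    tolerance -- maximum allowed |shift - position| in ppm, same in both dimensions
    """
    per_dimension = [
        [atom for atom, shift in shifts.items() if abs(shift - position) <= tolerance]
        for position in peak
    ]
    # Every combination of per-dimension matches is a possible assignment.
    return list(product(*per_dimension))


# Hypothetical proton chemical shifts (ppm); not taken from any real data set.
shifts = {"HA 12": 4.50, "HB2 30": 4.52, "HA 45": 4.58, "HN 13": 8.20, "HN 46": 8.26}
peak = (4.51, 8.21)

tight = candidate_assignments(peak, shifts, tolerance=0.02)
loose = candidate_assignments(peak, shifts, tolerance=0.10)
print(len(tight), len(loose))
```

With these made-up shifts, the looser tolerance yields more candidate pairs for the same peak, which is the kind of ambiguity an automated NOE assignment procedure has to resolve.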
Affiliation(s)
- Lena Buchner
- Institute of Biophysical Chemistry, Center for Biomolecular Magnetic Resonance, and Frankfurt Institute of Advanced Studies, Goethe University Frankfurt am Main, Max-von-Laue-Str. 9, 60438, Frankfurt am Main, Germany
2
Cannistraci CV, Abbas A, Gao X. Median Modified Wiener Filter for nonlinear adaptive spatial denoising of protein NMR multidimensional spectra. Sci Rep 2015; 5:8017. [PMID: 25619991 PMCID: PMC4306135 DOI: 10.1038/srep08017] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 09/18/2014] [Accepted: 12/29/2014] [Indexed: 11/21/2022]
Abstract
Denoising multidimensional NMR-spectra is a fundamental step in NMR protein structure determination. The state-of-the-art method uses wavelet-denoising, which may suffer when applied to non-stationary signals affected by Gaussian-white-noise mixed with strong impulsive artifacts, like those in multi-dimensional NMR-spectra. Regrettably, Wavelet's performance depends on a combinatorial search of wavelet shapes and parameters; and multi-dimensional extension of wavelet-denoising is highly non-trivial, which hampers its application to multidimensional NMR-spectra. Here, we endorse a diverse philosophy of denoising NMR-spectra: less is more! We consider spatial filters that have only one parameter to tune: the window-size. We propose, for the first time, the 3D extension of the median-modified-Wiener-filter (MMWF), an adaptive variant of the median-filter, and also its novel variation named MMWF*. We test the proposed filters and the Wiener-filter, an adaptive variant of the mean-filter, on a benchmark set that contains 16 two-dimensional and three-dimensional NMR-spectra extracted from eight proteins. Our results demonstrate that the adaptive spatial filters significantly outperform their non-adaptive versions. The performance of the new MMWF* on 2D/3D-spectra is even better than wavelet-denoising. Noticeably, MMWF* produces stable high performance almost invariant for diverse window-size settings: this signifies a consistent advantage in the implementation of automatic pipelines for protein NMR-spectra analysis.
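The abstract does not give the filter's formula, so the following is only a minimal 1D sketch. It assumes the commonly cited form in which the Wiener filter's local mean is replaced by the local median: output = median + max(σ² − ν², 0)/σ² · (x − median), where σ² is the local variance and ν² the noise variance. The function name `mmwf_1d`, the edge handling, and the default noise estimate (mean of the local variances) are assumptions; MMWF* and the 3D extension are not reproduced here.

```python
from statistics import mean, median, pvariance


def mmwf_1d(signal, window=3, noise_var=None):
    """Median-modified Wiener filter on a 1D list of floats (illustrative sketch)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(signal)
    # Neighbourhoods are truncated at the edges (an assumption of this sketch).
    neighbourhoods = [signal[max(0, i - half):min(n, i + half + 1)] for i in range(n)]
    local_median = [median(w) for w in neighbourhoods]
    local_var = [pvariance(w) for w in neighbourhoods]
    if noise_var is None:
        # Assumed default: estimate the noise variance as the mean local variance.
        noise_var = mean(local_var)
    filtered = []
    for x, med, var in zip(signal, local_median, local_var):
        if var == 0:
            # Flat neighbourhood: nothing to adapt to, return the median.
            filtered.append(float(med))
        else:
            gain = max(var - noise_var, 0.0) / var
            filtered.append(med + gain * (x - med))
    return filtered


spike = mmwf_1d([0.0, 0.0, 10.0, 0.0, 0.0], window=3)
print(spike)
```

The only tuning parameter exposed here besides the optional noise variance is the window size, which matches the one-parameter design the abstract emphasizes.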
Affiliation(s)
- Carlo Vittorio Cannistraci
- Biomedical Cybernetics Group, Biotechnology Center (BIOTEC), Technische Universität Dresden, Tatzberg 47/49, 01307 Dresden, Germany
- Ahmed Abbas
- Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Xin Gao
- Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
3
Abbas A, Guo X, Jing BY, Gao X. An automated framework for NMR resonance assignment through simultaneous slice picking and spin system forming. J Biomol NMR 2014; 59:75-86. [PMID: 24748536 DOI: 10.1007/s10858-014-9828-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 02/03/2014] [Accepted: 04/05/2014] [Indexed: 06/03/2023]
Abstract
Despite significant advances in automated nuclear magnetic resonance-based protein structure determination, the high numbers of false positives and false negatives among the peaks selected by fully automated methods remain a problem. These false positives and negatives impair the performance of resonance assignment methods. One of the main reasons for this problem is that the computational research community often considers peak picking and resonance assignment to be two separate problems, whereas spectroscopists use expert knowledge to pick peaks and assign their resonances at the same time. We propose a novel framework that simultaneously conducts slice picking and spin system forming, an essential step in resonance assignment. Our framework then employs a genetic algorithm, directed by both connectivity information and amino acid typing information from the spin systems, to assign the spin systems to residues. The inputs to our framework can be as few as two commonly used spectra, i.e., CBCA(CO)NH and HNCACB. Different from the existing peak picking and resonance assignment methods that treat peaks as the units, our method is based on 'slices', which are one-dimensional vectors in three-dimensional spectra that correspond to certain ([Formula: see text]) values. Experimental results on both benchmark simulated data sets and four real protein data sets demonstrate that our method significantly outperforms the state-of-the-art methods while using fewer spectra than those methods. Our method is freely available at http://sfb.kaust.edu.sa/Pages/Software.aspx.
Affiliation(s)
- Ahmed Abbas
- Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955-6900, Saudi Arabia
4
Nielsen JT, Kulminskaya N, Bjerring M, Nielsen NC. Automated robust and accurate assignment of protein resonances for solid state NMR. J Biomol NMR 2014; 59:119-34. [PMID: 24817190 DOI: 10.1007/s10858-014-9835-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 02/13/2014] [Accepted: 04/29/2014] [Indexed: 05/26/2023]
Abstract
The process of resonance assignment represents a time-consuming and potentially error-prone bottleneck in structural studies of proteins by solid-state NMR (ssNMR). Software for the automation of this process is therefore of high interest. Procedures developed through the last decades for solution-state NMR are not directly applicable for ssNMR due to the inherently lower data quality caused by lower sensitivity and broader lines, leading to overlap between peaks. Recently, the first efforts towards procedures specifically aimed for ssNMR have been realized (Schmidt et al. in J Biomol NMR 56(3):243-254, 2013). Here we present a robust automatic method, which can accurately assign protein resonances using peak lists from a small set of simple 2D and 3D ssNMR experiments, applicable in cases with low sensitivity. The method is demonstrated on three uniformly ¹³C, ¹⁵N labeled biomolecules with different challenges on the assignments. In particular, for the immunoglobulin binding domain B1 of streptococcal protein G automatic assignment shows 100% accuracy for the backbone resonances and 91.8% when including all side chain carbons. It is demonstrated, by using a procedure for generating artificial spectra with increasing line widths, that our method, GAMES_ASSIGN, can handle a significant amount of overlapping peaks in the assignment. The impact of including different ssNMR experiments is evaluated as well.
Affiliation(s)
- Jakob Toudahl Nielsen
- Center for Insoluble Protein Structures (inSPIN), Interdisciplinary Nanoscience Center (iNANO), Department of Chemistry, Aarhus University, Gustav Wieds Vej 14, 8000, Aarhus C, Denmark
5
Schmidt E, Güntert P. Reliability of exclusively NOESY-based automated resonance assignment and structure determination of proteins. J Biomol NMR 2013; 57:193-204. [PMID: 24036635 DOI: 10.1007/s10858-013-9779-x] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 07/09/2013] [Accepted: 09/02/2013] [Indexed: 06/02/2023]
Abstract
Protein structure determination by NMR can in principle be speeded up both by reducing the measurement time on the NMR spectrometer and by a more efficient analysis of the spectra. Here we study the reliability of protein structure determination based on a single type of spectra, namely nuclear Overhauser effect spectroscopy (NOESY), using a fully automated procedure for the sequence-specific resonance assignment with the recently introduced FLYA algorithm, followed by combined automated NOE distance restraint assignment and structure calculation with CYANA. This NOESY-FLYA method was applied to eight proteins with 63-160 residues for which resonance assignments and solution structures had previously been determined by the Northeast Structural Genomics Consortium (NESG), and unrefined and refined NOESY data sets have been made available for the Critical Assessment of Automated Structure Determination of Proteins by NMR project. Using only peak lists from three-dimensional ¹³C- or ¹⁵N-resolved NOESY spectra as input, the FLYA algorithm yielded for the eight proteins 91-98 % correct backbone and side-chain assignments if manually refined peak lists are used, and 64-96 % correct assignments based on raw peak lists. Subsequent structure calculations with CYANA then produced structures with root-mean-square deviation (RMSD) values to the manually determined reference structures of 0.8-2.0 Å if refined peak lists are used. With raw peak lists, calculations for four proteins converged, resulting in RMSDs to the reference structure of 0.8-2.8 Å, whereas no convergence was obtained for the four other proteins (two of which did not converge even with the correct manual resonance assignments given as input).
These results show that, given high-quality experimental NOESY peak lists, the chemical shift assignments can be uncovered, without any recourse to traditional through-bond type assignment experiments, to an extent that is sufficient for calculating accurate three-dimensional structures.
Affiliation(s)
- Elena Schmidt
- Institute of Biophysical Chemistry, Center for Biomolecular Magnetic Resonance, Frankfurt Institute for Advanced Studies, Goethe University Frankfurt am Main, Max-von-Laue-Str. 9, 60438, Frankfurt am Main, Germany
6
Abbas A, Kong XB, Liu Z, Jing BY, Gao X. Automatic peak selection by a Benjamini-Hochberg-based algorithm. PLoS One 2013; 8:e53112. [PMID: 23308147 PMCID: PMC3538655 DOI: 10.1371/journal.pone.0053112] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 07/27/2012] [Accepted: 11/26/2012] [Indexed: 11/25/2022]
Abstract
A common issue in bioinformatics is that computational methods often generate a large number of predictions sorted according to certain confidence scores. A key problem is then determining how many predictions must be selected to include most of the true predictions while maintaining reasonably high precision. In nuclear magnetic resonance (NMR)-based protein structure determination, for instance, computational peak picking methods are becoming more and more common, although expert-knowledge remains the method of choice to determine how many peaks among thousands of candidate peaks should be taken into consideration to capture the true peaks. Here, we propose a Benjamini-Hochberg (B-H)-based approach that automatically selects the number of peaks. We formulate the peak selection problem as a multiple testing problem. Given a candidate peak list sorted by either volumes or intensities, we first convert the peaks into p-values and then apply the B-H-based algorithm to automatically select the number of peaks. The proposed approach is tested on the state-of-the-art peak picking methods, including WaVPeak [1] and PICKY [2]. Compared with the traditional fixed number-based approach, our approach returns significantly more true peaks. For instance, by combining WaVPeak or PICKY with the proposed method, the missing peak rates are on average reduced by 20% and 26%, respectively, in a benchmark set of 32 spectra extracted from eight proteins. The consensus of the B-H-selected peaks from both WaVPeak and PICKY achieves 88% recall and 83% precision, which significantly outperforms each individual method and the consensus method without using the B-H algorithm. The proposed method can be used as a standard procedure for any peak picking method and straightforwardly applied to some other prediction selection problems in bioinformatics.
The source code, documentation and example data of the proposed method are available at http://sfb.kaust.edu.sa/pages/software.aspx.
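The B-H selection step described in this abstract can be sketched as follows. This is the standard Benjamini-Hochberg step-up procedure, not the authors' released code. The abstract does not specify how peak volumes or intensities are converted into p-values, so the p-values below are hypothetical inputs, as are the function name and the `alpha` default.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices selected by the Benjamini-Hochberg step-up procedure."""
    m = len(p_values)
    # Rank candidates by ascending p-value, remembering their original indices.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            k_max = rank
    # Everything ranked at or below k_max is selected, even if an earlier
    # rank failed its own threshold (this is the "step-up" part).
    return sorted(order[:k_max])


# Hypothetical p-values for ten candidate peaks.
candidate_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
selected = benjamini_hochberg(candidate_p, alpha=0.05)
print(selected)
```

The number of selected candidates falls out of the data and the chosen `alpha` rather than being fixed in advance, which is the contrast with the fixed-number approach that the abstract draws.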
Affiliation(s)
- Ahmed Abbas
- Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Xin-Bing Kong
- Department of Statistics, Fudan University, Shanghai, China
- Zhi Liu
- Department of Mathematics, Faculty of Science and Technology, University of Macau, Taipa, Macau
- Bing-Yi Jing
- Department of Mathematics, Hong Kong University of Science and Technology, Kowloon, Hong Kong
- Xin Gao
- Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
7
Zawadzka-Kazimierczuk A, Koźmiński W, Billeter M. TSAR: a program for automatic resonance assignment using 2D cross-sections of high dimensionality, high-resolution spectra. J Biomol NMR 2012; 54:81-95. [PMID: 22806130 DOI: 10.1007/s10858-012-9652-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 04/05/2012] [Accepted: 06/29/2012] [Indexed: 05/13/2023]
Abstract
While NMR studies of proteins typically aim at structure, dynamics or interactions, resonance assignments represent in almost all cases the initial step of the analysis. With increasing complexity of the NMR spectra, for example due to decreasing extent of ordered structure, this task often becomes both difficult and time-consuming, and the recording of high-dimensional data with high-resolution may be essential. Random sampling of the evolution time space, combined with sparse multidimensional Fourier transform (SMFT), allows for efficient recording of very high dimensional spectra (≥4 dimensions) while maintaining high resolution. However, the nature of this data demands for automation of the assignment process. Here we present the program TSAR (Tool for SMFT-based Assignment of Resonances), which exploits all advantages of SMFT input. Moreover, its flexibility allows to process data from any type of experiments that provide sequential connectivities. The algorithm was tested on several protein samples, including a disordered 81-residue fragment of the δ subunit of RNA polymerase from Bacillus subtilis containing various repetitive sequences. For our test examples, TSAR achieves a high percentage of assigned residues without any erroneous assignments.
8
Schmidt E, Güntert P. A new algorithm for reliable and general NMR resonance assignment. J Am Chem Soc 2012; 134:12817-29. [PMID: 22794163 DOI: 10.1021/ja305091n] [Citation(s) in RCA: 123] [Impact Index Per Article: 10.3] [Indexed: 11/29/2022]
Abstract
The new FLYA automated resonance assignment algorithm determines NMR chemical shift assignments on the basis of peak lists from any combination of multidimensional through-bond or through-space NMR experiments for proteins. Backbone and side-chain assignments can be determined. All experimental data are used simultaneously, thereby exploiting optimally the redundancy present in the input peak lists and circumventing potential pitfalls of assignment strategies in which results obtained in a given step remain fixed input data for subsequent steps. Instead of prescribing a specific assignment strategy, the FLYA resonance assignment algorithm requires only experimental peak lists and the primary structure of the protein, from which the peaks expected in a given spectrum can be generated by applying a set of rules, defined in a straightforward way by specifying through-bond or through-space magnetization transfer pathways. The algorithm determines the resonance assignment by finding an optimal mapping between the set of expected peaks that are assigned by definition but have unknown positions and the set of measured peaks in the input peak lists that are initially unassigned but have a known position in the spectrum. Using peak lists obtained by purely automated peak picking from the experimental spectra of three proteins, FLYA assigned correctly 96-99% of the backbone and 90-91% of all resonances that could be assigned manually. Systematic studies quantified the impact of various factors on the assignment accuracy, namely the extent of missing real peaks and the amount of additional artifact peaks in the input peak lists, as well as the accuracy of the peak positions. Comparing the resonance assignments from FLYA with those obtained from two other existing algorithms showed that using identical experimental input data these other algorithms yielded significantly (40-142%) more erroneous assignments than FLYA. 
The FLYA resonance assignment algorithm thus has the reliability and flexibility to replace most manual and semi-automatic assignment procedures for NMR studies of proteins.
Affiliation(s)
- Elena Schmidt
- Institute of Biophysical Chemistry, Center for Biomolecular Magnetic Resonance, Goethe University Frankfurt am Main, Frankfurt am Main, Germany
9
Liu Z, Abbas A, Jing BY, Gao X. WaVPeak: picking NMR peaks through wavelet-based smoothing and volume-based filtering. Bioinformatics 2012; 28:914-20. [PMID: 22328784 PMCID: PMC3315717 DOI: 10.1093/bioinformatics/bts078] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.0] [Received: 11/04/2011] [Revised: 01/16/2012] [Accepted: 02/08/2012] [Indexed: 11/13/2022]
Abstract
MOTIVATION: Nuclear magnetic resonance (NMR) has been widely used as a powerful tool to determine the 3D structures of proteins in vivo. However, the post-spectra processing stage of NMR structure determination usually involves a tremendous amount of time and expert knowledge, which includes peak picking, chemical shift assignment and structure calculation steps. Detecting accurate peaks from the NMR spectra is a prerequisite for all following steps, and thus remains a key problem in automatic NMR structure determination. RESULTS: We introduce WaVPeak, a fully automatic peak detection method. WaVPeak first smoothes the given NMR spectrum by wavelets. The peaks are then identified as the local maxima. The false positive peaks are filtered out efficiently by considering the volume of the peaks. WaVPeak has two major advantages over the state-of-the-art peak-picking methods. First, through wavelet-based smoothing, WaVPeak does not eliminate any data point in the spectra. Therefore, WaVPeak is able to detect weak peaks that are embedded in the noise level. NMR spectroscopists need the most help isolating these weak peaks. Second, WaVPeak estimates the volume of the peaks to filter the false positives. This is more reliable than intensity-based filters that are widely used in existing methods. We evaluate the performance of WaVPeak on the benchmark set proposed by PICKY (Alipanahi et al., 2009), one of the most accurate methods in the literature. The dataset comprises 32 2D and 3D spectra from eight different proteins. Experimental results demonstrate that WaVPeak achieves an average of 96%, 91%, 88%, 76% and 85% recall on ¹⁵N-HSQC, HNCO, HNCA, HNCACB and CBCA(CO)NH, respectively. When the same number of peaks are considered, WaVPeak significantly outperforms PICKY. AVAILABILITY: WaVPeak is an open source program. The source code and two test spectra of WaVPeak are available at http://faculty.kaust.edu.sa/sites/xingao/Pages/Publications.aspx.
The online server is under construction. CONTACT statliuzhi@xmu.edu.cn; ahmed.abbas@kaust.edu.sa; majing@ust.hk; xin.gao@kaust.edu.sa.
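The last two stages named in this abstract, local-maximum picking followed by volume-based filtering, can be sketched on a small 2D grid. This is not the WaVPeak implementation: the wavelet smoothing step is omitted, and the abstract does not say how volumes are estimated, so summing intensities in a fixed square window is an assumption, as are the function name `pick_peaks` and its parameters.

```python
def pick_peaks(spectrum, min_volume, radius=1):
    """Return (row, col, volume) for strict local maxima whose volume is large enough.

    spectrum   -- 2D list of intensities (rows of equal length)
    min_volume -- peaks with a smaller summed intensity are discarded
    radius     -- half-width of the square window used for both steps (assumed)
    """
    rows, cols = len(spectrum), len(spectrum[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            neighbours = [
                (rr, cc)
                for rr in range(max(0, r - radius), min(rows, r + radius + 1))
                for cc in range(max(0, c - radius), min(cols, c + radius + 1))
                if (rr, cc) != (r, c)
            ]
            value = spectrum[r][c]
            # Step 1: keep only points strictly higher than every neighbour.
            if not all(value > spectrum[rr][cc] for rr, cc in neighbours):
                continue
            # Step 2: approximate the peak volume by the windowed intensity sum.
            volume = value + sum(spectrum[rr][cc] for rr, cc in neighbours)
            if volume >= min_volume:
                peaks.append((r, c, volume))
    return peaks


# Toy grid: one broad peak centred at (2, 2) and one isolated spike at (3, 5).
grid = [
    [0, 0, 0, 0, 0, 0],
    [0, 2, 3, 2, 0, 0],
    [0, 3, 9, 3, 0, 0],
    [0, 2, 3, 2, 0, 5],
    [0, 0, 0, 0, 0, 0],
]
print(pick_peaks(grid, min_volume=10))
```

In this toy grid both points are local maxima, but the isolated spike has a small summed intensity, so a volume cutoff removes it while the broad peak survives.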
Affiliation(s)
- Zhi Liu
- The Wang Yanan Institute for Studies in Economics, Xiamen University, Xiamen 361000, China
10
Markwick PR, Nilges M. Computational approaches to the interpretation of NMR data for studying protein dynamics. Chem Phys 2012. [DOI: 10.1016/j.chemphys.2011.11.023] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 10/14/2022]
11
Guerry P, Herrmann T. Comprehensive automation for NMR structure determination of proteins. Methods Mol Biol 2012; 831:429-51. [PMID: 22167686 DOI: 10.1007/978-1-61779-480-3_22] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.1] [Indexed: 12/12/2022]
Abstract
This chapter gives an overview of automated protein structure determination by nuclear magnetic resonance (NMR) with the UNIO protocol that enables high to full automation of all NMR data analysis steps involved. Four established algorithms, namely, the MATCH algorithm for sequence-specific resonance assignment, the ASCAN algorithm for side-chain resonance assignment, the CANDID algorithm for NOE assignment, and the ATNOS algorithm for signal identification in NMR spectra, are assembled into three principal UNIO NMR data analysis components (MATCH, ATNOS/ASCAN, and ATNOS/CANDID) that are accessed thanks to a particularly intuitive and flexible, yet powerful graphical user interface (GUI). UNIO is designed to work independently or in association with other NMR software. The principal data analysis components for sequence-specific backbone, side-chain and NOE assignment may be run separately or out of sequence. User-intervention at individual stages is encouraged and facilitated by graphical tools included for the preparation, analysis, validation, and subsequent presentation of the NMR structure.
Affiliation(s)
- Paul Guerry
- Centre Européen de RMN à très Hauts Champs, Université de Lyon, Ecole Normale Supérieure de Lyon, CNRS, Université Claude, Villeurbanne, France
12
Orekhov VY, Jaravine VA. Analysis of non-uniformly sampled spectra with multi-dimensional decomposition. Prog Nucl Magn Reson Spectrosc 2011; 59:271-92. [PMID: 21920222 DOI: 10.1016/j.pnmrs.2011.02.002] [Citation(s) in RCA: 246] [Impact Index Per Article: 18.9] [Received: 01/04/2011] [Accepted: 02/21/2011] [Indexed: 05/04/2023]
Affiliation(s)
- Vladislav Yu Orekhov
- Swedish NMR Centre, University of Gothenburg, Box 465, 40530 Gothenburg, Sweden.
13
Jang R, Gao X, Li M. Towards fully automated structure-based NMR resonance assignment of ¹⁵N-labeled proteins from automatically picked peaks. J Comput Biol 2011; 18:347-63. [PMID: 21385039 DOI: 10.1089/cmb.2010.0251] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 11/12/2022]
Abstract
In NMR resonance assignment, an indispensable step in NMR protein studies, manually processed peaks from both N-labeled and C-labeled spectra are typically used as inputs. However, the use of homologous structures can allow one to use only N-labeled NMR data and avoid the added expense of using C-labeled data. We propose a novel integer programming framework for structure-based backbone resonance assignment using N-labeled data. The core consists of a pair of integer programming models: one for spin system forming and amino acid typing, and the other for backbone resonance assignment. The goal is to perform the assignment directly from spectra without any manual intervention via automatically picked peaks, which are much noisier than manually picked peaks, so methods must be error-tolerant. In the case of semi-automated/manually processed peak data, we compare our system with the Xiong-Pandurangan-Bailey-Kellogg's contact replacement (CR) method, which is the most error-tolerant method for structure-based resonance assignment. Our system, on average, reduces the error rate of the CR method by five folds on their data set. In addition, by using an iterative algorithm, our system has the added capability of using the NOESY data to correct assignment errors due to errors in predicting the amino acid and secondary structure type of each spin system. On a publicly available data set for human ubiquitin, where the typing accuracy is 83%, we achieve 91% accuracy, compared to the 59% accuracy obtained without correcting for such errors. In the case of automatically picked peaks, using assignment information from yeast ubiquitin, we achieve a fully automatic assignment with 97% accuracy. To our knowledge, this is the first system that can achieve fully automatic structure-based assignment directly from spectra. This has implications in NMR protein mutant studies, where the assignment step is repeated for each mutant.
Affiliation(s)
- Richard Jang
- David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, Canada
14
Warner LR, Varga K, Lange OF, Baker SL, Baker D, Sousa MC, Pardi A. Structure of the BamC two-domain protein obtained by Rosetta with a limited NMR data set. J Mol Biol 2011; 411:83-95. [PMID: 21624375 DOI: 10.1016/j.jmb.2011.05.022] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.0] [Received: 03/12/2011] [Revised: 05/13/2011] [Accepted: 05/16/2011] [Indexed: 10/18/2022]
Abstract
The CS-RDC-NOE Rosetta program was used to generate the solution structure of a 27-kDa fragment of the Escherichia coli BamC protein from a limited set of NMR data. The BamC protein is a component of the essential five-protein β-barrel assembly machine in E. coli. The first 100 residues in BamC were disordered in solution. The Rosetta calculations showed that BamC₁₀₁₋₃₄₄ forms two well-defined domains connected by an ~18-residue linker, where the relative orientation of the domains was not defined. Both domains adopt a helix-grip fold previously observed in the Bet v 1 superfamily. ¹⁵N relaxation data indicated a high degree of conformational flexibility for the linker connecting the N-terminal domain and the C-terminal domain in BamC. The results here show that CS-RDC-NOE Rosetta is robust and has a high tolerance for misassigned nuclear Overhauser effect restraints, greatly simplifying NMR structure determinations.
Affiliation(s)
- Lisa R Warner
- Department of Chemistry and Biochemistry, University of Colorado, Boulder, Boulder, CO 80309, USA
15
Breukels V, Konijnenberg A, Nabuurs SM, Doreleijers JF, Kovalevskaya NV, Vuister GW. Overview on the use of NMR to examine protein structure. Curr Protoc Protein Sci 2011; Chapter 17:Unit17.5. [PMID: 21488042 DOI: 10.1002/0471140864.ps1705s64] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 05/30/2023]
Abstract
Any protein structure determination process contains several steps, starting from obtaining a suitable sample, then moving on to acquiring data and spectral assignment, and lastly to the final steps of structure determination and validation. This unit describes all of these steps, starting with the basic physical principles behind NMR and some of the most commonly measured and observed phenomena such as chemical shift, scalar and residual coupling, and the nuclear Overhauser effect. Then, in somewhat more detail, the process of spectral assignment and structure elucidation is explained. Furthermore, the use of NMR to study protein-ligand interaction, protein dynamics, or protein folding is described.
Affiliation(s)
- Vincent Breukels
- Protein Biophysics, Institute for Molecules and Materials, Radboud University Nijmegen, Nijmegen, The Netherlands
16
Ziarek JJ, Peterson FC, Lytle BL, Volkman BF. Binding site identification and structure determination of protein-ligand complexes by NMR: a semiautomated approach. Methods Enzymol 2011; 493:241-75. [PMID: 21371594 PMCID: PMC3635485 DOI: 10.1016/b978-0-12-381274-2.00010-8] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.2] [Indexed: 12/12/2022]
Abstract
Over the last 15 years, the role of NMR spectroscopy in the lead identification and optimization stages of pharmaceutical drug discovery has steadily increased. NMR occupies a unique niche in the biophysical analysis of drug-like compounds because of its ability to identify binding sites, affinities, and ligand poses at the level of individual amino acids without necessarily solving the structure of the protein-ligand complex. However, it can also provide structures of flexible proteins and low-affinity (Kd > 10⁻⁶ M) complexes, which often fail to crystallize. This chapter emphasizes a throughput-focused protocol that aims to identify practical aspects of binding site characterization, automated and semiautomated NMR assignment methods, and structure determination of protein-ligand complexes by NMR.
Collapse
Affiliation(s)
- Joshua J. Ziarek
- Department of Biochemistry, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, Wisconsin, 53226 USA
- Francis C. Peterson
- Department of Biochemistry, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, Wisconsin, 53226 USA
- Betsy L. Lytle
- Department of Biochemistry, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, Wisconsin, 53226 USA
- Brian F. Volkman
- Department of Biochemistry, Medical College of Wisconsin, 8701 Watertown Plank Road, Milwaukee, Wisconsin, 53226 USA
17
Modeling pilus structures from sparse data. J Struct Biol 2010; 173:436-44. [PMID: 21115127 DOI: 10.1016/j.jsb.2010.11.015] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 07/19/2010] [Revised: 11/11/2010] [Accepted: 11/15/2010] [Indexed: 11/23/2022]
Abstract
Bacterial Type II secretion systems (T2SS) and type IV pili (T4P) biogenesis machineries share the ability to assemble thin filaments from pilin protein subunits in the plasma membrane. Here we describe in detail the calculation strategy that served to determine a detailed atomic model of the T2SS pilus from Klebsiella oxytoca (Campos et al., PNAS 2010). The strategy is based on molecular modeling with generalized distance restraints and experimental validation (salt bridge charge inversion; double cysteine substitution and crosslinking). It does not require directly fitting structures into an envelope obtained from electron microscopy, but relies on lower resolution information, in particular the symmetry parameters of the helix forming the pilus. We validate the strategy with T4P where either a higher resolution structure is available (for the gonococcal (GC) pilus from Neisseria gonorrhoeae), or where we can compare our results to additional experimental data (for Vibrio cholerae TCP). The models are of sufficient precision to compare the architecture of the different pili in detail.
18
Jee JG. Unambiguous Determination of Intermolecular Hydrogen Bond of NMR Structure by Molecular Dynamics Refinement Using All-Atom Force Field and Implicit Solvent Model. B KOREAN CHEM SOC 2010. [DOI: 10.5012/bkcs.2010.31.9.2717] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 11/20/2022]
19
Stratmann D, Guittet E, van Heijenoort C. Robust structure-based resonance assignment for functional protein studies by NMR. JOURNAL OF BIOMOLECULAR NMR 2010; 46:157-73. [PMID: 20024602 PMCID: PMC2813526 DOI: 10.1007/s10858-009-9390-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 10/26/2009] [Accepted: 11/04/2009] [Indexed: 05/20/2023]
Abstract
High-throughput functional protein NMR studies, such as studies of protein interactions or dynamics, require an automated approach for the assignment of the protein backbone. With the availability of a growing number of protein 3D structures, a new class of automated approaches, called structure-based assignment, has been developed quite recently. Structure-based approaches use primarily NMR input data that are not based on J-coupling and for which connections between residues are not limited by through-bond magnetization transfer efficiency. We present here a robust structure-based assignment approach using mainly H(N)-H(N) NOE networks, as well as ¹H-¹⁵N residual dipolar couplings and chemical shifts. The NOEnet complete search algorithm is robust against assignment errors, even for sparse input data. Instead of a unique and partly erroneous assignment solution, NOEnet gives an optimal assignment ensemble with an accuracy equal or close to 100%. We show that even low-precision assignment ensembles give enough information for functional studies, such as modeling of protein complexes. Finally, the combination of NOEnet with a low number of ambiguous J-coupling sequential connectivities yields a high-precision assignment ensemble. NOEnet will be available under: http://www.icsn.cnrs-gif.fr/download/nmr.
Affiliation(s)
- Dirk Stratmann
- NMR, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Eric Guittet
- Centre de Recherche de Gif, Laboratoire de Chimie et Biologie Structurales ICSN-CNRS, 1, av. de la terrasse, 91190 Gif-sur-Yvette, France
- Carine van Heijenoort
- Centre de Recherche de Gif, Laboratoire de Chimie et Biologie Structurales ICSN-CNRS, 1, av. de la terrasse, 91190 Gif-sur-Yvette, France
20
Snyder DA, Brüschweiler R. Generalized indirect covariance NMR formalism for establishment of multidimensional spin correlations. J Phys Chem A 2009; 113:12898-903. [PMID: 19810742 PMCID: PMC2783375 DOI: 10.1021/jp9070168] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Indexed: 11/29/2022]
Abstract
Multidimensional nuclear magnetic resonance (NMR) experiments measure spin-spin correlations, which provide important information about bond connectivities and molecular structure. However, direct observation of certain kinds of correlations can be very time-consuming due to limitations in sensitivity and resolution. Covariance NMR derives correlations between spins via the calculation of a (symmetric) covariance matrix, from which a matrix-square root produces a spectrum with enhanced resolution. Recently, the covariance concept has been adapted to the reconstruction of nonsymmetric spectra from pairs of 2D spectra that have a frequency dimension in common. Since the unsymmetric covariance NMR procedure lacks the matrix-square root step, it does not suppress relay effects and thereby may generate false positive signals due to chemical shift degeneracy. A generalized covariance formalism is presented here that embeds unsymmetric covariance processing within the context of the regular covariance transform. It permits the construction of unsymmetric covariance NMR spectra subjected to arbitrary matrix functions, such as the square root, with improved spectral properties. This formalism extends the domain of covariance NMR to include the reconstruction of nonsymmetric NMR spectra at resolutions or sensitivities that are superior to the ones achievable by direct measurements.
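The embedding described in this abstract can be illustrated with a minimal sketch. It assumes one plausible reading of the formalism: the two 2D spectra are stacked along their non-shared dimensions, a symmetric covariance matrix is formed over the shared dimension, a matrix power is applied, and the off-diagonal block is read out as the nonsymmetric spectrum. The function names, the array layout, and the exponent parameter `lam` are hypothetical and are not taken from the paper.

```python
import numpy as np

def psd_matrix_power(m, lam):
    """Raise a symmetric positive semidefinite matrix to the power lam
    using its eigendecomposition."""
    w, v = np.linalg.eigh(m)
    w = np.clip(w, 0.0, None)  # remove tiny negative eigenvalues caused by round-off
    return (v * w**lam) @ v.T

def generalized_indirect_covariance(a, b, lam=0.5):
    """a has shape (n_a, n_shared), b has shape (n_b, n_shared); the columns
    are the frequency dimension the two spectra have in common (assumed layout).
    Returns the (n_a, n_b) off-diagonal block of the powered covariance matrix."""
    stacked = np.vstack([a, b])       # (n_a + n_b, n_shared)
    cov = stacked @ stacked.T         # symmetric covariance over the shared dimension
    powered = psd_matrix_power(cov, lam)
    n_a = a.shape[0]
    return powered[:n_a, n_a:]

# Synthetic stand-ins for two spectra sharing a six-point frequency axis.
rng = np.random.default_rng(0)
spec_a = rng.random((4, 6))
spec_b = rng.random((3, 6))
unsym = generalized_indirect_covariance(spec_a, spec_b, lam=1.0)
rooted = generalized_indirect_covariance(spec_a, spec_b, lam=0.5)
```

In this sketch, `lam=1.0` reduces to the plain product `spec_a @ spec_b.T`, which corresponds to unsymmetric covariance processing without the square-root step, while `lam=0.5` applies the matrix square root to the embedded covariance matrix.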
Affiliation(s)
- David A. Snyder
- Department of Chemistry, William Paterson University, 300 Pompton Road, Wayne, NJ 07470
- Chemical Sciences Laboratory, Department of Chemistry and Biochemistry and National High Magnetic Field Laboratory, Florida State University, Tallahassee, FL 32306
- Rafael Brüschweiler
- Chemical Sciences Laboratory, Department of Chemistry and Biochemistry and National High Magnetic Field Laboratory, Florida State University, Tallahassee, FL 32306
21
Abstract
MOTIVATION: Picking peaks from experimental NMR spectra is a key unsolved problem for automated NMR protein structure determination. Such a process is a prerequisite for resonance assignment, nuclear Overhauser enhancement (NOE) distance restraint assignment, and structure calculation tasks. Manual or semi-automatic peak picking, which is currently the predominant approach in NMR labs, is tedious, time-consuming and costly. RESULTS: We introduce new ideas, including noise-level estimation, component forming and sub-division, singular value decomposition (SVD)-based peak picking, and peak pruning and refinement. PICKY is developed as an automated peak picking method. Unlike previous research on peak picking, we provide a systematic study of the proposed method. PICKY is tested on 32 real 2D and 3D spectra of eight target proteins, and achieves an average of 88% recall and 74% precision. PICKY is efficient: on average it takes 15.7 s to process an NMR spectrum. More important than these numbers, PICKY actually works in practice. We feed peak lists generated by PICKY to IPASS for resonance assignment, feed the IPASS assignments to SPARTA for fragment generation, and feed the SPARTA fragments to FALCON for structure calculation. This results in high-resolution structures of several proteins, for example, TM1112, at 1.25 Å. AVAILABILITY: PICKY is available upon request. The peak lists of PICKY can be easily loaded by SPARKY to enable a better interactive strategy for rapid peak picking.
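Recall and precision figures such as those quoted above depend on how picked peaks are matched to a reference list. The following is a minimal sketch of one way to compute them, assuming a greedy one-to-one match within a per-dimension chemical shift tolerance. The matching rule, the function names, and the example peak positions and tolerances are illustrative assumptions, not the evaluation protocol used for PICKY.

```python
def count_matches(picked, reference, tol):
    """Greedily pair each picked peak with the first unused reference peak
    that lies within tol in every dimension; each reference peak is used once."""
    unused = list(range(len(reference)))
    matches = 0
    for peak in picked:
        for idx in unused:
            ref = reference[idx]
            if all(abs(x - y) <= t for x, y, t in zip(peak, ref, tol)):
                unused.remove(idx)  # safe: we leave the inner loop immediately
                matches += 1
                break
    return matches

def precision_recall(picked, reference, tol):
    """Precision = matched / picked, recall = matched / reference."""
    tp = count_matches(picked, reference, tol)
    precision = tp / len(picked) if picked else 0.0
    recall = tp / len(reference) if reference else 0.0
    return precision, recall

# Hypothetical 2D peak positions in ppm (1H, 15N) and tolerances per dimension.
reference = [(8.20, 120.1), (7.95, 118.4), (8.60, 125.0)]
picked = [(8.21, 120.0), (7.50, 110.0), (8.59, 125.1), (9.90, 130.0)]
precision, recall = precision_recall(picked, reference, tol=(0.03, 0.3))
```

Here two of the four picked peaks match reference peaks, so the precision is 2/4 and the recall is 2/3. A greedy match is the simplest choice; an optimal one-to-one assignment could give slightly different counts in crowded regions.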
Affiliation(s)
- Babak Alipanahi
- David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, ON, Canada
22
Williamson MP, Craven CJ. Automated protein structure calculation from NMR data. JOURNAL OF BIOMOLECULAR NMR 2009; 43:131-143. [PMID: 19137264 DOI: 10.1007/s10858-008-9295-6] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.5] [Received: 10/19/2008] [Accepted: 12/10/2008] [Indexed: 05/27/2023]
Abstract
Current software is almost at the stage to permit completely automatic structure determination of small proteins of <15 kDa, from NMR spectra to structure validation with minimal user interaction. This goal is welcome, as it makes structure calculation more objective and therefore more easily validated, without any loss in the quality of the structures generated. Moreover, it releases expert spectroscopists to carry out research that cannot be automated. It should not take much further effort to extend automation to ca 20 kDa. However, there are technological barriers to further automation, of which the biggest are identified as: routines for peak picking; adoption and sharing of a common framework for structure calculation, including the assembly of an automated and trusted package for structure validation; and sample preparation, particularly for larger proteins. These barriers should be the main target for development of methodology for protein structure determination, particularly by structural genomics consortia.
Affiliation(s)
- Mike P Williamson
- Department of Molecular Biology and Biotechnology, University of Sheffield, Firth Court, Western Bank, Sheffield, S10 2TN, UK.
23
Schmucki R, Yokoyama S, Güntert P. Automated assignment of NMR chemical shifts using peak-particle dynamics simulation with the DYNASSIGN algorithm. JOURNAL OF BIOMOLECULAR NMR 2009; 43:97-109. [PMID: 19034675 DOI: 10.1007/s10858-008-9291-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 09/29/2008] [Accepted: 11/06/2008] [Indexed: 05/27/2023]
Abstract
A new algorithm, DYNASSIGN, for the automated assignment of NMR chemical shift resonances was developed in which expected cross peaks in multidimensional NMR spectra are represented by peak-particles and assignment restraints are translated into a potential energy function. Molecular dynamics simulation techniques are used to calculate a trajectory of the system of peak-particles subjected to the potential function in order to find energetically optimal configurations that correspond to correct assignments. Peak-particle dynamics-based simulated annealing was combined with the Hungarian algorithm for local optimization, and a residue-based score was introduced to distinguish between reliable assignments and "unassigned" resonances for which no reliable assignment can be established. The DYNASSIGN algorithm was implemented in the program CYANA and tested with data sets obtained from the experimental NMR data of nine small proteins. With a set of 10 commonly used NMR spectra, on average 82.5% of all backbone and side-chain ¹H, ¹³C and ¹⁵N resonances could be assigned with an average error rate of 3.5%.
Affiliation(s)
- Roland Schmucki
- Institute of Biophysical Chemistry and Frankfurt Institute for Advanced Studies, Goethe University Frankfurt am Main, Max-von-Laue-Str. 9, 60438, Frankfurt am Main, Germany
24
Abstract
The function of bio-macromolecules is determined by both their 3D structure and conformational dynamics. These molecules are inherently flexible systems displaying a broad range of dynamics on time-scales from picoseconds to seconds. Nuclear Magnetic Resonance (NMR) spectroscopy has emerged as the method of choice for studying both protein structure and dynamics in solution. Typically, NMR experiments are sensitive both to structural features and to dynamics, and hence the measured data contain information on both. Despite major progress in both experimental approaches and computational methods, obtaining a consistent view of structure and dynamics from experimental NMR data remains a challenge. Molecular dynamics simulations have emerged as an indispensable tool in the analysis of NMR data.
Affiliation(s)
- Phineus R. L. Markwick
- Institut Pasteur, Département de Biologie Structurale et Chimie, Unité de Bio-Informatique Structurale, CNRS URA 2185, Paris, France
- Thérèse Malliavin
- Institut Pasteur, Département de Biologie Structurale et Chimie, Unité de Bio-Informatique Structurale, CNRS URA 2185, Paris, France
- Michael Nilges
- Institut Pasteur, Département de Biologie Structurale et Chimie, Unité de Bio-Informatique Structurale, CNRS URA 2185, Paris, France
25
Automated structure determination from NMR spectra. EUROPEAN BIOPHYSICS JOURNAL: EBJ 2008; 38:129-43. [PMID: 18807026 DOI: 10.1007/s00249-008-0367-z] [Citation(s) in RCA: 178] [Impact Index Per Article: 11.1] [Received: 07/06/2008] [Accepted: 08/28/2008] [Indexed: 10/21/2022]
Abstract
Automated methods for protein structure determination by NMR have increasingly gained acceptance and are now widely used for the automated assignment of distance restraints and the calculation of three-dimensional structures. This review gives an overview of the techniques for automated protein structure analysis by NMR, including both NOE-based approaches and methods relying on other experimental data such as residual dipolar couplings and chemical shifts, and presents the FLYA algorithm for the fully automated NMR structure determination of proteins that is suitable to substitute all manual spectra analysis and thus overcomes a major efficiency limitation of the NMR method for protein structure determination.
26
Fiorito F, Herrmann T, Damberger FF, Wüthrich K. Automated amino acid side-chain NMR assignment of proteins using ¹³C- and ¹⁵N-resolved 3D [¹H,¹H]-NOESY. JOURNAL OF BIOMOLECULAR NMR 2008; 42:23-33. [PMID: 18709333 DOI: 10.1007/s10858-008-9259-x] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Received: 07/15/2008] [Accepted: 07/15/2008] [Indexed: 05/26/2023]
Abstract
ASCAN is a new algorithm for automatic sequence-specific NMR assignment of amino acid side-chains in proteins, which uses as input the primary structure of the protein, chemical shift lists of ¹H(N), ¹⁵N, ¹³Cα, ¹³Cβ and possibly ¹Hα from the previous polypeptide backbone assignment, and one or several 3D ¹³C- or ¹⁵N-resolved [¹H,¹H]-NOESY spectra. ASCAN has also been laid out for the use of TOCSY-type data sets as supplementary input. The program assigns new resonances based on comparison of the NMR signals expected from the chemical structure with the experimentally observed NOESY peak patterns. The core parts of the algorithm are a procedure for generating expected peak positions, which is based on variable combinations of assigned and unassigned resonances that arise for the different amino acid types during the assignment procedure, and a corresponding set of acceptance criteria for assignments based on the NMR experiments used. Expected patterns of NOESY cross peaks involving unassigned resonances are generated using the list of previously assigned resonances, and tentative chemical shift values for the unassigned signals taken from the BMRB statistics for globular proteins. Use of this approach with the 101-amino acid residue protein FimD(25-125) resulted in 84% of the hydrogen atoms and their covalently bound heavy atoms being assigned with a correctness rate of 90%. Use of these side-chain assignments as input for automated NOE assignment and structure calculation with the ATNOS/CANDID/DYANA program suite yielded structure bundles of comparable quality, in terms of precision and accuracy of the atomic coordinates, as those of a reference structure determined with interactive assignment procedures. A rationale for the high quality of the ASCAN-based structure determination results from an analysis of the distribution of the assigned side chains, which revealed near-complete assignments in the core of the protein, with most of the incompletely assigned residues located at or near the protein surface.
Affiliation(s)
- Francesco Fiorito
- Institut für Molekularbiologie und Biophysik, ETH Zürich, CH-8093, Zurich, Switzerland
27
Abstract
Nuclear magnetic resonance (NMR) spectroscopy is a powerful tool to study the three-dimensional structure of proteins and nucleic acids at atomic resolution. Since the NMR data can be recorded in solution, conditions such as pH, salt concentration, and temperature can be adjusted so as to closely mimic the biomacromolecule's natural milieu. In addition to structure determination, NMR applications can investigate time-dependent phenomena, such as dynamic features of the biomacromolecules, reaction kinetics, molecular recognition, or protein folding. The advent of higher magnetic field strengths, new technical developments, and the use of either uniform or selective isotopic labeling techniques currently allow NMR users to investigate the tertiary structure of biomacromolecules of approximately 50 kDa. This chapter will outline the basic protocol for structure determination of proteins by NMR spectroscopy. In general, there are four main stages: (i) preparation of a homogeneous protein sample, (ii) the recording of the NMR data sets, (iii) assignment of the spectra to each NMR observable atom in the protein, and (iv) generation of structures using computer software and the correctly assigned NMR data.
Affiliation(s)
- Andrew J Dingley
- Department of Chemistry and School of Biological Sciences, The University of Auckland, Science Centre, 23 Symonds Street, Auckland, New Zealand
28
Snyder DA, Zhang F, Brüschweiler R. Covariance NMR in higher dimensions: application to 4D NOESY spectroscopy of proteins. JOURNAL OF BIOMOLECULAR NMR 2007; 39:165-75. [PMID: 17876709 DOI: 10.1007/s10858-007-9187-1] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 06/01/2007] [Accepted: 08/06/2007] [Indexed: 05/17/2023]
Abstract
Elucidation of high-resolution protein structures by NMR spectroscopy requires a large number of distance constraints that are derived from nuclear Overhauser effects between protons (NOEs). Due to the high level of spectral overlap encountered in 2D NMR spectra of proteins, the measurement of high quality distance constraints requires higher dimensional NMR experiments. Although four-dimensional Fourier transform (FT) NMR experiments can provide the necessary kind of spectral information, the associated measurement times are often prohibitively long. Covariance NMR spectroscopy yields 2D spectra that exhibit along the indirect frequency dimension the same high resolution as along the direct dimension using minimal measurement time. The generalization of covariance NMR to 4D NMR spectroscopy presented here exploits the inherent symmetry of certain 4D NMR experiments and utilizes the trace metric between donor planes for the construction of a high-resolution spectral covariance matrix. The approach is demonstrated for a 4D ¹³C-edited NOESY experiment of ubiquitin. The 4D covariance spectrum narrows the line-widths of peaks strongly broadened in the FT spectrum due to the necessarily small number of increments collected, and it resolves otherwise overlapped cross peaks, allowing for an increase in the number of NOE assignments to be made from a given dataset. At the same time there is no significant decrease in the positive predictive value of observing a peak as compared to the corresponding 4D Fourier transform spectrum. These properties make the 4D covariance method a potentially valuable tool for the structure determination of larger proteins and for high-throughput applications in structural biology.
Affiliation(s)
- David A Snyder
- Department of Chemistry and Biochemistry, National High Magnetic Field Laboratory, Florida State University, Tallahassee, FL 32306, USA
29
Kobayashi N, Iwahara J, Koshiba S, Tomizawa T, Tochio N, Güntert P, Kigawa T, Yokoyama S. KUJIRA, a package of integrated modules for systematic and interactive analysis of NMR data directed to high-throughput NMR structure studies. JOURNAL OF BIOMOLECULAR NMR 2007; 39:31-52. [PMID: 17636449 DOI: 10.1007/s10858-007-9175-5] [Citation(s) in RCA: 138] [Impact Index Per Article: 8.1] [Received: 12/03/2006] [Revised: 06/15/2007] [Accepted: 06/15/2007] [Indexed: 05/16/2023]
Abstract
The recent expansion of structural genomics has increased the demands for quick and accurate protein structure determination by NMR spectroscopy. The conventional strategy without an automated protocol can no longer satisfy the needs of high-throughput application to a large number of proteins, with each data set including many NMR spectra, chemical shifts, NOE assignments, and calculated structures. We have developed the new software KUJIRA, a package of integrated modules for the systematic and interactive analysis of NMR data, which is designed to reduce the tediousness of organizing and manipulating a large number of NMR data sets. In combination with CYANA, the program for automated NOE assignment and structure determination, we have established a robust and highly optimized strategy for comprehensive protein structure analysis. An application of KUJIRA in accordance with our new strategy was carried out by a non-expert in NMR structure analysis, demonstrating that the accurate assignment of the chemical shifts and a high-quality structure of a small protein can be completed in a few weeks. The high completeness of the chemical shift assignment and the NOE assignment achieved by the systematic analysis using KUJIRA and CYANA led, in practice, to increased reliability of the determined structure.
Affiliation(s)
- Naohiro Kobayashi
- RIKEN Genomic Sciences Center, 1-7-22, Suehiro-cho, Tsurumi, Yokohama 230-0045, Japan
30
Abstract
Fully automated structure determination of proteins in solution (FLYA) yields, without human intervention, three-dimensional protein structures starting from a set of multidimensional NMR spectra. Integrating existing and new software, automated peak picking over all spectra is followed by peak list filtering, the generation of an ensemble of initial chemical shift assignments, the determination of consensus chemical shift assignments for all ¹H, ¹³C, and ¹⁵N nuclei, the assignment of NOESY cross-peaks, the generation of distance restraints, and the calculation of the three-dimensional structure by torsion angle dynamics. The resulting, preliminary structure serves as additional input to the second stage of the procedure, in which a new ensemble of chemical shift assignments and a refined structure are calculated. The three-dimensional structures of three 12-16 kDa proteins computed with the FLYA algorithm coincided closely with the conventionally determined structures. Deviations were below 0.95 Å for the backbone atom positions, excluding the flexible chain termini. 96-97% of all backbone and side-chain chemical shifts in the structured regions were assigned to the correct residues. The purely computational FLYA method is suitable for substituting all manual spectra analysis and thus overcomes a main efficiency limitation of the NMR method for protein structure determination.
Affiliation(s)
- Blanca López-Méndez
- Tatsuo Miyazawa Memorial Program, RIKEN Genomic Sciences Center, Tsurumi, Yokohama 230-0045, Japan
31
Jaravine V, Ibraghimov I, Orekhov VY. Removal of a time barrier for high-resolution multidimensional NMR spectroscopy. Nat Methods 2006; 3:605-7. [PMID: 16862134 DOI: 10.1038/nmeth900] [Citation(s) in RCA: 138] [Impact Index Per Article: 7.7] [Received: 02/24/2006] [Accepted: 06/20/2006] [Indexed: 11/08/2022]
Abstract
We introduce the recursive multidimensional decomposition (R-MDD) method to speed recording of high-resolution NMR spectra. The measurement time is logarithmically dependent on the sizes of indirect spectral dimensions. R-MDD has the sensitivity and resolution advantages of optimized nonuniform acquisition schemes and is applicable to all types of biomolecular spectra. We demonstrated it for triple resonance experiments on three globular proteins (ubiquitin, azurin and the barstar-barnase complex) of 8-22 kDa.
Affiliation(s)
- Victor Jaravine
- The Swedish NMR Centre at Gothenburg University, Box 465, 40530 Gothenburg, Sweden
32
Scott A, López-Méndez B, Güntert P. Fully automated structure determinations of the Fes SH2 domain using different sets of NMR spectra. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2006; 44 Spec No:S83-8. [PMID: 16826546 DOI: 10.1002/mrc.1813] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 05/10/2023]
Abstract
The recently introduced fully automated protein NMR structure determination algorithm (FLYA) yields, without any human intervention, a three-dimensional (3D) protein structure starting from a set of two- and three-dimensional NMR spectra. This paper investigates the influence of reduced sets of experimental spectra on the quality of NMR structures obtained with FLYA. In a case study using the Src homology domain 2 from the human feline sarcoma oncogene Fes (Fes SH2), five reduced data sets selected from the full set of 13 three-dimensional spectra of the previously determined conventional structure were used to calculate the protein structure. Three reduced data sets utilized only CBCA(CO)NH and CBCANH for the backbone assignments and two data sets used only CBCA(CO)NH. All, some, or none of the five original side-chain assignment spectra were used. Results were compared with those of a FLYA calculation for the complete set of spectra and those of the conventionally determined structure. In four of the five cases tested, the three-dimensional structures deviated by less than 1.3 Å in backbone RMSD from the conventionally determined Fes SH2 reference structure, showing that the FLYA algorithm is remarkably stable and accurate when used with reduced sets of input spectra.
Affiliation(s)
- Anna Scott
- Tatsuo Miyazawa Memorial Program, RIKEN Genomic Sciences Center, 1-7-22 Suehiro, Tsurumi, Yokohama 230-0045, Japan
33
Benod C, Delsuc MA, Pons JL. CRAACK: Consensus Program for NMR Amino Acid Type Assignment. J Chem Inf Model 2006; 46:1517-22. [PMID: 16711771 DOI: 10.1021/ci050092h] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
Abstract
Protein peak spectrum assignment is a prerequisite of the nuclear magnetic resonance study of a molecule. We present here a computer tool which proposes the determination of the amino acid type from the values of the chemical shifts. This tool is based on two consensus algorithms based on several published typing algorithms and was trained and extensively tested against the Biological Magnetic Resonance Bank chemical shift data bank. The first one accomplishes the analysis with support vector machine technology, grouping related amino acids together, and presents a mean rate of success above 90% on the test set. The second one uses a classical consensus algorithm of vote. Furthermore, secondary structural prediction is available. This tool can be used for assisting manual assignment of peptides and proteins and can also be used as a step in an automated approach to assignment. This program has been called CRAACK and is publicly available at the following URL: http://abcis.cbs.cnrs.fr/craack.
Affiliation(s)
- Cindy Benod
- Centre de Biochimie Structurale, CNRS UMR 5048, INSERM UMR 554, Université Montpellier 1, 29 rue de Navacelles, 34090 Montpellier, France
34
Carlisle EA, Holder JL, Maranda AM, de Alwis AR, Selkie EL, McKay SL. Effect of pH, urea, peptide length, and neighboring amino acids on alanine α-proton random coil chemical shifts. Biopolymers 2006; 85:72-80. [PMID: 17054116 DOI: 10.1002/bip.20614] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/11/2022]
Abstract
Accurate random coil alpha-proton chemical shift values are essential for precise protein structure analysis using chemical shift index (CSI) calculations. The current study determines the chemical shift effects of pH, urea, peptide length and neighboring amino acids on the alpha-proton of Ala using model peptides of the general sequence GnXaaAYaaGn, where Xaa and Yaa are Leu, Val, Phe, Tyr, His, Trp or Pro, and n = 1-3. Changes in pH (2-6), urea (0-1 M), and peptide length (n = 1-3) had no effect on Ala alpha-proton chemical shifts. Denaturing concentrations of urea (8 M) caused significant downfield shifts (0.10 ± 0.01 ppm) relative to an external DSS reference. Neighboring aliphatic residues (Leu, Val) had no effect, whereas aromatic amino acids (Phe, Tyr, His and Trp) and Pro caused significant shifts in the alanine alpha-proton, with the extent of the shifts dependent on the nature and position of the amino acid. Smaller aromatic residues (Phe, Tyr, His) caused larger shift effects when present in the C-terminal position (approximately 0.10 vs. 0.05 ppm N-terminal), and the larger aromatic tryptophan caused greater effects in the N-terminal position (0.15 ppm vs. 0.10 ppm C-terminal). Proline caused significant chemical shift changes in both directions: upfield (0.06 ppm, N-terminal) and downfield (0.25 ppm, C-terminal). These new Ala correction factors detail the magnitude and range of variation in environmental chemical shift effects, in addition to providing insight into the molecular level interactions that govern protein folding.
Affiliation(s)
- Elizabeth A Carlisle
- Department of Chemistry and Biochemistry, Ebaugh Laboratories, Denison University, Granville, OH 43023, USA
35
Affiliation(s)
- Xavier Barril
- Senior Scientist, Vernalis (R&D), Granta Park, Abington, Cambridge, UK