1. Monneau YR, Rossi P, Bhaumik A, Huang C, Jiang Y, Saleh T, Xie T, Xing Q, Kalodimos CG. Automatic methyl assignment in large proteins by the MAGIC algorithm. Journal of Biomolecular NMR 2017; 69:215-227. [PMID: 29098507 PMCID: PMC5764113 DOI: 10.1007/s10858-017-0149-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 07/14/2017] [Accepted: 10/23/2017] [Indexed: 05/03/2023]
Abstract
Selective methyl labeling is an extremely powerful approach to study the structure, dynamics and function of biomolecules by NMR. Despite spectacular progress in the field, such studies remain rather limited in number. One of the main obstacles remains the assignment of the methyl resonances, which is labor intensive and error prone. Typically, NOESY crosspeak patterns are manually correlated to the available crystal structure or an in silico template model of the protein. Here, we propose Methyl Assignment by Graphing Inference Construct (MAGIC), an exhaustive search algorithm with no peak network definition requirement. In order to overcome the combinatorial problem, the exhaustive search is performed locally, i.e. for a small number of methyls connected through-space according to experimental 3D methyl NOESY data. The local network approach drastically reduces the search space. Only the best local assignments are combined to provide the final output. Assignments that match the data with comparable scores are made available to the user for cross-validation by additional experiments such as methyl-amide NOEs. Several NMR datasets for proteins in the 25-50 kDa range were used during development and for performance evaluation against the manually assigned data. We show that the algorithm is robust, reliable and greatly speeds up the methyl assignment task.
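The local exhaustive search described in this abstract can be illustrated with a minimal Python sketch. This is not the published implementation: the function name `local_exhaustive_assignment`, the 6 Å cutoff, the toy distances, and the scoring rule (the number of NOE pairs whose assigned methyls lie within the cutoff) are all assumptions chosen for illustration.

```python
from itertools import permutations

def local_exhaustive_assignment(peaks, methyls, noe_pairs, distances, cutoff=6.0):
    """Try every injective peak -> methyl mapping for a small local network
    and keep all mappings that reach the top score (hypothetical scoring)."""
    best_score, best = -1, []
    for combo in permutations(methyls, len(peaks)):
        mapping = dict(zip(peaks, combo))
        # Score: NOE-connected peak pairs whose methyls are close in the structure.
        score = sum(
            1
            for a, b in noe_pairs
            if distances[frozenset((mapping[a], mapping[b]))] <= cutoff
        )
        if score > best_score:
            best_score, best = score, [mapping]
        elif score == best_score:
            best.append(mapping)
    return best_score, best

# Toy data (assumed): three methyls in a row, A-B and B-C close, A-C far apart.
distances = {
    frozenset(("A", "B")): 4.0,
    frozenset(("B", "C")): 5.0,
    frozenset(("A", "C")): 12.0,
}
peaks = ["p1", "p2", "p3"]
noe_pairs = [("p1", "p2"), ("p2", "p3")]

score, best = local_exhaustive_assignment(peaks, ["A", "B", "C"], noe_pairs, distances)
for mapping in best:
    print(score, mapping)
```

In this toy case two mappings tie for the top score (peak `p2` is pinned to methyl `B`, while `p1` and `p3` can be swapped), which mirrors the abstract's point that assignments with comparable scores are reported so they can be resolved by additional experiments.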
Affiliation(s)
- Yoan R Monneau
  - Université Grenoble Alpes, CEA, CNRS, IBS, 38000, Grenoble, France
- Paolo Rossi
  - Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Anusarka Bhaumik
  - Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Chengdong Huang
  - Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Yajun Jiang
  - Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Tamjeed Saleh
  - Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Tao Xie
  - Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Qiong Xing
  - Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA
- Charalampos G Kalodimos
  - Department of Structural Biology, St. Jude Children's Research Hospital, Memphis, TN, 38105, USA

2. Trautwein M, Fredriksson K, Möller HM, Exner TE. Automated assignment of NMR chemical shifts based on a known structure and 4D spectra. Journal of Biomolecular NMR 2016; 65:217-236. [PMID: 27484442 DOI: 10.1007/s10858-016-0050-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/08/2016] [Accepted: 07/28/2016] [Indexed: 06/06/2023]
Abstract
Apart from its central role in 3D structure determination of proteins, backbone chemical shift assignment is the basis for a number of applications, such as chemical shift perturbation mapping and studies on the dynamics of proteins. This assignment is not a trivial task even if a 3D protein structure is known, and it needs almost as much effort as the assignment for structure prediction if performed manually. We present here a new algorithm based solely on 4D [(1)H,(15)N]-HSQC-NOESY-[(1)H,(15)N]-HSQC spectra which is able to assign a large percentage of chemical shifts (73-82 %) unambiguously, demonstrated with proteins up to a size of 250 residues. For the remaining residues, a small number of possible assignments is filtered out. This is done by comparing distances in the 3D structure to restraints obtained from the peak volumes in the 4D spectrum. Using dead-end elimination, assignments are removed in which at least one of the restraints is violated. Including additional information from chemical shift predictions, a complete unambiguous assignment was obtained for Ubiquitin and 95 % of the residues were correctly assigned in the 251 residue-long N-terminal domain of enzyme I. The program including source code is available at https://github.com/thomasexner/4Dassign .
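The elimination step in this abstract, removing candidate assignments that violate a distance restraint, can be sketched as a simple pairwise consistency filter. This is a simplified stand-in written in the spirit of dead-end elimination, not the 4Dassign code: the function `prune_candidates`, the one-dimensional toy coordinates, and the 5 Å upper bound are assumptions for illustration only.

```python
def prune_candidates(candidates, restraints, dist):
    """Repeatedly drop a candidate residue for a peak when no candidate of a
    restrained partner peak lies within the restraint's upper distance bound."""
    cands = {peak: set(options) for peak, options in candidates.items()}
    changed = True
    while changed:
        changed = False
        for (p, q), upper in restraints.items():
            # Check the restraint in both directions.
            for a, b in ((p, q), (q, p)):
                for r in list(cands[a]):
                    supported = any(
                        s != r and dist(r, s) <= upper for s in cands[b]
                    )
                    if not supported:
                        cands[a].discard(r)
                        changed = True
    return cands

# Toy structure (assumed): residues placed on a line, coordinates in angstroms.
position = {1: 0.0, 2: 3.0, 3: 6.0, 4: 20.0}

def dist(r, s):
    return abs(position[r] - position[s])

candidates = {"peakA": {1, 4}, "peakB": {2}}
restraints = {("peakA", "peakB"): 5.0}  # NOE-derived upper bound (assumed)

pruned = prune_candidates(candidates, restraints, dist)
print(pruned)
```

Here residue 4 is eliminated as a candidate for `peakA` because it sits 17 Å from the only candidate of `peakB`, leaving a unique assignment for both peaks.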
Affiliation(s)
- Matthias Trautwein
  - Institute of Pharmacy, Eberhard Karls Universität Tübingen, Auf der Morgenstelle 8, 72076, Tübingen, Germany
- Kai Fredriksson
  - Institute of Pharmacy, Eberhard Karls Universität Tübingen, Auf der Morgenstelle 8, 72076, Tübingen, Germany
- Heiko M Möller
  - Institute of Chemistry, University of Potsdam, Karl-Liebknecht-Str. 24-25, 14476, Potsdam OT Golm, Germany
- Thomas E Exner
  - Institute of Pharmacy, Eberhard Karls Universität Tübingen, Auf der Morgenstelle 8, 72076, Tübingen, Germany

3. Xiao Y, Warner LR, Latham MP, Ahn NG, Pardi A. Structure-Based Assignment of Ile, Leu, and Val Methyl Groups in the Active and Inactive Forms of the Mitogen-Activated Protein Kinase Extracellular Signal-Regulated Kinase 2. Biochemistry 2015; 54:4307-19. [PMID: 26132046 DOI: 10.1021/acs.biochem.5b00506] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Indexed: 02/07/2023]
Abstract
Resonance assignments are the first step in most NMR studies of protein structure, function, and dynamics. Standard protein assignment methods employ through-bond backbone experiments on uniformly (13)C/(15)N-labeled proteins. For larger proteins, this through-bond assignment procedure often breaks down due to rapid relaxation and spectral overlap. The challenges involved in studies of larger proteins led to efficient methods for (13)C labeling of side chain methyl groups, which have favorable relaxation properties and high signal-to-noise. These methyls are often still assigned by linking them to the previously assigned backbone, thus limiting the applications for larger proteins. Here, a structure-based procedure is described for assignment of (13)C(1)H3-labeled methyls by comparing distance information obtained from three-dimensional methyl-methyl nuclear Overhauser effect (NOE) spectroscopy with the X-ray structure. The Ile, Leu, or Val (ILV) methyl type is determined by through-bond experiments, and the methyl-methyl NOE data are analyzed in combination with the known structure. A hierarchical approach was employed that maps the largest observed "NOE-methyl cluster" onto the structure. The combination of identification of ILV methyl type with mapping of the NOE-methyl clusters greatly simplifies the assignment process. This method was applied to the inactive and active forms of the 42-kDa ILV (13)C(1)H3-methyl labeled extracellular signal-regulated kinase 2 (ERK2), leading to assignment of 60% of the methyls, including 90% of Ile residues. A series of ILV to Ala mutants were analyzed, which helped confirm the assignments. These assignments were used to probe the local and long-range effects of ligand binding to inactive and active ERK2.
Affiliation(s)
- Yao Xiao
  - †Department of Chemistry and Biochemistry and ‡BioFrontiers Institute, University of Colorado Boulder, Boulder, Colorado 80309, United States
- Lisa R Warner
  - †Department of Chemistry and Biochemistry and ‡BioFrontiers Institute, University of Colorado Boulder, Boulder, Colorado 80309, United States
- Michael P Latham
  - †Department of Chemistry and Biochemistry and ‡BioFrontiers Institute, University of Colorado Boulder, Boulder, Colorado 80309, United States
- Natalie G Ahn
  - †Department of Chemistry and Biochemistry and ‡BioFrontiers Institute, University of Colorado Boulder, Boulder, Colorado 80309, United States
- Arthur Pardi
  - †Department of Chemistry and Biochemistry and ‡BioFrontiers Institute, University of Colorado Boulder, Boulder, Colorado 80309, United States

4. Cavuşlar G, Çatay B, Apaydın MS. A tabu search approach for the NMR protein structure-based assignment problem. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2012; 9:1621-1628. [PMID: 23221084 DOI: 10.1109/tcbb.2012.122] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/01/2023]
Abstract
Nuclear magnetic resonance (NMR) spectroscopy is an experimental technique which exploits the magnetic properties of specific nuclei and enables the study of proteins in solution. The key bottleneck of NMR studies is to map the NMR peaks to corresponding nuclei, also known as the assignment problem. Structure-Based Assignment (SBA) is an approach to solve this computationally challenging problem by using prior information about the protein obtained from a homologous structure. NVR-BIP used the Nuclear Vector Replacement (NVR) framework to model SBA as a binary integer programming problem. In this paper, we prove that this problem is NP-hard and propose a tabu search (TS) algorithm (NVR-TS) equipped with a guided perturbation mechanism to efficiently solve it. NVR-TS uses a quadratic penalty relaxation of NVR-BIP where the violations in the Nuclear Overhauser Effect constraints are penalized in the objective function. Experimental results indicate that our algorithm finds the optimal solution on NVR-BIP's data set, which consists of seven proteins with 25 templates (31 to 126 residues). Furthermore, it achieves relatively high assignment accuracies on two additional large proteins, MBP and EIN (348 and 243 residues, respectively), which NVR-BIP failed to solve. The executable and the input files are available for download at http://people.sabanciuniv.edu/catay/NVR-TS/NVR-TS.html.
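The general idea of a tabu search over peak-to-residue assignments with a quadratic penalty on NOE violations can be sketched as follows. This is a generic swap-neighborhood tabu search, not NVR-TS: it omits the guided perturbation mechanism mentioned in the abstract, and the function names, the tabu tenure, the 5 Å cutoff, and the toy structure are all assumptions for illustration.

```python
from itertools import combinations

def penalty(perm, noe_pairs, dist, cutoff):
    """Quadratic penalty: sum of squared excess distance over violated NOE pairs."""
    total = 0.0
    for a, b in noe_pairs:
        excess = dist[perm[a]][perm[b]] - cutoff
        if excess > 0:
            total += excess ** 2
    return total

def tabu_search(n, noe_pairs, dist, cutoff, iters=50, tenure=3):
    """perm[i] is the residue assigned to peak i; moves swap two assignments."""
    current = list(range(n))
    best, best_cost = current[:], penalty(current, noe_pairs, dist, cutoff)
    tabu = {}  # swap (i, j) -> last iteration at which it is still forbidden
    for it in range(iters):
        if best_cost == 0:
            break
        move, move_cost = None, float("inf")
        for i, j in combinations(range(n), 2):
            current[i], current[j] = current[j], current[i]
            cost = penalty(current, noe_pairs, dist, cutoff)
            current[i], current[j] = current[j], current[i]  # undo trial swap
            is_tabu = tabu.get((i, j), -1) >= it
            # Aspiration: a tabu move is allowed if it beats the best known cost.
            if (not is_tabu or cost < best_cost) and cost < move_cost:
                move, move_cost = (i, j), cost
        if move is None:
            break
        i, j = move
        current[i], current[j] = current[j], current[i]
        tabu[move] = it + tenure
        if move_cost < best_cost:
            best, best_cost = current[:], move_cost
    return best, best_cost

# Toy structure (assumed): four residues on a line, 4 angstroms apart.
pos = [0.0, 4.0, 8.0, 12.0]
dist = [[abs(a - b) for b in pos] for a in pos]
noe_pairs = [(0, 2), (2, 1), (1, 3)]  # peak pairs showing an NOE (assumed)
cutoff = 5.0

best, best_cost = tabu_search(len(pos), noe_pairs, dist, cutoff)
print(best, best_cost)
```

Because violations are penalized rather than forbidden, the search can pass through infeasible assignments on its way to one that satisfies all NOE pairs, which is the practical appeal of a penalty relaxation over a hard-constrained formulation.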
Affiliation(s)
- Gizem Cavuşlar
  - University of Wisconsin-Madison, 1513 University Avenue, Madison, WI 53706, USA

5. Jang R, Gao X, Li M. Towards fully automated structure-based NMR resonance assignment of ¹⁵N-labeled proteins from automatically picked peaks. J Comput Biol 2011; 18:347-63. [PMID: 21385039 DOI: 10.1089/cmb.2010.0251] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 11/12/2022]
Abstract
In NMR resonance assignment, an indispensable step in NMR protein studies, manually processed peaks from both ¹⁵N-labeled and ¹³C-labeled spectra are typically used as inputs. However, the use of homologous structures can allow one to use only ¹⁵N-labeled NMR data and avoid the added expense of using ¹³C-labeled data. We propose a novel integer programming framework for structure-based backbone resonance assignment using ¹⁵N-labeled data. The core consists of a pair of integer programming models: one for spin system forming and amino acid typing, and the other for backbone resonance assignment. The goal is to perform the assignment directly from spectra without any manual intervention via automatically picked peaks, which are much noisier than manually picked peaks, so methods must be error-tolerant. In the case of semi-automated/manually processed peak data, we compare our system with the Xiong-Pandurangan-Bailey-Kellogg contact replacement (CR) method, which is the most error-tolerant method for structure-based resonance assignment. Our system, on average, reduces the error rate of the CR method fivefold on their data set. In addition, by using an iterative algorithm, our system has the added capability of using the NOESY data to correct assignment errors due to errors in predicting the amino acid and secondary structure type of each spin system. On a publicly available data set for human ubiquitin, where the typing accuracy is 83%, we achieve 91% accuracy, compared to the 59% accuracy obtained without correcting for such errors. In the case of automatically picked peaks, using assignment information from yeast ubiquitin, we achieve a fully automatic assignment with 97% accuracy. To our knowledge, this is the first system that can achieve fully automatic structure-based assignment directly from spectra. This has implications in NMR protein mutant studies, where the assignment step is repeated for each mutant.
Affiliation(s)
- Richard Jang
  - David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario, Canada

6. Alipanahi B, Gao X, Karakoc E, Li SC, Balbach F, Feng G, Donaldson L, Li M. Error tolerant NMR backbone resonance assignment and automated structure generation. J Bioinform Comput Biol 2011; 9:15-41. [PMID: 21328705 DOI: 10.1142/s0219720011005276] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 04/28/2010] [Revised: 09/04/2010] [Accepted: 10/12/2010] [Indexed: 11/18/2022]
Abstract
Error tolerant backbone resonance assignment is the cornerstone of the NMR structure determination process. Although a variety of assignment approaches have been developed, none works sufficiently well on noisy fully automatically picked peaks to enable the subsequent automatic structure determination steps. We have designed an integer linear programming (ILP) based assignment system (IPASS) that has enabled fully automatic protein structure determination for four test proteins. IPASS employs probabilistic spin system typing based on chemical shifts and secondary structure predictions. Furthermore, IPASS extracts connectivity information from the inter-residue information and the (automatically picked) (15)N-edited NOESY peaks which are then used to fix reliable fragments. When applied to automatically picked peaks for real proteins, IPASS achieves an average precision and recall of 82% and 63%, respectively. In contrast, the next best method, MARS, achieves an average precision and recall of 77% and 36%, respectively. The assignments generated by IPASS are then fed into our protein structure calculation system, FALCON-NMR, to determine the 3D structures without human intervention. The final models have backbone RMSDs of 1.25Å, 0.88Å, 1.49Å, and 0.67Å to the reference native structures for proteins TM1112, CASKIN, VRAR, and HACS1, respectively. The web server is publicly available at http://monod.uwaterloo.ca/nmr/ipass.
Affiliation(s)
- Babak Alipanahi
  - David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario N2L3G1, Canada

7. Ikeya T, Jee JG, Shigemitsu Y, Hamatsu J, Mishima M, Ito Y, Kainosho M, Güntert P. Exclusively NOESY-based automated NMR assignment and structure determination of proteins. Journal of Biomolecular NMR 2011; 50:137-146. [PMID: 21448734 DOI: 10.1007/s10858-011-9502-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 01/18/2011] [Accepted: 03/11/2011] [Indexed: 05/30/2023]
Abstract
A fully automated method is presented for determining NMR solution structures of proteins using exclusively NOESY spectra as input, obviating the need to measure any spectra only for obtaining resonance assignments but devoid of structural information. Applied to two small proteins, the approach yielded structures that coincided closely with conventionally determined structures.
Affiliation(s)
- Teppei Ikeya
  - Institute of Biophysical Chemistry, Center for Biomolecular Magnetic Resonance, and Frankfurt Institute for Advanced Studies, Goethe University Frankfurt am Main, Germany

8. Wang X, Tash B, Flanagan JM, Tian F. RDC derived protein backbone resonance assignment using fragment assembly. Journal of Biomolecular NMR 2011; 49:85-98. [PMID: 21191805 PMCID: PMC6936109 DOI: 10.1007/s10858-010-9467-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 11/23/2010] [Accepted: 12/15/2010] [Indexed: 05/23/2023]
Abstract
Experimental residual dipolar couplings (RDCs) in combination with structural models have the potential for accelerating the protein backbone resonance assignment process because RDCs can be measured accurately and interpreted quantitatively. However, this application has been limited due to the need for very high-resolution structural templates. Here, we introduce a new approach to resonance assignment based on optimal agreement between the experimental and calculated RDCs from a structural template that contains all assignable residues. To overcome the inherent computational complexity of such a global search, we have adopted an efficient two-stage search algorithm and included connectivity data from conventional assignment experiments. In the first stage, a list of strings of resonances (CA-links) is generated via exhaustive searches for short segments of sequentially connected residues in a protein (local templates), and then ranked by the agreement of the experimental (13)C(α) chemical shifts and (15)N-(1)H RDCs to the predicted values for each local template. In the second stage, the top CA-links for different local templates in stage I are combinatorially connected to produce CA-links for all assignable residues. The resulting CA-links are ranked for resonance assignment according to their measured RDCs and predicted values from a tertiary structure. Since the final RDC ranking of CA-links includes all assignable residues and the assignment is derived from a "global minimum", our approach is far less reliant on the quality of experimental data and structural templates. The present approach is validated with the assignments of several proteins, including a 42 kDa maltose binding protein (MBP) using RDCs and structural templates of varying quality. 
Since backbone resonance assignment is an essential first step for most of biomolecular NMR applications and is often a bottleneck for large systems, we expect that this new approach will improve the efficiency of the assignment process for small and medium size proteins and will extend the size limits assignable by current methods for proteins with structural models.
Affiliation(s)
- Xingsheng Wang
  - Department of Biochemistry and Molecular Biology, College of Medicine, Pennsylvania State University, Hershey, PA 17033, USA

9. Wren JD, Kupfer DM, Perkins EJ, Bridges S, Berleant D. Proceedings of the 2010 MidSouth Computational Biology and Bioinformatics Society (MCBIOS) Conference. BMC Bioinformatics 2010; 11 Suppl 6:S1. [PMID: 20946592 PMCID: PMC3026356 DOI: 10.1186/1471-2105-11-s6-s1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022]

10. Stratmann D, Guittet E, van Heijenoort C. Robust structure-based resonance assignment for functional protein studies by NMR. Journal of Biomolecular NMR 2010; 46:157-73. [PMID: 20024602 PMCID: PMC2813526 DOI: 10.1007/s10858-009-9390-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 10/26/2009] [Accepted: 11/04/2009] [Indexed: 05/20/2023]
Abstract
High-throughput functional protein NMR studies, like protein interactions or dynamics, require an automated approach for the assignment of the protein backbone. With the availability of a growing number of protein 3D structures, a new class of automated approaches, called structure-based assignment, has been developed quite recently. Structure-based approaches use primarily NMR input data that are not based on J-coupling and for which connections between residues are not limited by through-bond magnetization transfer efficiency. We present here a robust structure-based assignment approach using mainly H(N)-H(N) NOE networks, as well as (1)H-(15)N residual dipolar couplings and chemical shifts. The NOEnet complete search algorithm is robust against assignment errors, even for sparse input data. Instead of a unique and partly erroneous assignment solution, an optimal assignment ensemble with an accuracy equal or close to 100% is given by NOEnet. We show that even low precision assignment ensembles give enough information for functional studies, like modeling of protein complexes. Finally, the combination of NOEnet with a low number of ambiguous J-coupling sequential connectivities yields a high precision assignment ensemble. NOEnet will be available under: http://www.icsn.cnrs-gif.fr/download/nmr.
Affiliation(s)
- Dirk Stratmann
  - NMR, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Eric Guittet
  - Centre de Recherche de Gif, Laboratoire de Chimie et Biologie Structurales ICSN-CNRS, 1, av. de la terrasse, 91190 Gif-sur-Yvette, France
- Carine van Heijenoort
  - Centre de Recherche de Gif, Laboratoire de Chimie et Biologie Structurales ICSN-CNRS, 1, av. de la terrasse, 91190 Gif-sur-Yvette, France