1
|
Choi Y. Computational Identification and Design of Complementary β-Strand Sequences. Methods Mol Biol 2022; 2405:83-94. [PMID: 35298809 DOI: 10.1007/978-1-0716-1855-4_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
The β-sheet is a regular secondary structure element which consists of linear segments called β-strands. They are involved in many important biological processes, and some are known to be related to serious diseases such as neurologic disorders and amyloidosis. The self-assembly of β-sheet peptides also has practical applications in material sciences since they can be building blocks of repeated nanostructures. Therefore, computational algorithms for identification of β-sheet formation can offer useful insight into the mechanism of disease-prone protein segments and the construction of biocompatible nanomaterials. Despite the recent advances in structure-based methods for the assessment of atomic interactions, identifying amyloidogenic peptides has proven to be extremely difficult since they are structurally very flexible. Thus, an alternative strategy is required to describe β-sheet formation. It has been hypothesized and observed that there are certain amino acid propensities between β-strand pairs. Based on this hypothesis, a database search algorithm, B-SIDER, is developed for the identification and design of β-sheet forming sequences. Given a target sequence, the algorithm identifies exact or partial matches from the structure database and constructs a position-specific score matrix. The score matrix can be utilized to design novel sequences that can form a β-sheet specifically with the target.
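To illustrate the position-specific score matrix idea mentioned in this abstract, the following is a minimal sketch that builds a log-odds matrix from a toy list of complementary strand fragments and scores candidate sequences against it. The fragment list, the uniform background frequency, the pseudocount, and the function names `build_pssm` and `score_sequence` are all assumptions made for illustration; this is not the B-SIDER implementation or its score function.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(complement_fragments, pseudocount=1.0):
    """Build a log-odds position-specific score matrix.

    complement_fragments: equal-length strings, each standing for a strand
    observed paired with a match to the target (hypothetical toy input).
    """
    length = len(complement_fragments[0])
    if any(len(frag) != length for frag in complement_fragments):
        raise ValueError("all fragments must have the same length")
    background = 1.0 / len(AMINO_ACIDS)  # assumed uniform background
    total = len(complement_fragments) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(frag[pos] for frag in complement_fragments)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts.get(aa, 0) + pseudocount) / total
            column[aa] = math.log(freq / background)
        pssm.append(column)
    return pssm


def score_sequence(pssm, candidate):
    """Sum the per-position log-odds scores of a candidate sequence."""
    if len(candidate) != len(pssm):
        raise ValueError("candidate length must match the matrix length")
    return sum(column[aa] for column, aa in zip(pssm, candidate))
```

A candidate whose residues were frequently seen at each position of the toy fragments receives a higher score than one built from unseen residues, which is the property a design procedure would rank on.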
Affiliation(s)
- Yoonjoo Choi
- Combinatorial Tumor Immunotherapy MRC, Chonnam National University Medical School, Hwasun-gun, Jeollanam-do, Republic of Korea.
2
|
Dehghani T, Naghibzadeh M, Sadri J. Enhancement of Protein β-Sheet Topology Prediction Using Maximum Weight Disjoint Path Cover. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2019; 16:1936-1947. [PMID: 29994539 DOI: 10.1109/tcbb.2018.2837753] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 06/08/2023]
Abstract
Predicting β-sheet topology (β-topology) is one of the most critical intermediate steps towards protein structure and function prediction. The β-topology prediction problem is defined as the determination of the optimal arrangement of β-strand interactions within protein β-sheets. Significant efforts have been made to predict β-topologies. However, due to the inaccurate determination of interactions among β-strands and the huge topological space of proteins with a large number of β-strands, more efficient methods are required to improve both the accuracy and speed of β-topology prediction. In order to attain higher accuracy, the current paper introduces a bidirectional strand-strand interaction graph and considers all possible orientations (parallel and antiparallel) and orders of β-strand pairwise interactions. For the first time, the β-topology prediction is transformed into a maximum weight disjoint path cover solution by conserving all potential topologies. Moreover, to manage the computation time, a set of candidate β-sheets is generated and an optimization process is applied to select a subset of maximum score disjoint β-sheets as a predicted β-topology. The proposed method is comprehensively compared with state-of-the-art methods. The experimental results on the BetaSheet916 and BetaSheet1452 datasets reveal that the current study's approach enhances performance measurements as well as reduces the runtime.
3
|
Yu TG, Kim HS, Choi Y. B-SIDER: Computational Algorithm for the Design of Complementary β-Sheet Sequences. J Chem Inf Model 2019; 59:4504-4511. [PMID: 31512871 DOI: 10.1021/acs.jcim.9b00548] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022]
Abstract
The β-sheet is an element of protein secondary structure, and intra-/intermolecular β-sheet interactions play pivotal roles in biological regulatory processes including scaffolding, transporting, and oligomerization. In nature, β-sheet formation is tightly regulated because dysregulated β-stacking often leads to severe diseases such as Alzheimer's, Parkinson's, systemic amyloidosis, or diabetes. Thus, the identification of intrinsic β-sheet-forming propensities can provide valuable insight into protein designs for the development of novel therapeutics. However, structure-based design methods may not be generally applicable to such amyloidogenic peptides mainly owing to high structural plasticity and complexity. Therefore, an alternative design strategy based on complementary sequence information is of significant importance. Herein, we developed a database search method called β-Stacking Interaction DEsign for Reciprocity (B-SIDER) for the design of complementary β-strands. This method makes use of the structural database information and generates target-specific score matrices. The discriminatory power of the B-SIDER score function was tested on representative amyloidogenic peptide substructures against a sequence-based score matrix (PASTA 2.0) and two popular ab initio protein design score functions (Rosetta and FoldX). B-SIDER is able to distinguish wild-type amyloidogenic β-strands as favored interactions in a more consistent manner than other methods. B-SIDER was prospectively applied to the design of complementary β-strands for a splitGFP scaffold. Three variants were identified to have stronger interactions than the original sequence selected through directed evolution, emitting higher fluorescence intensities. Our results indicate that B-SIDER can be applied to the design of other β-strands, assisting in the development of therapeutics against disease-related amyloidogenic peptides.
Affiliation(s)
- Tae-Geun Yu
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- Hak-Sung Kim
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
- Yoonjoo Choi
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 34141, Republic of Korea
4
|
Sabzekar M, Naghibzadeh M, Eghdami M, Aydin Z. Protein β-sheet prediction using an efficient dynamic programming algorithm. Comput Biol Chem 2017; 70:142-155. [PMID: 28881217 DOI: 10.1016/j.compbiolchem.2017.08.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 03/05/2017] [Revised: 07/25/2017] [Accepted: 08/18/2017] [Indexed: 11/28/2022]
Abstract
Predicting the β-sheet structure of a protein is one of the most important intermediate steps towards the identification of its tertiary structure. However, it is regarded as the primary bottleneck due to the presence of non-local interactions between several discontinuous regions in β-sheets. To achieve reliable long-range interactions, a promising approach is to enumerate and rank all β-sheet conformations for a given protein and find the one with the highest score. The problem with this solution is that the search space of the problem grows exponentially with respect to the number of β-strands. Additionally, brute-force calculation in this conformational space leads to a combinatorial explosion with intractable computational complexity. The main contribution of this paper is to generate and search the space of the problem efficiently to reduce the time complexity of the problem. To achieve this, two tree structures, called sheet-tree and grouping-tree, are proposed. They model the search space by breaking it into sub-problems. Then, an advanced dynamic programming algorithm is proposed that stores the intermediate results, avoids repetitive calculation by reusing them in successive steps, and reduces the space of the problem by removing intermediate results that will no longer be required in later steps. As a consequence, the following contributions have been made. Firstly, more accurate β-sheet structures are found by searching all possible conformations, and secondly, the time complexity of the problem is reduced by searching the space of the problem efficiently, which makes the proposed method applicable to predicting β-sheet structures with a high number of β-strands.
Experimental results on the BetaSheet916 dataset showed significant improvements of the proposed method in both execution time and prediction accuracy in comparison with the state-of-the-art β-sheet structure prediction methods. Moreover, we investigate the effect of different contact map predictors on the performance of the proposed method using the BetaSheet1452 dataset. The source code is available at http://www.conceptsgate.com/BetaTop.rar.
Affiliation(s)
- Mostafa Sabzekar
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Mahmoud Naghibzadeh
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Mahdie Eghdami
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Zafer Aydin
- Department of Computer Engineering, Abdullah Gul University, Kayseri, Turkey
5
|
Sabzekar M, Naghibzadeh M, Sadri J. Efficient dynamic programming algorithm with prior knowledge for protein β-strand alignment. J Theor Biol 2017; 417:43-50. [PMID: 28108305 DOI: 10.1016/j.jtbi.2017.01.018] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 05/17/2016] [Revised: 11/11/2016] [Accepted: 01/12/2017] [Indexed: 11/30/2022]
Abstract
One of the main tasks towards the prediction of protein β-sheet structure is to predict the native alignment of β-strands. The alignment of two β-strands defines similar regions that may reflect functional, structural, or evolutionary relationships between them. Therefore, any improvement in β-strand alignment not only reduces the computational search space but also improves β-sheet structure prediction accuracy. To define the alignment scores, previous studies utilized predicted residue-residue contacts (contact maps). However, there are two serious problems with using them. First, the precision of contact map prediction techniques, especially for long-range contacts (i.e., β-residues), is still not satisfactory. Second, the residue-residue contact predictors usually utilize general properties of amino acids and disregard the structural features of β-residues. In this paper, we consider β-structure information, which is estimated from protein β-sheet data sets, as alignment scores. The predicted contact maps, however, are used as prior knowledge about residues. They are used for strengthening or weakening the alignment scores in our algorithm. Thus, we can utilize both β-residue and β-structure information in the alignment of β-strands. The structure of the dynamic programming of the alignment algorithm is changed in order to work with our prior knowledge. Moreover, the Four Russians method is applied to the proposed alignment algorithm in order to reduce the time complexity of the problem. For evaluating the proposed method, we applied it to the state-of-the-art β-sheet structure prediction methods. The experimental results on the BetaSheet916 data set showed significant improvements in the execution time, the accuracy of β-strand alignment, and consequently β-sheet structure prediction accuracy. The results are available at http://conceptsgate.com/BetaSheet.
Affiliation(s)
- Mostafa Sabzekar
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Mahmoud Naghibzadeh
- Department of Computer Engineering, Ferdowsi University of Mashhad, Mashhad, Iran
- Javad Sadri
- Department of Computer Science & Software Engineering, Concordia University, Canada
6
|
Deng L, Dong X, Wu A, Song T, Jiang T. Coevolution signals capture the specific packing of secondary structures in protein architecture. Protein Cell 2015; 5:480-3. [PMID: 24699983 PMCID: PMC4026424 DOI: 10.1007/s13238-014-0051-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/29/2022] Open
Affiliation(s)
- Lizong Deng
- Key Laboratory of Protein and Peptide Pharmaceuticals, National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, 100101, China
7
|
Daniels NM, Gallant A, Ramsey N, Cowen LJ. MRFy: Remote Homology Detection for Beta-Structural Proteins Using Markov Random Fields and Stochastic Search. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2015; 12:4-16. [PMID: 26357074 DOI: 10.1109/tcbb.2014.2344682] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/05/2023]
Abstract
We introduce MRFy, a tool for protein remote homology detection that captures beta-strand dependencies in the Markov random field. Over a set of 11 SCOP beta-structural superfamilies, MRFy shows a 14 percent improvement in mean Area Under the Curve for the motif recognition problem as compared to HMMER, a 25 percent improvement as compared to RAPTOR, a 14 percent improvement as compared to HHPred, and an 18 percent improvement as compared to CNFPred and RaptorX. MRFy was implemented in the Haskell functional programming language, and parallelizes well on multi-core systems. MRFy is available, as source code as well as an executable, from http://mrfy.cs.tufts.edu/.
8
|
Joo H, Tsai J. An amino acid code for β-sheet packing structure. Proteins 2014; 82:2128-40. [PMID: 24668690 DOI: 10.1002/prot.24569] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 09/13/2013] [Revised: 03/17/2014] [Accepted: 03/19/2014] [Indexed: 11/09/2022]
Abstract
To understand the relationship between protein sequence and structure, this work extends the knob-socket model in an investigation of β-sheet packing. Over a comprehensive set of β-sheet folds, the contacts between residues were used to identify packing cliques: sets of residues that all contact each other. These packing cliques were then classified based on size and contact order. From this analysis, the two types of four-residue packing cliques necessary to describe β-sheet packing were characterized. Both occur between two adjacent hydrogen bonded β-strands. First, defining the secondary structure packing within β-sheets, the combined socket or XY:HG pocket consists of four residues i, i+2 on one strand and j, j+2 on the other. Second, characterizing the tertiary packing between β-sheets, the knob-socket XY:H+B consists of a three-residue XY:H socket (i, i+2 on one strand and j on the other) packed against a knob B residue (residue k distant in sequence). Depending on the packing depth of the knob B residue, two types of knob-sockets are found: side-chain and main-chain sockets. The amino acid composition of the pockets and knob-sockets reveals the sequence specificity of β-sheet packing. For β-sheet formation, the XY:HG pocket clearly shows sequence specificity of amino acids. For tertiary packing, the XY:H+B side-chain and main-chain sockets exhibit distinct amino acid preferences at each position. These relationships define an amino acid code for β-sheet structure and provide an intuitive topological mapping of β-sheet packing.
Affiliation(s)
- Hyun Joo
- Department of Chemistry, University of the Pacific, Stockton, California, 95212
9
|
Savojardo C, Fariselli P, Martelli PL, Casadio R. BCov: a method for predicting β-sheet topology using sparse inverse covariance estimation and integer programming. Bioinformatics 2013; 29:3151-7. [PMID: 24064422 DOI: 10.1093/bioinformatics/btt555] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 02/06/2023]
Abstract
MOTIVATION: Prediction of protein residue contacts, even at the coarse-grain level, can help in finding solutions to the protein structure prediction problem. Unlike α-helices that are locally stabilized, β-sheets result from pairwise hydrogen bonding of two or more disjoint regions of the protein backbone. The problem of predicting contacts among β-strands in proteins has been addressed by several supervised computational approaches. Recently, prediction of residue contacts based on correlated mutations has been greatly improved and finally allows the prediction of 3D structures of the proteins. RESULTS: In this article, we describe BCov, which is the first unsupervised method to predict the β-sheet topology starting from the protein sequence and its secondary structure. BCov takes advantage of the sparse inverse covariance estimation to define β-strand partner scores. Then an optimization based on integer programming is carried out to predict the β-sheet connectivity. When tested on the prediction of β-strand pairing, BCov scores with average values of Matthews Correlation Coefficient (MCC) and F1 equal to 0.56 and 0.61, respectively, on a non-redundant dataset of 916 protein chains known with atomic resolution. Our approach compares well with the state-of-the-art methods trained so far for this specific task. AVAILABILITY AND IMPLEMENTATION: The method is freely available under General Public License at http://biocomp.unibo.it/savojard/bcov/bcov-1.0.tar.gz. The new dataset BetaSheet1452 can be downloaded at http://biocomp.unibo.it/savojard/bcov/BetaSheet1452.dat.
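The two evaluation measures quoted in this abstract, the Matthews Correlation Coefficient and F1, can be computed from the counts of correctly and incorrectly predicted strand pairs. The sketch below uses the standard definitions of both measures; the function name `mcc_and_f1` and the example counts are hypothetical and are not taken from the BCov paper.

```python
import math


def mcc_and_f1(tp, fp, tn, fn):
    """Return (MCC, F1) from confusion-matrix counts.

    tp/fp/tn/fn: true positives, false positives, true negatives and
    false negatives, e.g. counted over candidate strand pairs.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention MCC is reported as 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0
    return mcc, f1


# Hypothetical counts for one protein: 6 correct pairings, 2 spurious,
# 2 missed, and 10 correctly rejected candidate pairs.
mcc, f1 = mcc_and_f1(tp=6, fp=2, tn=10, fn=2)
```

Unlike F1, MCC also rewards correctly rejected pairs, which matters here because most candidate strand pairs in a protein are not partners.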
Affiliation(s)
- Castrense Savojardo
- Biocomputing Group, CIRI-Health Science and Technology/Department of Biology, University of Bologna, 40126 Bologna, Italy and Department of Computer Science and Engineering, Via Mura Anteo Zamboni 7, 40127 Bologna, Italy
10
|
Abstract
In the present article, we provide a brief overview of the main approaches to analysing the sequence-structure relationship of proteins and outline a novel method of structure prediction. The proposed method involves finding a set of rules that describes a correlation between the distribution of residues in a sequence and the essential structural characteristics of a protein structure. The residue distribution rules specify the 'favourable' residues that are required in certain positions of a polypeptide chain in order for it to assume a particular protein fold, and the 'unfavourable' residues incompatible with the given fold. Identification of amino acid distribution rules derives from examination of inter-residue contacts. We describe residue distribution rules for a large group of β-sandwich-like proteins characterized by a specific arrangement of strands in their two β-sheets. It was shown that this method has very high accuracy (approximately 85%). The advantage of the residue rule approach is that it makes possible prediction of protein folding even in polypeptide chains that have very low global sequence similarities, as low as 18%. Another potential benefit is that a better understanding of which residues play essential roles in a given protein fold may facilitate rational protein engineering design.
11
|
Guilloux A, Caudron B, Jestin JL. A method to predict edge strands in beta-sheets from protein sequences. Comput Struct Biotechnol J 2013; 7:e201305001. [PMID: 24688737 PMCID: PMC3962219 DOI: 10.5936/csbj.201305001] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 02/22/2013] [Revised: 05/27/2013] [Accepted: 05/30/2013] [Indexed: 12/15/2022] Open
Abstract
There is a need for rules allowing three-dimensional structure information to be derived from protein sequences. In this work, consideration of an elementary protein folding step allows protein sub-sequences which optimize folding to be derived for any given protein sequence. Classical mechanics applied to this system and the energy conservation law during the elementary folding step yields an equation whose solutions are taken over the field of rational numbers. This formalism is applied to beta-sheets containing two edge strands and at least two central strands. The number of protein sub-sequences optimized for folding per amino acid in beta-strands is shown in particular to predict edge strands from protein sequences. Topological information on beta-strands and loops connecting them is derived for protein sequences with a prediction accuracy of 75%. The statistical significance of the finding is given. Applications in protein structure prediction are envisioned such as for the quality assessment of protein structure models.
Affiliation(s)
- Antonin Guilloux
- Analyse algébrique, Institut de Mathématiques de Jussieu, Université Pierre et Marie Curie, Paris VI, France
- Bernard Caudron
- Centre d'Informatique pour la Biologie, Institut Pasteur, Paris, France
12
|
Statistical Analysis of Terminal Extensions of Protein β-Strand Pairs. Adv Bioinformatics 2013; 2013:909436. [PMID: 23424587 PMCID: PMC3569888 DOI: 10.1155/2013/909436] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/15/2012] [Revised: 12/30/2012] [Accepted: 12/30/2012] [Indexed: 11/17/2022] Open
Abstract
The long-range interactions required for accurate prediction of the tertiary structures of β-sheet-containing proteins are still difficult to simulate. To remedy this problem and to facilitate β-sheet structure prediction, many computational efforts have been made. However, known efforts on β-sheets mainly focus on interresidue contacts or amino acid partners. In this study, to go one step further, we studied β-sheets on the strand level, in which a statistical analysis was made on the terminal extensions of paired β-strands. In most cases, the two paired β-strands have different lengths, and terminal extensions exist. The terminal extensions are the extended part of the paired strands besides the common paired part. However, we found that the best pairing required a terminal alignment, and β-strands tend to pair to make bigger common parts. As a result, 96.97% of β-strand pairs have a ratio of 25% of the paired common part to the whole length. Also 94.26% and 95.98% of β-strand pairs have a ratio of 40% of the paired common part to the length of the two β-strands, respectively. Interstrand register predictions that search interacting β-strands over several alternative offsets should comply with this rule in order to reduce the computational search space and improve the performance of algorithms.
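The ratios discussed in this abstract can be made concrete with a small calculation. The sketch below places two strands on a shared pairing axis and computes the paired common part relative to the whole pair and to each strand. The register model, the reading of "whole length" as the span covered by both strands together, and the function name `common_part_ratios` are assumptions for illustration, not definitions taken from the paper.

```python
def common_part_ratios(len_a, len_b, offset):
    """Return (common/whole, common/len_a, common/len_b) for a strand pair.

    Strand A occupies positions [0, len_a) and strand B occupies
    [offset, offset + len_b) on a shared pairing axis. This is an assumed,
    simplified register model; 'whole' is taken to be the span covered by
    the two strands together.
    """
    if len_a <= 0 or len_b <= 0:
        raise ValueError("strand lengths must be positive")
    start = max(0, offset)
    end = min(len_a, offset + len_b)
    common = max(0, end - start)  # residues in the paired common part
    whole = max(len_a, offset + len_b) - min(0, offset)
    return common / whole, common / len_a, common / len_b


# A 6-residue strand paired with a 4-residue strand shifted by one position:
# the common part is 4 residues and the terminal extensions total 2.
ratios = common_part_ratios(6, 4, 1)
```

A register search could evaluate such ratios for each candidate offset and discard offsets whose common part falls below a chosen threshold before any more expensive scoring.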
13
|
Burkoff NS, Várnai C, Wild DL. Predicting protein β-sheet contacts using a maximum entropy-based correlated mutation measure. Bioinformatics 2013; 29:580-7. [PMID: 23314126 DOI: 10.1093/bioinformatics/btt005] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Indexed: 11/14/2022]
Abstract
MOTIVATION: The problem of ab initio protein folding is one of the most difficult in modern computational biology. The prediction of residue contacts within a protein provides a more tractable immediate step. Recently introduced maximum entropy-based correlated mutation measures (CMMs), such as direct information, have been successful in predicting residue contacts. However, most correlated mutation studies focus on proteins that have large good-quality multiple sequence alignments (MSA) because the power of correlated mutation analysis falls as the size of the MSA decreases. However, even with small autogenerated MSAs, maximum entropy-based CMMs contain information. To make use of this information, in this article, we focus not on general residue contacts but contacts between residues in β-sheets. The strong constraints and prior knowledge associated with β-contacts are ideally suited for prediction using a method that incorporates an often noisy CMM. RESULTS: Using contrastive divergence, a statistical machine learning technique, we have calculated a maximum entropy-based CMM. We have integrated this measure with a new probabilistic model for β-contact prediction, which is used to predict both residue- and strand-level contacts. Using our model on a standard non-redundant dataset, we significantly outperform a 2D recurrent neural network architecture, achieving a 5% improvement in true positives at the 5% false-positive rate at the residue level. At the strand level, our approach is competitive with the state-of-the-art single methods achieving precision of 61.0% and recall of 55.4%, while not requiring residue solvent accessibility as an input. AVAILABILITY: http://www2.warwick.ac.uk/fac/sci/systemsbiology/research/software/
Affiliation(s)
- Nikolas S Burkoff
- Systems Biology Centre, Senate House, University of Warwick, Coventry, CV4 7AL, UK
14
|
Ho HK, Zhang L, Ramamohanarao K, Martin S. A survey of machine learning methods for secondary and supersecondary protein structure prediction. Methods Mol Biol 2013; 932:87-106. [PMID: 22987348 DOI: 10.1007/978-1-62703-065-6_6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 06/01/2023]
Abstract
In this chapter we provide a survey of protein secondary and supersecondary structure prediction using methods from machine learning. Our focus is on machine learning methods applicable to β-hairpin and β-sheet prediction, but we also discuss methods for more general supersecondary structure prediction. We provide background on the secondary and supersecondary structures that we discuss, the features used to describe them, and the basic theory behind the machine learning methods used. We survey the machine learning methods available for secondary and supersecondary structure prediction and compare them where possible.
Affiliation(s)
- Hui Kian Ho
- Department of Computer Science and Software Engineering, University of Melbourne, National ICT Australia, Parkville, VIC, Australia
15
|
Caudron B, Jestin J. Sequence criteria for the anti-parallel character of protein beta-strands. J Theor Biol 2012; 315:146-9. [DOI: 10.1016/j.jtbi.2012.09.011] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 07/12/2012] [Revised: 09/10/2012] [Accepted: 09/12/2012] [Indexed: 12/17/2022]
16
|
Ho HK, Gange G, Kuiper MJ, Ramamohanarao K. BetaSearch: a new method for querying β-residue motifs. BMC Res Notes 2012; 5:391. [PMID: 22839199 PMCID: PMC3532365 DOI: 10.1186/1756-0500-5-391] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 05/10/2012] [Accepted: 06/15/2012] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Searching for structural motifs across known protein structures can be useful for identifying unrelated proteins with similar function and characterising secondary structures such as β-sheets. This is infeasible using conventional sequence alignment because linear protein sequences do not contain spatial information. β-residue motifs are β-sheet substructures that can be represented as graphs and queried using existing graph indexing methods; however, these approaches are designed for general graphs that do not incorporate the inherent structural constraints of β-sheets and require computationally expensive filtering and verification procedures. 3D substructure search methods, on the other hand, allow β-residue motifs to be queried in a three-dimensional context but at significant computational costs. FINDINGS: We developed a new method for querying β-residue motifs, called BetaSearch, which leverages the natural planar constraints of β-sheets by indexing them as 2D matrices, thus avoiding much of the computational complexity involved with structural and graph querying. BetaSearch exhibits faster filtering, verification, and overall query time than existing graph indexing approaches whilst producing comparable index sizes. Compared to 3D substructure search methods, BetaSearch achieves 33 and 240 times speedups over index-based and pairwise alignment-based approaches, respectively. Furthermore, we have presented case-studies to demonstrate its capability of motif matching in sequentially dissimilar proteins and described a method for using BetaSearch to predict β-strand pairing. CONCLUSIONS: We have demonstrated that BetaSearch is a fast method for querying substructure motifs. The improvements in speed over existing approaches make it useful for efficiently performing high-volume exploratory querying of possible protein substructural motifs or conformations.
BetaSearch was used to identify a nearly identical β-residue motif between an entirely synthetic (Top7) and a naturally-occurring protein (Charcot-Leyden crystal protein), as well as identifying structural similarities between biotin-binding domains of avidin, streptavidin and the lipocalin gamma subunit of human C8.
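The idea of treating a β-sheet as a 2D matrix and querying it for a residue motif can be sketched with a brute-force scan. In the toy example below each row is a strand and each column a set of laterally paired positions; the equal-length rows, the `.` wildcard, and the function name `find_motif` are assumptions for illustration. BetaSearch itself builds an index rather than scanning, so this shows only the matrix representation, not its algorithm.

```python
def find_motif(sheet, motif, wildcard="."):
    """Return (row, col) positions where a 2D motif occurs in a sheet matrix.

    sheet: list of equal-length strings, one per strand, aligned so that
           residues in the same column are paired neighbours (toy model).
    motif: list of equal-length strings; `wildcard` matches any residue.
    """
    rows, cols = len(sheet), len(sheet[0])
    m_rows, m_cols = len(motif), len(motif[0])
    hits = []
    for r in range(rows - m_rows + 1):
        for c in range(cols - m_cols + 1):
            if all(
                motif[i][j] == wildcard or motif[i][j] == sheet[r + i][c + j]
                for i in range(m_rows)
                for j in range(m_cols)
            ):
                hits.append((r, c))
    return hits


# Hypothetical three-stranded sheet and two small queries.
toy_sheet = ["KVTVE", "TYRIS", "AVKVD"]
two_strand_hits = find_motif(toy_sheet, ["VT", "YR"])
wildcard_hits = find_motif(toy_sheet, ["V.V"])
```

Because the motif is compared cell by cell in the plane of the sheet, two proteins with unrelated linear sequences can still produce a hit when their paired residues form the same local pattern.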
Affiliation(s)
- Hui Kian Ho
- Department of Computing and Information Systems, The University of Melbourne, Victoria, Australia.
17
|
Subramani A, Floudas CA. β-sheet topology prediction with high precision and recall for β and mixed α/β proteins. PLoS One 2012; 7:e32461. [PMID: 22427840 PMCID: PMC3302896 DOI: 10.1371/journal.pone.0032461] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Received: 11/16/2011] [Accepted: 01/26/2012] [Indexed: 11/19/2022] Open
Abstract
The prediction of the correct β-sheet topology for pure β and mixed α/β proteins is a critical intermediate step toward the three dimensional protein structure prediction. The predicted beta sheet topology provides distance constraints between sequentially separated residues, which reduces the three dimensional search space for a protein structure prediction algorithm. Here, we present a novel mixed integer linear optimization based framework for the prediction of β-sheet topology in β and mixed α/β proteins. The objective is to maximize the total strand-to-strand contact potential of the protein. A large number of physical constraints are applied to provide biologically meaningful topology results. The formulation permits the creation of a rank-ordered list of preferred β-sheet arrangements. Finally, the generated topologies are re-ranked using a fully atomistic approach involving torsion angle dynamics and clustering. For a large, non-redundant data set of 2102 β and mixed α/β proteins with at least 3 strands taken from the PDB, the proposed approach provides the top 5 solutions with average precision and recall greater than 78%. Consistent results are obtained in the β-sheet topology prediction for blind targets provided during the CASP8 and CASP9 experiments, as well as for actual and predicted secondary structures. The β-sheet topology prediction algorithm, BeST, is available to the scientific community at http://selene.princeton.edu/BeST/.
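The objective stated in this abstract, maximizing the total strand-to-strand contact potential subject to physical constraints, can be illustrated on a very small example by exhaustive enumeration. The sketch below swaps the mixed integer linear optimization for brute force and uses a single assumed constraint, that each strand pairs with at most two partners; the potential values and the function name `best_topology` are hypothetical, and this is not the BeST formulation.

```python
from itertools import combinations


def best_topology(n_strands, potential, max_partners=2):
    """Brute-force the strand pairing set with the highest total potential.

    potential: dict mapping (i, j) with i < j to a contact potential
               (hypothetical values; higher means more favourable).
    max_partners: assumed physical constraint on partners per strand.
    Only feasible for a handful of strands; the number of subsets grows
    exponentially with the number of candidate pairs.
    """
    pairs = list(combinations(range(n_strands), 2))
    best_score, best_pairs = 0.0, ()
    for k in range(1, len(pairs) + 1):
        for subset in combinations(pairs, k):
            degree = [0] * n_strands
            for i, j in subset:
                degree[i] += 1
                degree[j] += 1
            if max(degree) > max_partners:
                continue  # violates the partner constraint
            score = sum(potential[p] for p in subset)
            if score > best_score:
                best_score, best_pairs = score, subset
    return best_score, best_pairs


# Hypothetical potentials for a four-strand protein.
toy_potential = {
    (0, 1): 2.0, (0, 2): -0.5, (0, 3): 0.2,
    (1, 2): 1.5, (1, 3): -1.0, (2, 3): 1.0,
}
score, topology = best_topology(4, toy_potential)
```

Keeping every feasible subset with its score, instead of only the best one, would yield a rank-ordered list of arrangements of the kind the abstract describes; an integer programming solver replaces the enumeration when the number of strands makes it impractical.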
Affiliation(s)
- Christodoulos A. Floudas
- Department of Chemical and Biological Engineering, Princeton University, Princeton, New Jersey, United States of America
18
Daniels NM, Hosur R, Berger B, Cowen LJ. SMURFLite: combining simplified Markov random fields with simulated evolution improves remote homology detection for beta-structural proteins into the twilight zone. Bioinformatics 2012; 28:1216-22. [PMID: 22408192 PMCID: PMC3338012 DOI: 10.1093/bioinformatics/bts110] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 11/14/2022] Open
Abstract
Motivation: One of the most successful methods to date for recognizing protein sequences that are evolutionarily related has been profile hidden Markov models (HMMs). However, these models do not capture pairwise statistical preferences of residues that are hydrogen bonded in beta sheets. These dependencies have been partially captured in the HMM setting by simulated evolution in the training phase and can be fully captured by Markov random fields (MRFs). However, the MRFs can be computationally prohibitive when beta strands are interleaved in complex topologies. We introduce SMURFLite, a method that combines both simplified MRFs and simulated evolution to substantially improve remote homology detection for beta structures. Unlike previous MRF-based methods, SMURFLite is computationally feasible on any beta-structural motif. Results: We test SMURFLite on all propeller and barrel folds in the mainly-beta class of the SCOP hierarchy in stringent cross-validation experiments. We show a mean 26% (median 16%) improvement in area under curve (AUC) for beta-structural motif recognition as compared with HMMER (a well-known HMM method) and a mean 33% (median 19%) improvement as compared with RAPTOR (a well-known threading method) and even a mean 18% (median 10%) improvement in AUC over HHpred (a profile–profile HMM method), despite HHpred's use of extensive additional training data. We demonstrate SMURFLite's ability to scale to whole genomes by running a SMURFLite library of 207 beta-structural SCOP superfamilies against the entire genome of Thermotoga maritima, and make over 100 new fold predictions. Availability and implementation: A webserver that runs SMURFLite is available at: http://smurf.cs.tufts.edu/smurflite/ Contact: lenore.cowen@tufts.edu; bab@mit.edu
Affiliation(s)
- Noah M Daniels
- Department of Computer Science, Tufts University, Medford, MA 02155, USA
19
Fitzpatrick AW, Knowles TPJ, Waudby CA, Vendruscolo M, Dobson CM. Inversion of the balance between hydrophobic and hydrogen bonding interactions in protein folding and aggregation. PLoS Comput Biol 2011; 7:e1002169. [PMID: 22022239 PMCID: PMC3192805 DOI: 10.1371/journal.pcbi.1002169] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.7] [Received: 12/05/2010] [Accepted: 07/06/2011] [Indexed: 12/25/2022] Open
Abstract
Identifying the forces that drive proteins to misfold and aggregate, rather than to fold into their functional states, is fundamental to our understanding of living systems and to our ability to combat protein deposition disorders such as Alzheimer's disease and the spongiform encephalopathies. We report here the finding that the balance between hydrophobic and hydrogen bonding interactions is different for proteins in the processes of folding to their native states and misfolding to the alternative amyloid structures. We find that the minima of the protein free energy landscape for folding and misfolding tend to be respectively dominated by hydrophobic and by hydrogen bonding interactions. These results characterise the nature of the interactions that determine the competition between folding and misfolding of proteins by revealing that the stability of native proteins is primarily determined by hydrophobic interactions between side-chains, while the stability of amyloid fibrils depends more on backbone intermolecular hydrogen bonding interactions. In order to carry out their biological functions, most proteins fold into well-defined conformations known as native states. Failure to fold, or to remain folded correctly, may result in misfolding and aggregation, which are processes associated with a wide range of highly debilitating, and so far incurable, human conditions that include Alzheimer's and Parkinson's diseases and type II diabetes. In our work we investigate the nature of the fundamental interactions that are responsible for the folding and misfolding behaviour of proteins, finding that interactions between protein side-chains play a major role in stabilising native states, whilst backbone hydrogen bonding interactions are key in determining the stability of amyloid fibrils.
Affiliation(s)
- Christopher M. Dobson
- Department of Chemistry, University of Cambridge, Cambridge, United Kingdom
20
Studies on the rules of β-strand alignment in a protein β-sheet structure. J Theor Biol 2011; 285:69-76. [PMID: 21745480 DOI: 10.1016/j.jtbi.2011.06.030] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 03/28/2011] [Revised: 05/31/2011] [Accepted: 06/24/2011] [Indexed: 11/21/2022]
Abstract
To further disclose the underlying mechanisms of protein β-sheet formation, the rules by which β-strands align to form β-sheet structures were studied using statistical and machine learning approaches. First, statistical analysis was performed on the number of β-strands lying between the members of each β-strand pair in protein sequences. The results showed a propensity for near-neighbor pairing (the so-called "first come first pair") among the β-strand pairs. Second, based on the same dataset, the pairwise cross-combinations of real β-strand pairs and four pseudo-β-strand-containing pairs were classified by support vector machine (SVM). A novel feature extraction approach was designed for classification using the average amino acid pairing encoding matrix (APEM). The classification results indicated that a β-strand segment had the ability to distinguish β-strands from segments of α-helix and coil. However, the results also showed that a β-strand was not strongly conserved in choosing its real partner from all the alternative β-strand partners, which agreed with the ordination results of the statistical analysis. Thus, the "first come first pair" propensity and the non-conservative ability to choose the real partner are possibly important factors affecting how β-strands align to form β-sheet structures.
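One reading of the near-neighbor statistic described here is to count, for every paired strand, how many other strands lie between the two partners along the sequence. The sketch below tallies that count; the function name and the toy pairings are hypothetical, and the abstract does not specify how the original analysis was implemented.

```python
from collections import Counter

def intervening_strand_counts(pairs):
    """Tally how many strands lie between the two members of each pair.

    pairs: iterable of (i, j) strand indices numbered in sequence order.
    Returns a Counter mapping 'number of intervening strands' to frequency.
    """
    return Counter(abs(j - i) - 1 for i, j in pairs)

# Hypothetical pairings for one protein with six strands.
toy_pairs = [(0, 1), (1, 2), (2, 5), (3, 4), (4, 5)]
histogram = intervening_strand_counts(toy_pairs)
```

A histogram dominated by the zero bin would correspond to the "first come first pair" tendency reported in the abstract.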
21
Tsutsumi M, Otaki JM. Parallel and antiparallel β-strands differ in amino acid composition and availability of short constituent sequences. J Chem Inf Model 2011; 51:1457-64. [PMID: 21520893 DOI: 10.1021/ci200027d] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
Abstract
One of the important secondary structures in proteins is the β-strand. However, due to its complexity, it is less characterized than helical structures. Using 1641 representative three-dimensional protein structures from the Protein Data Bank, we characterized β-strand structures based on strand length and amino acid composition, focusing on differences between parallel and antiparallel β-strands. Antiparallel strands were more frequent and slightly longer than parallel strands. Overall, the majority of β-sheets were antiparallel sheets; however, mixed sheets were reasonably abundant, and parallel sheets were relatively rare. Notably, the nonpolar, aliphatic hydrocarbon amino acids, valine, isoleucine, and leucine were observed at a high frequency in both strands but were more abundant in parallel than in antiparallel strands. The relative amino acid occurrence in β-sheets, especially in parallel strands, was highly correlated with amino acid hydrophobicity. This correlation was not observed in α-helices and 3(10)-helices. In addition, we examined the frequency of 400 amino acid doublets and 8000 amino acid triplets in β-strands based on availability, a measurement of the relative counts of the doublets and triplets. We identified some triplets that were specifically found in either parallel or antiparallel strands. We further identified "zero-count triplets" which did not occur in either parallel or antiparallel strands, despite the fact that they were probabilistically supposed to occur several times. Taken together, the present study revealed essential features of β-strand structures and the differences between parallel and antiparallel β-strands, which can potentially be applied to secondary structure prediction and the functional design of protein sequences in the future.
Affiliation(s)
- Motosuke Tsutsumi
- The BCPH Unit of Molecular Physiology, Department of Chemistry, Biology and Marine Science, University of the Ryukyus, Nishihara, Okinawa, Japan
22
Aydin Z, Altunbasak Y, Erdogan H. Bayesian models and algorithms for protein β-sheet prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2011; 8:395-409. [PMID: 21233522 DOI: 10.1109/tcbb.2008.140] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Indexed: 05/30/2023]
Abstract
Prediction of the 3D structure greatly benefits from the information related to secondary structure, solvent accessibility, and nonlocal contacts that stabilize a protein's structure. We address the problem of β-sheet prediction defined as the prediction of β-strand pairings, interaction types (parallel or antiparallel), and β-residue interactions (or contact maps). We introduce a Bayesian approach for proteins with six or fewer β-strands in which we model the conformational features in a probabilistic framework by combining the amino acid pairing potentials with a priori knowledge of β-strand organizations. To select the optimum β-sheet architecture, we significantly reduce the search space by heuristics that enforce the amino acid pairs with strong interaction potentials. In addition, we find the optimum pairwise alignment between β-strands using dynamic programming in which we allow any number of gaps in an alignment to model β-bulges more effectively. For proteins with more than six β-strands, we first compute β-strand pairings using the BetaPro method. Then, we compute gapped alignments of the paired β-strands and choose the interaction types and β-residue pairings with maximum alignment scores. We performed a 10-fold cross-validation experiment on the BetaSheet916 set and obtained significant improvements in the prediction accuracy.
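The gapped strand alignment mentioned in this abstract can be pictured with a generic global dynamic programming alignment, sketched below. This is an illustration under stated assumptions rather than the authors' method: the pairing potential `toy_pair_score`, the constant gap penalty, and all function names are hypothetical, and a gap simply marks a residue left unpaired, as in a β-bulge.

```python
def align_strands(a, b, pair_score, gap=-1.0):
    """Global alignment of two strands maximizing summed pairing scores.

    pair_score(x, y) is a hypothetical residue pairing potential; gap is an
    assumed constant penalty per unpaired residue. For an antiparallel pair,
    pass the second strand reversed.
    """
    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1]),
                score[i - 1][j] + gap,   # residue of a left unpaired
                score[i][j - 1] + gap,   # residue of b left unpaired
            )
    # Trace back from the corner to recover the paired positions.
    i, j, pairs = n, m, []
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j - 1] + pair_score(a[i - 1], b[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    return score[n][m], pairs[::-1]

HYDROPHOBIC = set("VILFYW")

def toy_pair_score(x, y):
    # Hypothetical potential: reward hydrophobic-hydrophobic pairs.
    if x in HYDROPHOBIC and y in HYDROPHOBIC:
        return 2.0
    return 0.5 if x == y else -0.5

total, pairs = align_strands("VKVTV", "VTVV", toy_pair_score)
```

In this toy case the threonine at index 3 of the first strand is left unpaired so that the three valine pairs stay in register. Running the function on both orientations and keeping the higher score would correspond to choosing the interaction type by maximum alignment score.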
Affiliation(s)
- Zafer Aydin
- Department of Genome Sciences, University of Washington, Genome Sciences, Box 357456, 1705 NE Pacific St., Seattle, WA 98195-5065, USA.
23
Kumar A, Cowen L. Recognition of beta-structural motifs using hidden Markov models trained with simulated evolution. Bioinformatics 2010; 26:i287-93. [PMID: 20529918 PMCID: PMC2881384 DOI: 10.1093/bioinformatics/btq199] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 11/25/2022] Open
Abstract
Motivation: One of the most successful methods to date for recognizing protein sequences that are evolutionarily related has been profile hidden Markov models. However, these models do not capture pairwise statistical preferences of residues that are hydrogen bonded in β-sheets. We thus explore methods for incorporating pairwise dependencies into these models. Results: We consider the remote homology detection problem for β-structural motifs. In particular, we ask if a statistical model trained on members of only one family in a SCOP β-structural superfamily can recognize members of other families in that superfamily. We show that HMMs trained with our pairwise model of simulated evolution achieve nearly a median 5% improvement in AUC for β-structural motif recognition as compared to ordinary HMMs. Availability: All datasets and HMMs are available at: http://bcb.cs.tufts.edu/pairwise/ Contact: anoop.kumar@tufts.edu; lenore.cowen@tufts.edu
Affiliation(s)
- Anoop Kumar
- Department of Computer Science, Tufts University, Medford, MA, USA.
24
Wu L, McElheny D, Takekiyo T, Keiderling TA. Geometry and Efficacy of Cross-Strand Trp/Trp, Trp/Tyr, and Tyr/Tyr Aromatic Interaction in a β-Hairpin Peptide. Biochemistry 2010; 49:4705-14. [DOI: 10.1021/bi100491s] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.8] [Indexed: 11/28/2022]
Affiliation(s)
- Ling Wu
- Department of Chemistry, University of Illinois at Chicago, 845 W. Taylor Street, Chicago, Illinois 60607-7061
- Dan McElheny
- Department of Chemistry, University of Illinois at Chicago, 845 W. Taylor Street, Chicago, Illinois 60607-7061
- Takahiro Takekiyo
- Department of Chemistry, University of Illinois at Chicago, 845 W. Taylor Street, Chicago, Illinois 60607-7061
- Timothy A. Keiderling
- Department of Chemistry, University of Illinois at Chicago, 845 W. Taylor Street, Chicago, Illinois 60607-7061
25
Zhang N, Duan G, Gao S, Ruan J, Zhang T. Prediction of the parallel/antiparallel orientation of beta-strands using amino acid pairing preferences and support vector machines. J Theor Biol 2010; 263:360-8. [PMID: 20035768 DOI: 10.1016/j.jtbi.2009.12.019] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 09/22/2009] [Revised: 11/05/2009] [Accepted: 12/17/2009] [Indexed: 10/20/2022]
26
Menke M, Berger B, Cowen L. Markov random fields reveal an N-terminal double beta-propeller motif as part of a bacterial hybrid two-component sensor system. Proc Natl Acad Sci U S A 2010; 107:4069-74. [PMID: 20147619 PMCID: PMC2819974 DOI: 10.1073/pnas.0909950107] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Indexed: 11/18/2022] Open
Abstract
The recent explosion in newly sequenced bacterial genomes is outpacing the capacity of researchers to try to assign functional annotation to all the new proteins. Hence, computational methods that can help predict structural motifs provide increasingly important clues in helping to determine how these proteins might function. We introduce a Markov Random Field approach tailored for recognizing proteins that fold into mainly beta-structural motifs, and apply it to build recognizers for the beta-propeller shapes. As an application, we identify a potential class of hybrid two-component sensor proteins, that we predict contain a double-propeller domain.
Affiliation(s)
- Matt Menke
- Tufts University, Medford, MA 02155; and
- Massachusetts Institute of Technology, Cambridge, MA 02139
- Bonnie Berger
- Massachusetts Institute of Technology, Cambridge, MA 02139
27
Liu Y, Carbonell J, Gopalakrishnan V, Weigele P. Conditional graphical models for protein structural motif recognition. J Comput Biol 2009; 16:639-57. [PMID: 19432536 DOI: 10.1089/cmb.2008.0176] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022] Open
Abstract
Determining protein structures is crucial to understanding the mechanisms of infection and designing drugs. However, the elucidation of protein folds by crystallographic experiments can be a bottleneck in the development process. In this article, we present a probabilistic graphical model framework, conditional graphical models, for predicting protein structural motifs. It represents the structural characteristics of a structural motif using a graph, where the nodes denote the secondary structure elements, and the edges indicate the side-chain interactions between the components either within one protein chain or between chains. Then the model defines the optimal segmentation of a protein sequence against the graph by maximizing its "conditional" probability so that it can take advantage of the discriminative training approach. Efficient approximate inference algorithms using the reversible jump Markov chain Monte Carlo (MCMC) algorithm are developed to handle the resulting complex graphical models. We test our algorithm on four important structural motifs, and our method outperforms other state-of-the-art algorithms for motif recognition. We also hypothesize potential membership proteins of target folds from Swiss-Prot, which further supports the evolutionary hypothesis about viral folds.
Affiliation(s)
- Yan Liu
- IBM T.J. Watson Research Center, Yorktown Heights, New York 10598, USA.
28
Zhang N, Ruan J, Duan G, Gao S, Zhang T. The interstrand amino acid pairs play a significant role in determining the parallel or antiparallel orientation of beta-strands. Biochem Biophys Res Commun 2009; 386:537-43. [PMID: 19540200 DOI: 10.1016/j.bbrc.2009.06.072] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 06/10/2009] [Accepted: 06/16/2009] [Indexed: 12/12/2022]
Abstract
It is widely considered that it is not appropriate to treat beta-pairs in isolation, since other secondary structural models (such as helices, coils), protein topology and protein tertiary structures would limit beta-strand pairing. However, to understand the underlying mechanisms of beta-sheet formation, studies ought to be performed separately on more concrete aspects. In this study, we focus on the parallel or antiparallel orientation of beta-strands. First, statistical analysis was performed on the relative frequencies of the interstrand amino acid pairs within parallel and antiparallel beta-strands. Subsequently, features were extracted by singular value decomposition from the statistical results. By using the support vector machine to distinguish the features extracted from the two types of beta-strands, high accuracy was achieved (up to 99.4%). This suggests that the interstrand amino acid pairs play a significant role in determining the parallel or antiparallel orientation of beta-strands. These results may provide useful information for developing other useful algorithms to examine the beta-strand folding pathways, and could eventually lead to protein structure predictions.
Affiliation(s)
- Ning Zhang
- Key Laboratory of Bioactive Materials, Ministry of Education and College of Life Science, Nankai University, Tianjin 300071, PR China
29
Eidenschink L, Kier BL, Huggins KNL, Andersen NH. Very short peptides with stable folds: building on the interrelationship of Trp/Trp, Trp/cation, and Trp/backbone-amide interaction geometries. Proteins 2009; 75:308-22. [PMID: 18831035 PMCID: PMC2656586 DOI: 10.1002/prot.22240] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.1] [Indexed: 11/06/2022]
Abstract
By combining a favorable turn sequence with a turn flanking Trp/Trp interaction and a C-terminal H-bonding interaction between a backbone amide and an i-2 Trp ring, a particularly stable (DeltaG(U) > 7 kJ/mol) truncated hairpin, Ac-WI-(D-Pro-D-Asn)-KWTG-NH(2), results. In this construct and others with a W-(4-residue turn)-W motif in severely truncated hairpins, the C-terminal Trp is the edge residue in a well-defined face-to-edge (FtE) aryl/aryl interaction. Longer hairpins and those with six-residue turns retain the reversed "edge-to-face" (EtF) Trp/Trp geometry first observed for the trpzip peptides. Mutational studies suggest that the W-(4-residue turn)-W interaction provides at least 3 kJ/mol of stabilization in excess of that due to the greater beta-propensity of Trp. The pi-cation, and Trp/Gly-H(N) interactions have been defined. The latter can give rise to >3 ppm upfield shifts for the Gly-H(N) in -WX(n)G- units both in turns (n = 2) and at the C-termini (n = 1) of hairpins. Terminal YTG units result in somewhat smaller shifts (extrapolated to 2 ppm for 100% folding). In peptides with both the EtF and FtE W/W interaction geometries, Trp to Tyr mutations indicate that Trp is the preferred "face" residue in aryl/aryl pairings, presumably because of its greater pi basicity.
Affiliation(s)
- Lisa Eidenschink
- Department of Chemistry, University of Washington, Seattle, Washington 98195, USA
30
Jeong J, Berman P, Przytycka TM. Improving strand pairing prediction through exploring folding cooperativity. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2008; 5:484-491. [PMID: 18989036 PMCID: PMC2597093 DOI: 10.1109/tcbb.2008.88] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 05/27/2023]
Abstract
The topology of beta-sheets is defined by the pattern of hydrogen-bonded strand pairing. Therefore, predicting hydrogen bonded strand partners is a fundamental step towards predicting beta-sheet topology. At the same time, finding the correct partners is very difficult due to long range interactions involved in strand pairing. Additionally, patterns of amino acids involved in beta-sheet formations are very general and therefore difficult to use for computational recognition of specific contacts between strands. In this work, we report a new strand pairing algorithm. To address the above-mentioned difficulties, our algorithm attempts to mimic elements of the folding process. Namely, in addition to ensuring that the predicted hydrogen bonded strand pairs satisfy basic global consistency constraints, it takes into account hypothetical folding pathways. Consistent with this view, introducing hydrogen bonds between a pair of strands changes the probabilities of forming hydrogen bonds between other pairs of strands. We demonstrate that this approach provides an improvement over previously proposed algorithms. We also compare the performance of this method to that of a global optimization algorithm that poses the problem as an integer linear programming optimization problem and solves it using the ILOG CPLEX package.
Affiliation(s)
- Jieun Jeong
- Department of Computer Science and Engineering, Pennsylvania State University, University Park, PA 16802, USA
31
Makabe K, Yan S, Tereshko V, Gawlak G, Koide S. Beta-strand flipping and slipping triggered by turn replacement reveal the opportunistic nature of beta-strand pairing. J Am Chem Soc 2007; 129:14661-9. [PMID: 17985889 DOI: 10.1021/ja074252c] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
Abstract
We investigated how the register between adjacent beta-strands is specified using a series of mutants of the single-layer beta-sheet (SLB) in Borrelia OspA. The single-layer architecture of this system eliminates structural restraints imposed by a hydrophobic core, enabling us to address this question. A critical turn (turn 9/10) in the SLB was replaced with a segment with an intentional structural mismatch. Its crystal structure revealed a one-residue insertion into the central beta-strand (strand 9) of the SLB. This insertion triggered a surprisingly large-scale structural rearrangement: (i) the central strand (strand 9) was shifted by one residue, causing the strand to flip with respect to the adjacent beta-strands and thus completely disrupting the native side-chain contacts; (ii) the three-residue turn located on the opposite end of the beta-strand (turn 8/9) was pushed into its preceding beta-strand (strand 8); (iii) the register between strands 8 and 9 was shifted by three residues. Replacing the original sequence for turn 8/9 with a stronger turn motif restored the original strand register but still with a flipped beta-strand 9. The stability differences of these distinct structures were surprisingly small, consistent with an energy landscape where multiple low-energy states with different beta-sheet configurations exist. The observed conformations can be rationalized in terms of maximizing the number of backbone H-bonds. These results suggest that adjacent beta-strands "stick" through the use of factors that are not highly sequence specific and that beta-strands could slide back and forth relatively easily in the absence of external elements such as turns and tertiary packing.
Affiliation(s)
- Koki Makabe
- Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois 60637, USA
32
Levin S, Nowick JS. An artificial beta-sheet that dimerizes through parallel beta-sheet interactions. J Am Chem Soc 2007; 129:13043-8. [PMID: 17918935 DOI: 10.1021/ja073391r] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
Abstract
This Article introduces a simple chemical model of a beta-sheet (artificial beta-sheet) that dimerizes by parallel beta-sheet formation in chloroform solution. The artificial beta-sheet consists of two N-terminally linked peptide strands that are linked with succinic or fumaric acid and blocked along one edge with a hydrogen-bonding template composed of 5-aminoanisic acid hydrazide. The template is connected to one of the peptide strands by a turn unit composed of (S)-2-aminoadipic acid (Aaa). 1H NMR spectroscopic studies show that these artificial beta-sheets fold in CDCl3 solution to form well-defined beta-sheet structures that dimerize through parallel beta-sheet interactions. Most notably, all of these compounds show a rich network of NOEs associated with folding and dimerization. The compounds also exhibit chemical shifts and coupling constants consistent with the formation of folded dimeric beta-sheet structures. The aminoadipic acid unit shows patterns of NOEs and coupling constants consistent with a well-defined turn conformation. The present system represents a significant step toward modeling the type of parallel beta-sheet interactions that occur in protein aggregation.
Affiliation(s)
- Sergiy Levin
- Department of Chemistry, University of California, Irvine, Irvine, California 92697-2025, USA
33
Wu Y, Lu M, Chen M, Li J, Ma J. OPUS-Ca: a knowledge-based potential function requiring only Calpha positions. Protein Sci 2007; 16:1449-63. [PMID: 17586777 PMCID: PMC2206690 DOI: 10.1110/ps.072796107] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Indexed: 10/23/2022]
Abstract
In this paper, we report a knowledge-based potential function, named the OPUS-Ca potential, that requires only Calpha positions as input. The contributions from other atomic positions were established from pseudo-positions artificially built from a Calpha trace for auxiliary purposes. The potential function is formed based on seven major representative molecular interactions in proteins: distance-dependent pairwise energy with orientational preference, hydrogen bonding energy, short-range energy, packing energy, tri-peptide packing energy, three-body energy, and solvation energy. From the testing of decoy recognition on a number of commonly used decoy sets, it is shown that the new potential function outperforms all known Calpha-based potentials and most other coarse-grained ones that require more information than Calpha positions. We hope that this potential function adds a new tool for protein structural modeling.
Affiliation(s)
- Yinghao Wu
- Department of Bioengineering, Rice University, Houston, TX 77005, USA
34
Abstract
The formation of beta-sheet domains in proteins involves five energetically important factors: the formation of networks of hydrogen bonds and hydrophobic faces, and the residue propensities, or preferences, to be found at the edges of the beta-sheet, to adopt the extended conformation, and to make contact with other residues. These relative energy contributions define a potential energy function. Here, we show how optimizing this potential energy function reveals the formation of hydrophobic faces as the most important factor. The potential energy function was optimized to minimize the Z-scores of the native topologies among the exhaustive sets of over 400 different beta-sheets. These results corroborate experimental data showing that the environment of a protein is an important modulator of beta-sheet folding. The contact propensities were found to be the least important, which could explain the poor predictive power of beta-strand alignment methods based on pair-wise contact matrices.
Affiliation(s)
- Marc Parisien
- Department of Computer Science and Operations Research, Institute for Research in Immunology and Cancer, Université de Montréal, Montréal, Québec, Canada
35
Martin S, Brown WM, Faulon JL. Using product kernels to predict protein interactions. Advances in Biochemical Engineering/Biotechnology 2007; 110:215-45. [PMID: 17922100 DOI: 10.1007/10_2007_084] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/21/2022]
Abstract
There is a wide variety of experimental methods for the identification of protein interactions. This variety has in turn spurred the development of numerous different computational approaches for modeling and predicting protein interactions. These methods range from detailed structure-based methods capable of operating on only a single pair of proteins at a time to approximate statistical methods capable of making predictions on multiple proteomes simultaneously. In this chapter, we provide a brief discussion of the relative merits of different experimental and computational methods available for identifying protein interactions. Then we focus on the application of our particular (computational) method using Support Vector Machine product kernels. We describe our method in detail and discuss the application of the method for predicting protein-protein interactions, beta-strand interactions, and protein-chemical interactions.
Affiliation(s)
- Shawn Martin
- Computational Biology, Sandia National Laboratories, PO Box 5800, 87185-1316, Albuquerque, NM 87185-1316, USA.
36
Trovato A, Chiti F, Maritan A, Seno F. Insight into the structure of amyloid fibrils from the analysis of globular proteins. PLoS Comput Biol 2006; 2:e170. [PMID: 17173479 PMCID: PMC1698942 DOI: 10.1371/journal.pcbi.0020170] [Citation(s) in RCA: 172] [Impact Index Per Article: 9.6] [Received: 06/27/2006] [Accepted: 10/30/2006] [Indexed: 11/19/2022] Open
Abstract
The conversion from soluble states into cross-β fibrillar aggregates is a property shared by many different proteins and peptides and was hence conjectured to be a generic feature of polypeptide chains. Increasing evidence is now accumulating that such fibrillar assemblies are generally characterized by a parallel in-register alignment of β-strands contributed by distinct protein molecules. Here we assume a universal mechanism is responsible for β-structure formation and deduce sequence-specific interaction energies between pairs of protein fragments from a statistical analysis of the native folds of globular proteins. The derived fragment–fragment interaction was implemented within a novel algorithm, prediction of amyloid structure aggregation (PASTA), to investigate the role of sequence heterogeneity in driving specific aggregation into ordered self-propagating cross-β structures. The algorithm predicts that the parallel in-register arrangement of sequence portions that participate in the fibril cross-β core is favoured in most cases. However, the antiparallel arrangement is correctly discriminated when present in fibrils formed by short peptides. The predictions of the most aggregation-prone portions of initially unfolded polypeptide chains are also in excellent agreement with available experimental observations. These results corroborate the recent hypothesis that the amyloid structure is stabilised by the same physicochemical determinants as those operating in folded proteins. They also suggest that side chain–side chain interaction across neighbouring β-strands is a key determinant of amyloid fibril formation and of their self-propagating ability.
In many fatal neurodegenerative diseases, including Alzheimer, Parkinson, and spongiform encephalopathies, proteins aggregate into specific fibrous structures to form insoluble plaques known as amyloid. The amyloid structure may also play a nonaberrant role in different organisms. Many globular proteins, folding to their biologically functional native structures in vivo, can be induced to aggregate into amyloid-like fibrils under suitable conditions in vitro. One hallmark of amyloid structure is a specific supramolecular architecture called cross-beta structure, held together by hydrogen bonds extending repeatedly along the fibril axis, but intermolecular interactions are yet unknown at the amino-acid level except for very few cases. In this study, the authors present an algorithm, called prediction of amyloid structure aggregation (PASTA), to computationally predict which portions of a given protein or peptide sequence forming amyloid fibrils are stabilizing the corresponding cross-beta structure and the specific intermolecular pattern of hydrogen-bonded amino acids. PASTA is based on the assumption that the same amino acid–specific interactions stabilizing hydrogen bond patterns in native structures of globular proteins are also employed by nature in amyloid structure. The successful comparison of the authors' prediction with available experimental data supports the existence of a unique framework to describe protein folding and aggregation.
Affiliation(s)
- Antonio Trovato
- Consorzio Nazionale Interuniversitario per le Scienze Fisiche della Materia, Unità di Padova, Padua, Italy.

37
Bu Z, Shi Y, Callaway DJE, Tycko R. Molecular alignment within beta-sheets in Abeta(14-23) fibrils: solid-state NMR experiments and theoretical predictions. Biophys J 2006; 92:594-602. [PMID: 17056725 PMCID: PMC1751388 DOI: 10.1529/biophysj.106.091017] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.2] [Indexed: 11/18/2022] Open
Abstract
We report investigations of the molecular structure of amyloid fibrils formed by residues 14-23 of the beta-amyloid peptide associated with Alzheimer's disease (Abeta(14-23)), using solid-state nuclear magnetic resonance (NMR) techniques in conjunction with electron microscopy and atomic force microscopy. The NMR measurements, which include two-dimensional proton-mediated (13)C-(13)C exchange and two-dimensional relayed proton-mediated (13)C-(13)C exchange spectra, show that Abeta(14-23) fibrils contain antiparallel beta-sheets with a registry of backbone hydrogen bonds that aligns residue 17+k of each peptide molecule with residue 22-k of neighboring molecules in the same beta-sheet. We compare these results, as well as previously reported experimental results for fibrils formed by other beta-amyloid fragments, with theoretical predictions of molecular alignment based on databases of residue-specific alignments in antiparallel beta-sheets in known protein structures. While the theoretical predictions are not in exact agreement with the experimental results, they facilitate the design of experiments by suggesting a small number of plausible alignments that are readily distinguished by solid-state NMR.
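The registry described in this abstract (residue 17+k paired with residue 22-k) can be enumerated directly. The following is a minimal sketch, assuming the standard Abeta numbering in which residues 14-23 read HQKLVFFAED; the names `ABETA_14_23`, `registry_pairs`, and `label` are illustrative and do not come from the paper.

```python
# Enumerate the antiparallel pairing implied by "residue 17+k pairs with residue 22-k".
# All names here are illustrative; the sequence is an assumption stated in the lead-in.

ABETA_14_23 = "HQKLVFFAED"   # assumed sequence of residues 14..23
FIRST, LAST = 14, 23


def registry_pairs(anchor_a=17, anchor_b=22, first=FIRST, last=LAST):
    """Return (i, j) pairs with i = anchor_a + k and j = anchor_b - k, both inside the peptide."""
    pairs = []
    for i in range(first, last + 1):
        j = anchor_a + anchor_b - i      # i + j is constant in an antiparallel registry
        if first <= j <= last:           # drop positions whose partner falls off the peptide
            pairs.append((i, j))
    return pairs


def label(pos):
    """One-letter residue code followed by its sequence position."""
    return f"{ABETA_14_23[pos - FIRST]}{pos}"


if __name__ == "__main__":
    for i, j in registry_pairs():
        print(label(i), "<->", label(j))
```

Under these assumptions, residues 14 and 15 have no in-peptide partner, because their partners would be positions 25 and 24.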
Affiliation(s)
- Zimei Bu
- Fox Chase Cancer Center, Philadelphia, Pennsylvania, USA

38
McDonnell AV, Menke M, Palmer N, King J, Cowen L, Berger B. Fold recognition and accurate sequence-structure alignment of sequences directing beta-sheet proteins. Proteins 2006; 63:976-85. [PMID: 16547930 DOI: 10.1002/prot.20942] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Indexed: 11/11/2022]
Abstract
The ability to predict structure from sequence is particularly important for toxins, virulence factors, allergens, cytokines, and other proteins of public health importance. Many such functions are represented in the parallel beta-helix and beta-trefoil families. A method using pairwise beta-strand interaction probabilities coupled with evolutionary information represented by sequence profiles is developed to tackle these problems for the beta-helix and beta-trefoil folds. The algorithm BetaWrapPro employs a "wrapping" component that may capture folding processes with an initiation stage followed by processive interaction of the sequence with the already-formed motifs. BetaWrapPro outperforms all previous motif recognition programs for these folds, recognizing the beta-helix with 100% sensitivity and 99.7% specificity and the beta-trefoil with 100% sensitivity and 92.5% specificity, in crossvalidation on a database of all nonredundant known positive and negative examples of these fold classes in the PDB. It additionally aligns 88% of residues for the beta-helices and 86% for the beta-trefoils accurately (within four residues of the exact position) to the structural template, which is then used with the side-chain packing program SCWRL to produce 3D structure predictions. One striking result has been the prediction of an unexpected parallel beta-helix structure for a pollen allergen, and its recent confirmation through solution of its structure. A Web server running BetaWrapPro is available and outputs putative PDB-style coordinates for sequences predicted to form the target folds.
Affiliation(s)
- Andrew V McDonnell
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA

39
Liu Y, Carbonell J, Weigele P, Gopalakrishnan V. Protein Fold Recognition Using Segmentation Conditional Random Fields (SCRFs). J Comput Biol 2006; 13:394-406. [PMID: 16597248 DOI: 10.1089/cmb.2006.13.394] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/12/2022] Open
Abstract
Protein fold recognition is an important step towards understanding protein three-dimensional structures and their functions. A conditional graphical model, i.e., segmentation conditional random fields (SCRFs), is proposed as an effective solution to this problem. In contrast to traditional graphical models, such as the hidden Markov model (HMM), SCRFs follow a discriminative approach. Therefore, it is flexible to include any features in the model, such as overlapping or long-range interaction features over the whole sequence. The model also employs a convex optimization function, which results in globally optimal solutions to the model parameters. On the other hand, the segmentation setting in SCRFs makes their graphical structures intuitively similar to the protein 3-D structures and more importantly provides a framework to model the long-range interactions between secondary structures directly. Our model is applied to predict the parallel beta-helix fold, an important fold in bacterial pathogenesis and carbohydrate binding/cleavage. The cross-family validation shows that SCRFs not only can score all known beta-helices higher than non-beta-helices in the Protein Data Bank (PDB), but also accurately locates rungs in known beta-helix proteins. Our method outperforms BetaWrap, a state-of-the-art algorithm for predicting beta-helix folds, and HMMER, a general motif detection algorithm based on HMM, and has the additional advantage of general application to other protein folds. Applying our prediction model to the Uniprot Database, we identify previously unknown potential beta-helices.
Affiliation(s)
- Yan Liu
- School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA.

40
Fooks HM, Martin ACR, Woolfson DN, Sessions RB, Hutchinson EG. Amino Acid Pairing Preferences in Parallel β-Sheets in Proteins. J Mol Biol 2006; 356:32-44. [PMID: 16337654 DOI: 10.1016/j.jmb.2005.11.008] [Citation(s) in RCA: 70] [Impact Index Per Article: 3.9] [Received: 05/16/2005] [Revised: 10/14/2005] [Accepted: 11/02/2005] [Indexed: 11/15/2022]
Abstract
Statistical approaches have been applied to examine amino acid pairing preferences within parallel beta-sheets. The main chain hydrogen bonding pattern in parallel beta-sheets means that, for each residue pair, only one of the residues is involved in main chain hydrogen bonding with the strand containing the partner residue. We call this the hydrogen bonded (HB) residue and the partner residue the non-hydrogen bonded (nHB) residue, and differentiate between the favorability of a pair and that of its reverse pair, e.g. Asn(HB)-Thr(nHB) versus Thr(HB)-Asn(nHB). Significantly (p ≤ 0.000001) favoured pairings were rationalised using stereochemical arguments. For instance, Asn(HB)-Thr(nHB) and Arg(HB)-Thr(nHB) were favoured pairs, where the residues adopted favoured chi1 rotamer positions that allowed side-chain interactions to occur. In contrast, Thr(HB)-Asn(nHB) and Thr(HB)-Arg(nHB) were not significantly favoured, and could only form side-chain interactions if the residues involved adopted less favourable chi1 conformations. The favourability of hydrophobic pairs e.g. Ile(HB)-Ile(nHB), Val(HB)-Val(nHB) and Leu(HB)-Ile(nHB) was explained by the residues adopting their most preferred chi1 and chi2 conformations, which enabled them to form nested arrangements. Cysteine-cysteine pairs are significantly favoured, although these do not form intrasheet disulphide bridges. Interactions between positively and negatively charged residues were asymmetrically preferred: those with the negatively charged residue at the HB position were more favoured. This trend was accounted for by the presence of general electrostatic interactions, which, based on analysis of distances between charged atoms, were likely to be stronger when the negatively charged residue is the HB partner. The Arg(HB)-Asp(nHB) interaction was an exception to this trend and its favorability was rationalised by the formation of specific side-chain interactions.
This research provides rules that could be applied to protein structure prediction, comparative modelling and protein engineering and design. The methods used to analyse the pairing preferences are automated and detailed results are available (http://www.rubic.rdg.ac.uk/betapairprefsparallel/).
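The distinction between a pair and its reverse can be illustrated with a simple observed/expected ratio over ordered (HB, nHB) pairs. This is a minimal sketch only: it assumes an independence null model, does not reproduce the significance testing described in the abstract, and uses made-up toy counts. The names `pair_propensities` and `toy_pairs` are hypothetical.

```python
from collections import Counter


def pair_propensities(pairs):
    """Observed/expected ratio for ordered (HB residue, nHB residue) pairs.

    Expected counts assume the HB and nHB residues are drawn independently
    (an assumption of this sketch, not necessarily the paper's reference state).
    """
    pairs = list(pairs)
    n = len(pairs)
    observed = Counter(pairs)
    hb_counts = Counter(hb for hb, _ in pairs)
    nhb_counts = Counter(nhb for _, nhb in pairs)
    ratios = {}
    for (hb, nhb), obs in observed.items():
        expected = hb_counts[hb] * nhb_counts[nhb] / n
        ratios[(hb, nhb)] = obs / expected
    return ratios


# Made-up toy data: each tuple is (HB residue, nHB residue).
toy_pairs = [("N", "T"), ("N", "T"), ("T", "N"), ("I", "I")]
ratios = pair_propensities(toy_pairs)
print(ratios[("N", "T")], ratios[("T", "N")])
```

Because the keys are ordered tuples, Asn(HB)-Thr(nHB) and Thr(HB)-Asn(nHB) are tallied separately, which is the asymmetry the abstract emphasises.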
Affiliation(s)
- H M Fooks
- School of Animal & Microbial Sciences, University of Reading, Whiteknights, P.O. Box 228, Reading RG6 6AJ, UK

41
Brown WM, Martin S, Chabarek JP, Strauss C, Faulon JL. Prediction of beta-strand packing interactions using the signature product. J Mol Model 2005; 12:355-61. [PMID: 16365772 DOI: 10.1007/s00894-005-0052-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 08/02/2005] [Accepted: 09/23/2005] [Indexed: 10/25/2022]
Abstract
The prediction of beta-sheet topology requires the consideration of long-range interactions between beta-strands that are not necessarily consecutive in sequence. Since these interactions are difficult to simulate using ab initio methods, we propose a supplementary method able to assign beta-sheet topology using only sequence information. We envision using the results of our method to reduce the three-dimensional search space of ab initio methods. Our method is based on the signature molecular descriptor, which has been used previously to predict protein-protein interactions successfully, and to develop quantitative structure-activity relationships for small organic drugs and peptide inhibitors. Here, we show how the signature descriptor can be used in a Support Vector Machine to predict whether or not two beta-strands will pack adjacently within a protein. We then show how these predictions can be used to order beta-strands within beta-sheets. Using the entire PDB database with ten-fold cross-validation, we have achieved 74.0% accuracy in packing prediction and 75.6% accuracy in the prediction of edge strands. For the case of beta-strand ordering, we are able to predict the correct ordering accurately for 51.3% of the beta-sheets. Furthermore, using a simple confidence metric, we can determine those sheets for which accurate predictions can be obtained. For the top 25% highest confidence predictions, we are able to achieve 95.7% accuracy in beta-strand ordering.
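One way pairwise packing predictions could be turned into a strand order is an exhaustive search over linear arrangements. The sketch below is an assumption for illustration, not the procedure used in the paper: the scores are invented, and `best_strand_order` is a hypothetical helper that is only practical for small sheets.

```python
from itertools import permutations


def best_strand_order(n_strands, pack_score):
    """Pick the linear strand order that maximises the summed score of adjacent strands.

    pack_score maps frozenset({i, j}) to a packing score; missing pairs score 0.
    Brute force over all permutations, so suitable only for a handful of strands.
    """
    best_order, best_total = None, float("-inf")
    for order in permutations(range(n_strands)):
        if order[0] > order[-1]:
            continue  # an order and its mirror image describe the same sheet
        total = sum(pack_score.get(frozenset(p), 0.0) for p in zip(order, order[1:]))
        if total > best_total:
            best_order, best_total = order, total
    return best_order, best_total


# Invented pairwise packing scores for a four-stranded sheet.
toy_scores = {
    frozenset({0, 2}): 0.9,
    frozenset({1, 2}): 0.8,
    frozenset({1, 3}): 0.7,
    frozenset({0, 1}): 0.2,
}
order, total = best_strand_order(4, toy_scores)
print(order, round(total, 2))
```

The gap between the best and second-best totals could serve as a crude confidence measure, although the confidence metric mentioned in the abstract is not specified here.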
Affiliation(s)
- W Michael Brown
- Computational Biology 9212, Sandia National Laboratories, P.O. Box 5800, MS 310, Albuquerque, NM 87185, USA

42
Chen Z, Krause G, Reif B. Structure and Orientation of Peptide Inhibitors Bound to Beta-amyloid Fibrils. J Mol Biol 2005; 354:760-76. [PMID: 16271725 DOI: 10.1016/j.jmb.2005.09.055] [Citation(s) in RCA: 56] [Impact Index Per Article: 2.9] [Received: 03/02/2005] [Revised: 08/30/2005] [Accepted: 09/16/2005] [Indexed: 11/30/2022]
Abstract
Polymerization of the soluble beta-amyloid peptide into highly ordered fibrils is hypothesized to be a causative event in the development of Alzheimer's disease. Understanding the interactions of Abeta with inhibitors on an atomic level is fundamental for the development of diagnostics and therapeutic approaches, and can provide, in addition, important indirect information of the amyloid fibril structure. We have shown recently that trRDCs can be measured in solution state NMR for peptide ligands binding weakly to amyloid fibrils. We present here the structures for two inhibitor peptides, LPFFD and DPFFL, and their structural models bound to fibrillar Abeta(14-23) and Abeta(1-40) based on transferred nuclear Overhauser effect (trNOE) and transferred residual dipolar coupling (trRDC) data. In a first step, the inhibitor peptide structure is calculated on the basis of trNOE data; the trRDC data are then validated on the basis of the trNOE-derived structure using the program PALES. The orientation of the peptide inhibitors with respect to Abeta fibrils is obtained from trRDC data, assuming that Abeta fibrils orient such that the fibril axis is aligned in parallel with the magnetic field. The trRDC-derived alignment tensor of the peptide ligand is then used as a restraint for molecular dynamics docking studies. We find that the structure with the lowest rmsd value is in agreement with a model in which the inhibitor peptide binds to the long side of an amyloid fibril. Especially, we detect interactions involving the hydrophobic core, residues K16 and E22/D23 of the Abeta sequence. Structural differences are observed for binding of the inhibitor peptide to Abeta14-23 and Abeta1-40 fibrils, respectively, indicating different fibril structure. We expect this approach to be useful in the rational design of amyloid ligands with improved binding characteristics.
Affiliation(s)
- Zhongjing Chen
- Forschungsinstitut für Molekulare Pharmakologie (FMP), Robert-Rössle-Str. 10, D-13125 Berlin, Germany

43
Jackups R, Liang J. Interstrand Pairing Patterns in β-Barrel Membrane Proteins: The Positive-outside Rule, Aromatic Rescue, and Strand Registration Prediction. J Mol Biol 2005; 354:979-93. [PMID: 16277990 DOI: 10.1016/j.jmb.2005.09.094] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.1] [Received: 07/26/2005] [Revised: 09/23/2005] [Accepted: 09/27/2005] [Indexed: 10/25/2022]
Abstract
beta-Barrel membrane proteins are found in the outer membrane of Gram-negative bacteria, mitochondria, and chloroplasts. Little is known about how residues in membrane beta-barrels interact preferentially with other residues on adjacent strands. We have developed probabilistic models to quantify propensities of residues for different spatial locations and for interstrand pairwise contact interactions involving strong H-bonds, side-chain interactions, and weak H-bonds. Using the reference state of exhaustive permutation of residues within the same beta-strand, the propensity values and p-values measuring statistical significance are calculated exactly by analytical formulae we have developed. Our findings show that there are characteristic preferences of residues for different membrane locations. Contrary to the "positive-inside" rule for helical membrane proteins, beta-barrel membrane proteins follow a significant albeit weaker "positive-outside" rule, in that the basic residues Arg and Lys are disproportionately favored in the extracellular cap region and disfavored in the periplasmic cap region. We find that different residue pairs prefer strong backbone H-bonded interstrand pairings (e.g. Gly-aromatic) or non-H-bonded pairings (e.g. aromatic-aromatic). In addition, we find that Tyr and Phe participate in aromatic rescue by shielding Gly from polar environments. We also show that these propensities can be used to predict the registration of strand pairs, an important task for the structure prediction of beta-barrel membrane proteins. Our accuracy of 44% is considerably better than random (7%). It also significantly outperforms a comparable registration prediction for soluble beta-sheets under similar conditions. Our results imply several experiments that can help to elucidate the mechanisms of in vitro and in vivo folding of beta-barrel membrane proteins. 
The propensity scales developed in this study will also be useful for computational structure prediction and for folding simulations.
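The idea of using pair propensities to predict strand registration can be sketched as a scan over relative shifts of two antiparallel strands. This is an illustrative assumption rather than the authors' model: the scores are invented, a single score table replaces the separate strong H-bond, side-chain, and weak H-bond terms, and `best_registration` is a hypothetical name.

```python
def best_registration(strand_a, strand_b, pair_score, max_shift=3):
    """Try every shift of strand_b (reversed for antiparallel pairing) against strand_a.

    pair_score maps frozenset of two residue codes to a score; unknown pairs score 0.
    Returns (best_shift, best_score).
    """
    rev_b = strand_b[::-1]
    results = {}
    for shift in range(-max_shift, max_shift + 1):
        total = 0.0
        for i, res_a in enumerate(strand_a):
            j = i + shift
            if 0 <= j < len(rev_b):      # only positions that actually face a partner
                total += pair_score.get(frozenset((res_a, rev_b[j])), 0.0)
        results[shift] = total
    best = max(results, key=results.get)
    return best, results[best]


# Invented scores: reward a Gly-Tyr pairing, in the spirit of the Gly-aromatic preference.
toy_pair_scores = {frozenset(("G", "Y")): 1.0}
print(best_registration("AGA", "YAA", toy_pair_scores))
```

With 2 * max_shift + 1 candidate shifts, a random guess succeeds about one time in that many, which is the kind of baseline against which a registration accuracy would be compared.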
Affiliation(s)
- Ronald Jackups
- Department of Bioengineering, SEO, MC-063, University of Illinois at Chicago, 851 S. Morgan Street, Room 218, Chicago, IL 60607-7052, USA

44
Siepen JA, Radford SE, Westhead DR. Beta edge strands in protein structure prediction and aggregation. Protein Sci 2004; 12:2348-59. [PMID: 14500893 PMCID: PMC2366916 DOI: 10.1110/ps.03234503] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.8] [Indexed: 10/27/2022]
Abstract
It is well established that recognition between exposed edges of beta-sheets is an important mode of protein-protein interaction and can have pathological consequences; for instance, it has been linked to the aggregation of proteins into a fibrillar structure, which is associated with a number of predominantly neurodegenerative disorders. A number of protective mechanisms have evolved in the edge strands of beta-sheets, preventing the aggregation and insolubility of most natural beta-sheet proteins. Such mechanisms are unfavorable in the interior of a beta-sheet. The problem of distinguishing edge strands from central strands based on sequence information alone is important in predicting residues and mutations likely to be involved in aggregation, and is also a first step in predicting folding topology. Here we report support vector machine (SVM) and decision tree methods developed to classify edge strands from central strands in a representative set of protein domains. Interestingly, rules generated by the decision tree method are in close agreement with our knowledge of protein structure and are potentially useful in a number of different biological applications. When trained on strands from proteins of known structure, using structure-based (Dictionary of Secondary Structure in Proteins) strand assignments, both methods achieved mean cross-validated, prediction accuracies of approximately 78%. These accuracies were reduced when strand assignments from secondary structure prediction were used. Further investigation of this effect revealed that it could be explained by a significant reduction in the accuracy of standard secondary structure prediction methods for edge strands, in comparison with central strands.
Affiliation(s)
- Jennifer A Siepen
- School of Biochemistry and Molecular Biology, University of Leeds, Leeds LS2 9JT, UK

45
Petkova AT, Buntkowsky G, Dyda F, Leapman RD, Yau WM, Tycko R. Solid state NMR reveals a pH-dependent antiparallel beta-sheet registry in fibrils formed by a beta-amyloid peptide. J Mol Biol 2004; 335:247-60. [PMID: 14659754 DOI: 10.1016/j.jmb.2003.10.044] [Citation(s) in RCA: 258] [Impact Index Per Article: 12.9] [Indexed: 10/26/2022]
Abstract
We report solid state nuclear magnetic resonance (NMR) measurements that probe the supramolecular organization of beta-sheets in the cross-beta motif of amyloid fibrils formed by residues 11-25 of the beta-amyloid peptide associated with Alzheimer's disease (Abeta(11-25)). Fibrils were prepared at pH 7.4 and pH 2.4. The solid state NMR data indicate that the central hydrophobic segment of Abeta(11-25) (sequence LVFFA) adopts a beta-strand conformation and participates in antiparallel beta-sheets at both pH values, but that the registry of intermolecular hydrogen bonds is pH-dependent. Moreover, both registries determined for Abeta(11-25) fibrils are different from the hydrogen bond registry in the antiparallel beta-sheets of Abeta(16-22) fibrils at pH 7.4 determined in earlier solid state NMR studies. In all three cases, the hydrogen bond registry is highly ordered, with no detectable "registry-shift" defects. These results suggest that the supramolecular organization of beta-sheets in amyloid fibrils is determined by a sensitive balance of multiple side-chain-side-chain interactions. Recent structural models for Abeta(11-25) fibrils based on X-ray fiber diffraction data are inconsistent with the solid state NMR data at both pH values.
Affiliation(s)
- A T Petkova
- Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892-0520, USA

46
Bystroff C, Shao Y, Yuan X. Five Hierarchical Levels of Sequence-Structure Correlation in Proteins. Appl Bioinformatics 2004; 3:97-104. [PMID: 15693735 DOI: 10.2165/00822942-200403020-00004] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/02/2022]
Abstract
This article reviews recent work towards modelling protein folding pathways using a bioinformatics approach. Statistical models have been developed for sequence-structure correlations in proteins at five levels of structural complexity: (i) short motifs; (ii) extended motifs; (iii) nonlocal pairs of motifs; (iv) 3-dimensional arrangements of multiple motifs; and (v) global structural homology. We review statistical models, including sequence profiles, hidden Markov models (HMMs) and interaction potentials, for the first four levels of structural detail. The I-sites (folding Initiation sites) Library models short local structure motifs. Each succeeding level has a statistical model, as follows: HMMSTR (HMM for STRucture) is an HMM for extended motifs; HMMSTR-CM (Contact Maps) is a model for pairwise interactions between motifs; and SCALI-HMM (HMMs for Structural Core ALIgnments) is a set of HMMs for the spatial arrangements of motifs. The parallels between the statistical models and theoretical models for folding pathways are discussed in this article; however, global sequence models are not discussed because they have been extensively reviewed elsewhere. The data used and algorithms presented in this article are available at http://www.bioinfo.rpi.edu/~bystrc/ (click on "servers" or "downloads") or by request to bystrc@rpi.edu.
Affiliation(s)
- Christopher Bystroff
- Biology Department, Rensselaer Polytechnic Institute, 110 8th Street, Troy, NY 12180, USA.

47
Kuwata K, Matumoto T, Cheng H, Nagayama K, James TL, Roder H. NMR-detected hydrogen exchange and molecular dynamics simulations provide structural insight into fibril formation of prion protein fragment 106-126. Proc Natl Acad Sci U S A 2003; 100:14790-5. [PMID: 14657385 PMCID: PMC299804 DOI: 10.1073/pnas.2433563100] [Citation(s) in RCA: 113] [Impact Index Per Article: 5.4] [Indexed: 11/18/2022] Open
Abstract
PrP106-126, a peptide corresponding to residues 107-127 of the human prion protein, induces neuronal cell death by apoptosis and causes proliferation and hypertrophy of glia, reproducing the main neuropathological features of prion-related transmissible spongiform encephalopathies, such as bovine spongiform encephalopathy and Creutzfeldt-Jakob disease. Although PrP106-126 has been shown to form amyloid-like fibrils in vitro, their structural properties have not been elucidated. Here, we investigate the conformational characteristics of a fibril-forming fragment of the mouse prion protein, MoPrP106-126, by using electron microscopy, CD spectroscopy, NMR-detected hydrogen-deuterium exchange measurements, and molecular dynamics simulations. The fibrils contain approximately 50% beta-sheet structure, and strong amide exchange protection is limited to the central portion of the peptide spanning the palindromic sequence VAGAAAAGAV. Molecular dynamics simulations indicate that MoPrP106-126 in water assumes a stable structure consisting of two four-stranded parallel beta-sheets that are tightly packed against each other by methyl-methyl interactions. Fibril formation involving polyalanine stacking is consistent with the experimental observations.
Affiliation(s)
- Kazuo Kuwata
- Department of Biochemistry and Biophysics, School of Medicine, Gifu University, 40 Tsukasa-machi, Gifu 500-8705, Japan.