1
|
Csaba G, Zimmer R. Vorescore--fold recognition improved by rescoring of protein structure models. Bioinformatics 2010; 26:i474-81. [PMID: 20823310 PMCID: PMC2935407 DOI: 10.1093/bioinformatics/btq369] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022] Open
Abstract
Summary: The identification of good protein structure models and their appropriate ranking is a crucial problem in structure prediction and fold recognition. For many alignment methods, rescoring of alignment-induced models using structural information can improve the separation of useful and less useful models as compared with the alignment score. Vorescore, a template-based protein structure model rescoring system, is introduced. The method uses Vorolign to score the model structure against the template used for the modeling. It works on models from different alignment methods and incorporates knowledge from both the prediction method and the rescoring. Results: The performance of Vorescore is evaluated in a large-scale and difficult protein structure prediction context. We use different threading methods to create models for 410 targets, in three scenarios: (i) family members are contained in the template set; (ii) superfamily members (but no family members); and (iii) only fold members (but no family or superfamily members). In all cases Vorescore significantly improves model quality (e.g. by 40% for both Gotoh and HHalign at the fold level), and clearly outperforms the state-of-the-art physics-based model scoring system Rosetta. Moreover, Vorescore improves on other successful rescoring approaches such as Pcons and ProQ. In an additional experiment we add high-quality models based on structural alignments to the set, which allows Vorescore to improve the fold recognition rate by another 50%. Availability: All models of the test set (about 2 million, 44 GB gzipped) are available upon request. Contact: csaba@bio.ifi.lmu.de; ralf.zimmer@ifi.lmu.de
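The abstract does not say how the alignment score and the rescoring score are combined, so the following is only a minimal sketch of one plausible re-ranking scheme, not Vorescore's actual procedure. It assumes that both scores are "higher is better", standardizes each across the candidate models of one target, and ranks by a weighted sum. The function names, the `weight` parameter and the example scores are all hypothetical.

```python
from statistics import mean, pstdev


def zscores(values):
    """Standardize scores so that two different scales become comparable."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def rerank(models, weight=0.5):
    """Rank models by a weighted sum of standardized scores.

    models: list of (name, alignment_score, rescoring_score), higher is better.
    weight: share given to the alignment score (hypothetical parameter).
    """
    align_z = zscores([m[1] for m in models])
    rescore_z = zscores([m[2] for m in models])
    combined = [
        (weight * a + (1 - weight) * r, m[0])
        for m, a, r in zip(models, align_z, rescore_z)
    ]
    return [name for _, name in sorted(combined, reverse=True)]


# Made-up scores for three candidate models of a single target.
models = [
    ("model_a", 120.0, 0.30),
    ("model_b", 95.0, 0.80),
    ("model_c", 60.0, 0.20),
]
ranking = rerank(models, weight=0.5)
```

With `weight=1.0` the ranking reduces to the plain alignment-score order, which makes it easy to see what the structural rescoring changes.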
Affiliation(s)
- Gergely Csaba
- Department of Informatics, Ludwig-Maximilians-Universität München, München, Germany.
|
2
|
Paiardini A, Caputo V. Insights into the interaction of sortilin with proneurotrophins: a computational approach. Neuropeptides 2008; 42:205-14. [PMID: 18191449 DOI: 10.1016/j.npep.2007.11.004] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Received: 07/24/2007] [Revised: 10/29/2007] [Accepted: 11/22/2007] [Indexed: 12/22/2022]
Abstract
Sortilin is a member of the recently discovered family of type-1 transmembrane Vps10p-domain receptors, which are expressed in several tissues, including brain and spinal cord. It has been recently demonstrated that the interaction between sortilin and the N-terminal portion of the precursor forms of the nerve growth factor (pro-NGF) and the brain-derived neurotrophic factor (pro-BDNF) represents a key event in the process that controls neurotrophin-mediated cell survival and death in developing neuronal tissue and post-traumatic neuronal apoptosis. Moreover, it is known that the cleavage of the N-terminal propeptide of sortilin is required for full functional activity of the receptor. The propeptide, indeed, hinders ligands from accessing the binding site of sortilin. However, to date, the molecular mechanism underlying the interaction between sortilin and pro-NGF/pro-BDNF remains unknown. By means of computational approaches, we suggest that the N-terminal Vps10p domain of sortilin, which is responsible for the interaction with the neurotrophins, adopts a beta-propeller fold, and that the N-terminal regions of sortilin, pro-NGF and pro-BDNF are mainly intrinsically disordered regions (IDRs). The following mechanism is therefore proposed: the Vps10p-domain of sortilin is a beta-propeller able to bind its own IDR and the IDRs of neurotrophins. The excision of its N-terminal disordered peptide allows the interaction with the intrinsically disordered N-terminus of pro-BDNF and pro-NGF, possibly through a disorder-to-order transition behaviour.
Affiliation(s)
- Alessandro Paiardini
- Dipartimento di Scienze Biochimiche A. Rossi Fanelli, Università di Roma La Sapienza, Piazzale Aldo Moro 5, Via degli Apuli 9, 00185 Rome, Italy.
|
3
|
Abstract
Most newly sequenced proteins are likely to adopt a similar structure to one which has already been experimentally determined. For this reason, the most successful approaches to protein structure prediction have been template-based methods. Such prediction methods attempt to identify and model the folds of unknown structures by aligning the target sequences to a set of representative template structures within a fold library. In this chapter, I discuss the development of template-based approaches to fold prediction, from the traditional techniques to the recent state-of-the-art methods. I also discuss the recent development of structural annotation databases, which contain models built by aligning the sequences from entire proteomes against known structures. Finally, I run through a practical step-by-step guide for aligning target sequences to known structures and contemplate the future direction of template-based structure prediction.
|
4
|
Xu J, Jiao F, Yu L. Protein structure prediction using threading. Methods Mol Biol 2008; 413:91-121. [PMID: 18075163 DOI: 10.1007/978-1-59745-574-9_4] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 12/15/2022]
Abstract
This chapter discusses the protocol for computational protein structure prediction by protein threading. First, we present a general procedure and summarize some typical ideas for each step of protein threading. Then, we describe the design and implementation of RAPTOR, a protein structure prediction program based on threading. The major focuses are three key components of RAPTOR: a linear programming approach to protein threading, two machine learning approaches (SVM and Gradient Boosting) to fold recognition, and evaluation of the statistical significance of the prediction results. The first part of this chapter is a brief review of protein threading, and the second part contains original research results. Some key ideas and results have been previously published.
Affiliation(s)
- Jinbo Xu
- Toyota Technological Institute at Chicago, Chicago, IL, USA
|
5
|
Goonesekere NCW, Lee B. Context-specific amino acid substitution matrices and their use in the detection of protein homologs. Proteins 2007; 71:910-9. [DOI: 10.1002/prot.21775] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 12/26/2022]
|
6
|
Panteri R, Paiardini A, Keller F. A 3D model of Reelin subrepeat regions predicts Reelin binding to carbohydrates. Brain Res 2006; 1116:222-30. [PMID: 16979599 DOI: 10.1016/j.brainres.2006.07.128] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 02/24/2006] [Revised: 07/10/2006] [Accepted: 07/29/2006] [Indexed: 11/18/2022]
Abstract
Reelin is a large molecule of the extracellular matrix (ECM) which regulates neuronal positioning during the early stages of cortical development in vertebrate species. The Reelin molecule can be subdivided into a smaller N-terminal domain, showing homology with F-spondin, and a larger C-terminal region containing 8 EGF-like repeats. The localization of Reelin in the ECM, its large dimensions and the modular organization of its primary structure led us to suppose a structure of its modules similar to domains commonly found in ECM proteins such as Agrin, laminins and thrombospondins. We therefore performed a sequence alignment and molecular modeling analysis to study the three-dimensional fold of the Reelin subrepeat regions. Our analysis produces a tentative model of the core region of the Reelin subrepeat sequences and suggests the presence in this 3D model of structural features common to polysaccharide-binding modules which are often found on proteoglycans of the ECM. These findings provide a conceptual framework for further experiments aimed at testing the functions of the EGF-like repeat regions of Reelin.
Affiliation(s)
- Roger Panteri
- Laboratory of Developmental Neuroscience, Università Campus Bio-Medico, Via Longoni 83, 00155 Rome, Italy.
|
7
|
Poupon A. Voronoi and Voronoi-related tessellations in studies of protein structure and interaction. Curr Opin Struct Biol 2004; 14:233-41. [PMID: 15093839 DOI: 10.1016/j.sbi.2004.03.010] [Citation(s) in RCA: 102] [Impact Index Per Article: 5.4] [Indexed: 11/25/2022]
Abstract
The three-dimensional structure of a protein can be modeled by a set of polyhedra drawn around its atoms or residues. The tessellation invented by Voronoi in 1908, and other tessellations of space derived from it, provide versatile representations of three-dimensional structures. In recent years, they have been used to investigate a series of issues relating to proteins: atom and residue volumes, packing, folding, interactions and binding.
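To make the idea of a Voronoi cell concrete, here is a minimal sketch that estimates cell volumes numerically: a point belongs to the cell of whichever site it is closest to. The sites stand in for atom or residue centres. The bounding cube, grid resolution and function names are assumptions chosen for illustration, and a real analysis would use an exact tessellation algorithm rather than grid sampling.

```python
import math


def nearest_site(point, sites):
    """Index of the site closest to point (ties go to the lowest index)."""
    return min(range(len(sites)), key=lambda i: math.dist(point, sites[i]))


def voronoi_volumes(sites, lo, hi, steps):
    """Estimate each site's Voronoi cell volume, clipped to the cube [lo, hi]^3.

    The cube is split into steps^3 small boxes, and each box is credited
    to the site nearest to its midpoint.
    """
    h = (hi - lo) / steps
    counts = [0] * len(sites)
    for ix in range(steps):
        for iy in range(steps):
            for iz in range(steps):
                midpoint = (
                    lo + (ix + 0.5) * h,
                    lo + (iy + 0.5) * h,
                    lo + (iz + 0.5) * h,
                )
                counts[nearest_site(midpoint, sites)] += 1
    return [c * h ** 3 for c in counts]


# Two hypothetical sites placed symmetrically inside the unit cube.
sites = [(0.25, 0.5, 0.5), (0.75, 0.5, 0.5)]
volumes = voronoi_volumes(sites, 0.0, 1.0, 10)
```

Because the two sites are mirror images across the plane x = 0.5, each cell takes half of the cube. For sites near the surface of a protein the clipping box dominates the result, which is one reason Voronoi-based studies have to treat surface cells specially.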
Affiliation(s)
- Anne Poupon
- Laboratoire d'Enzymologie et Biochimie Structurales, CNRS Bat 34, 91198 Gif-sur-Yvette, France.
|
8
|
|
9
|
Giordanetto F, Kroemer RT. A three-dimensional model of Suppressor Of Cytokine Signalling 1 (SOCS-1). Protein Eng Des Sel 2003; 16:115-24. [PMID: 12676980 DOI: 10.1093/proeng/gzg015] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.4] [Indexed: 11/12/2022] Open
Abstract
Suppressor Of Cytokine Signalling 1 (SOCS-1) is one of the proteins responsible for the negative regulation of the JAK-STAT pathway triggered by many cytokines. This important inhibition involves complex formation between SOCS-1 and JAK2, which requires particular structural domains (KIR, ESS and SH2) on SOCS-1. A three-dimensional theoretical model of SOCS-1 is presented here. The model was generated by the application of different modelling techniques, including threading, structure-based modelling, surface analysis and protein docking. The structure accounts for the interactions between SOCS-1 and two other key proteins in the JAK-STAT pathway, namely JAK2 and Elongin BC. The proposed model for the interaction between SOCS-1 and JAK2 suggests that SOCS-1 suppresses the kinase activity of JAK2 by obstructing the catalytic groove of the tyrosine kinase. Subsequent interaction of the JAK-SOCS complex with Elongin BC was also modelled. A sequence and structural comparison between the SH2 domain of SOCS-1 and the SH2 domains of other proteins highlights key residues that could be responsible for SOCS-1 specificity. Currently available mutational data are evaluated. The results are consistent with the experimental data and they provide deeper insights into the inhibitory function of SOCS-1 at a molecular level.
Affiliation(s)
- Fabrizio Giordanetto
- Department of Chemistry, Queen Mary, University of London, Mile End Road, London E1 4NS, UK
|
10
|
Giordanetto F, Kroemer RT. Prediction of the structure of human Janus kinase 2 (JAK2) comprising JAK homology domains 1 through 7. Protein Eng Des Sel 2002; 15:727-37. [PMID: 12456871 DOI: 10.1093/protein/15.9.727] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.0] [Indexed: 11/14/2022] Open
Abstract
A theoretical model of human Janus kinase 2 (JAK2) comprising all seven Janus homology domains is presented. The model was generated by application of homology modelling approaches. The three-dimensional structure contains, starting from the N-terminus, FERM (4.1, ezrin, radixin, moesin), SH2 (Src homology region 2), tyrosine kinase-like, and tyrosine kinase domains. The predicted inter-domain orientation in JAK2 is discussed and the currently existing mutational data for Janus kinases are evaluated. Structural details of the SH2 and the FERM domains are presented. The predictions indicate that the SH2 domain is not fully functional. A number of hydrophobic amino acids of the FERM domain that are predicted to be involved in the constitutive association with the cytokine receptors are highlighted. The model gives new insights into the structure-function relationship of this important protein, and areas that could be investigated by mutation studies are highlighted.
Affiliation(s)
- Fabrizio Giordanetto
- Department of Chemistry, Queen Mary and Westfield College, University of London, Mile End Road, London E1 4NS, UK
|
11
|
Abstract
Various bioinformatics problems require optimizing several different properties simultaneously. For example, in the protein threading problem, a scoring function combines the values for different parameters of possible sequence-to-structure alignments into a single score to allow for unambiguous optimization. In this context, an essential question is how each property should be weighted. As the native structures are known for some sequences, a partial ordering on optimal alignments to other structures, e.g., derived from structural comparisons, may be used to adjust the weights. To resolve the arising interdependence of weights and computed solutions, we propose a heuristic approach: iterating the computation of solutions (here, threading alignments) given the weights and the estimation of optimal weights of the scoring function given these solutions via systematic calibration methods. For our application (i.e., threading), this iterative approach results in structurally meaningful weights that significantly improve performance on both the training and the test data sets. In addition, the optimized parameters show significant improvements on the recognition rate for a grossly enlarged comprehensive benchmark, a modified recognition protocol as well as modified alignment types (local instead of global and profiles instead of single sequences). These results show the general validity of the optimized weights for the given threading program and the associated scoring contributions.
Affiliation(s)
- A Zien
- GMD-German National Research Center for Information Technology, Institute for Algorithms and Scientific Computing (SCAI), Sankt Augustin.
|
12
|
|
13
|
von Ohsen N, Zimmer R. Improving Profile-Profile Alignments via Log Average Scoring. Lecture Notes in Computer Science 2001. [DOI: 10.1007/3-540-44696-6_2] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.4] [Indexed: 01/26/2023]
|
14
|
Lindauer K, Loerting T, Liedl KR, Kroemer RT. Prediction of the structure of human Janus kinase 2 (JAK2) comprising the two carboxy-terminal domains reveals a mechanism for autoregulation. Protein Eng 2001; 14:27-37. [PMID: 11287676 DOI: 10.1093/protein/14.1.27] [Citation(s) in RCA: 127] [Impact Index Per Article: 5.5] [Indexed: 11/14/2022]
Abstract
The structure of human Janus kinase 2 (JAK2) comprising the two C-terminal domains (JH1 and JH2) was predicted by application of homology modelling techniques. JH1 and JH2 represent the tyrosine kinase and tyrosine kinase-like domains, respectively, and are crucial for function and regulation of the protein. A comparison between the structures of the two domains is made and structural differences are highlighted. Prediction of the relative orientation of JH1 and JH2 was aided by a newly developed method for the detection of correlated amino acid mutations. Analysis of the interactions between the two domains led to a model for the regulatory effect of JH2 on JH1. The predictions are consistent with available experimental data on JAK2 or related proteins and provide an explanation for inhibition of JH1 tyrosine kinase activity by the adjacent JH2 domain.
Affiliation(s)
- K Lindauer
- Department of Chemistry, Queen Mary and Westfield College, University of London, Mile End Road, London E1 4NS, UK
|
15
|
Mirny LA, Finkelstein AV, Shakhnovich EI. Statistical significance of protein structure prediction by threading. Proc Natl Acad Sci U S A 2000; 97:9978-83. [PMID: 10954732 PMCID: PMC27644 DOI: 10.1073/pnas.160271197] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.1] [Indexed: 11/18/2022] Open
Abstract
In this study, we estimate the statistical significance of structure prediction by threading. We introduce a single parameter epsilon that serves as a universal measure determining the probability that the best alignment is indeed a native-like analog. Parameter epsilon takes into account both length and composition of the query sequence and the number of decoys in threading simulation. It can be computed directly from the query sequence and potential of interactions, eliminating the need for sequence reshuffling and realignment. Although our theoretical analysis is general, here we compare its predictions with the results of gapless threading. Finally we estimate the number of decoys from which the native structure can be found by existing potentials of interactions. We discuss how this analysis can be extended to determine the optimal gap penalties for any sequence-structure alignment (threading) method, thus optimizing it to maximum possible performance.
Affiliation(s)
- L A Mirny
- Department of Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA 02138, USA
|
16
|
Abstract
We present a protein fold-recognition method that uses a comprehensive statistical interpretation of structural Hidden Markov Models (HMMs). The structure/fold recognition is done by summing the probabilities of all sequence-to-structure alignments. The optimal alignment can be defined as the most probable, but suboptimal alignments may have comparable probabilities. These suboptimal alignments can be interpreted as optimal alignments to the "other" structures from the ensemble or optimal alignments under minor fluctuations in the scoring function. Summing probabilities for all alignments gives a complete estimate of sequence-model compatibility. In the case of HMMs that produce a sequence, this reflects the fact that due to our indifference to exactly how the HMM produced the sequence, we should sum over all possibilities. We have built a set of structural HMMs for 188 protein structures and have compared two methods for identifying the structure compatible with a sequence: by the optimal alignment probability and by the total probability. Fold recognition by total probability was 40% more accurate than fold recognition by the optimal alignment probability. Proteins 2000;40:451-462.
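The difference between scoring by the optimal alignment and scoring by the total probability corresponds to the difference between the Viterbi and forward recursions on an HMM. The sketch below illustrates this on a generic two-state toy HMM, not on the structural HMMs of the paper. The state names, the two-letter observation alphabet (H for hydrophobic, P for polar) and all probabilities are made up for illustration.

```python
def viterbi_prob(obs, states, start, trans, emit):
    """Probability of the single most probable state path (optimal alignment)."""
    best = {s: start[s] * emit[s][obs[0]] for s in states}
    for symbol in obs[1:]:
        best = {
            s: max(best[r] * trans[r][s] for r in states) * emit[s][symbol]
            for s in states
        }
    return max(best.values())


def forward_prob(obs, states, start, trans, emit):
    """Total probability of the observations, summed over all state paths."""
    total = {s: start[s] * emit[s][obs[0]] for s in states}
    for symbol in obs[1:]:
        total = {
            s: sum(total[r] * trans[r][s] for r in states) * emit[s][symbol]
            for s in states
        }
    return sum(total.values())


# Hypothetical toy model: two structural states emitting H or P.
states = ("helix", "strand")
start = {"helix": 0.5, "strand": 0.5}
trans = {
    "helix": {"helix": 0.8, "strand": 0.2},
    "strand": {"helix": 0.3, "strand": 0.7},
}
emit = {
    "helix": {"H": 0.4, "P": 0.6},
    "strand": {"H": 0.7, "P": 0.3},
}

obs = "HPHH"
p_best = viterbi_prob(obs, states, start, trans, emit)
p_total = forward_prob(obs, states, start, trans, emit)
```

The two functions differ only in replacing `max` with `sum`. Since the total includes the best path plus every suboptimal one, `p_total` can never be smaller than `p_best`, and the gap grows when many paths have comparable probability.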
Affiliation(s)
- J R Bienkowska
- BioMolecular Engineering Research Center, College of Engineering, Boston University, Boston, Massachusetts 02215, USA.
|
17
|
Kelley LA, MacCallum RM, Sternberg MJ. Enhanced genome annotation using structural profiles in the program 3D-PSSM. J Mol Biol 2000; 299:499-520. [PMID: 10860755 DOI: 10.1006/jmbi.2000.3741] [Citation(s) in RCA: 1198] [Impact Index Per Article: 49.9] [Indexed: 11/22/2022]
Abstract
A method (three-dimensional position-specific scoring matrix, 3D-PSSM) to recognise remote protein sequence homologues is described. The method combines the power of multiple sequence profiles with knowledge of protein structure to provide enhanced recognition and thus functional assignment of newly sequenced genomes. The method uses structural alignments of homologous proteins of similar three-dimensional structure in the structural classification of proteins (SCOP) database to obtain a structural equivalence of residues. These equivalences are used to extend multiply aligned sequences obtained by standard sequence searches. The resulting large superfamily-based multiple alignment is converted into a PSSM. Combined with secondary structure matching and solvation potentials, 3D-PSSM can recognise structural and functional relationships beyond state-of-the-art sequence methods. In a cross-validated benchmark on 136 homologous relationships unambiguously undetectable by position-specific iterated basic local alignment search tool (PSI-Blast), 3D-PSSM can confidently assign 18%. The method was applied to the remaining unassigned regions of the Mycoplasma genitalium genome and an additional 13 regions were assigned with 95% confidence. 3D-PSSM is available to the community as a web server: http://www.bmm.icnet.uk/servers/3dpssm
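The step "the multiple alignment is converted into a PSSM" can be illustrated with a generic log-odds profile. This is a minimal sketch of a plain sequence PSSM only. It does not reproduce 3D-PSSM's structural extension, secondary structure matching or solvation terms. The uniform background, the pseudocount of 1 and the toy alignment are assumptions made for the example.

```python
import math

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(alignment, pseudocount=1.0):
    """Build per-column log-odds scores (base 2) from aligned sequences.

    Assumes a uniform background frequency; gap characters are ignored.
    """
    background = 1.0 / len(ALPHABET)
    pssm = []
    for column in zip(*alignment):
        residues = [c for c in column if c in ALPHABET]
        denom = len(residues) + pseudocount * len(ALPHABET)
        pssm.append({
            aa: math.log2((residues.count(aa) + pseudocount) / denom / background)
            for aa in ALPHABET
        })
    return pssm


def score(sequence, pssm):
    """Ungapped score of a sequence with the same length as the profile."""
    return sum(column[aa] for aa, column in zip(sequence, pssm))


# Hypothetical three-sequence alignment; '-' marks a gap.
alignment = ["ACDE", "ACDF", "GC-E"]
pssm = build_pssm(alignment)
family_score = score("ACDE", pssm)
unrelated_score = score("WWWW", pssm)
```

Residues that are conserved in a column score positive and unseen residues score negative, so a sequence resembling the family scores higher than an unrelated one. Adding structurally aligned superfamily members to `alignment`, as the abstract describes, would broaden the set of residues that each column tolerates.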
Affiliation(s)
- L A Kelley
- Biomolecular Modelling Laboratory, Imperial Cancer Research Fund, 44 Lincoln's Inn Fields, London, WC2A 3PX, England
|
18
|
|