Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Rice DW, Eisenberg D. A 3D-1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence. J Mol Biol 1997;267:1026-38. [PMID: 9135128 DOI: 10.1006/jmbi.1997.0924] [Citation(s) in RCA: 131] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]

For:	Rice DW, Eisenberg D. A 3D-1D substitution matrix for protein fold recognition that includes predicted secondary structure of the sequence. J Mol Biol 1997;267:1026-38. [PMID: 9135128 DOI: 10.1006/jmbi.1997.0924] [Citation(s) in RCA: 131] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/04/2023]

Number

Cited by Other Article(s)

Dunham AS, Beltrao P. Exploring amino acid functions in a deep mutational landscape. Mol Syst Biol 2021;17:e10305. [PMID: 34292650 PMCID: PMC8297461 DOI: 10.15252/msb.202110305] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2021] [Revised: 06/29/2021] [Accepted: 06/30/2021] [Indexed: 12/21/2022] Open

Trivedi R, Nagarajaram HA. Substitution scoring matrices for proteins - An overview. Protein Sci 2020;29:2150-2163. [PMID: 32954566 DOI: 10.1002/pro.3954] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/17/2020] [Revised: 09/17/2020] [Accepted: 09/18/2020] [Indexed: 01/17/2023]

Niu G, Shao Z, Liu C, Chen T, Jiao Q, Hong Z. Comparative and evolutionary analyses of the divergence of plant oligosaccharyltransferase STT3 isoforms. FEBS Open Bio 2020;10:468-483. [PMID: 32011067 PMCID: PMC7050244 DOI: 10.1002/2211-5463.12804] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/24/2019] [Revised: 01/11/2020] [Accepted: 01/30/2020] [Indexed: 11/08/2022] Open

Depth dependent amino acid substitution matrices and their use in predicting deleterious mutations. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2017;128:14-23. [DOI: 10.1016/j.pbiomolbio.2017.02.004] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/13/2016] [Revised: 01/06/2017] [Accepted: 02/07/2017] [Indexed: 12/31/2022]

Chrysostomou C, Seker H. Novel protein weight matrix generated from amino acid indices. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2016;2015:8181-4. [PMID: 26738193 DOI: 10.1109/embc.2015.7320293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Three-dimensional protein structure prediction: Methods and computational strategies. Comput Biol Chem 2014;53PB:251-276. [DOI: 10.1016/j.compbiolchem.2014.10.001] [Citation(s) in RCA: 121] [Impact Index Per Article: 12.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/12/2014] [Revised: 10/03/2014] [Accepted: 10/07/2014] [Indexed: 01/01/2023]

Joseph AP, de Brevern AG. From local structure to a global framework: recognition of protein folds. J R Soc Interface 2014;11:20131147. [PMID: 24740960 DOI: 10.1098/rsif.2013.1147] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022] Open

Improvement in low-homology template-based modeling by employing a model evaluation method with focus on topology. PLoS One 2014;9:e89935. [PMID: 24587135 PMCID: PMC3935967 DOI: 10.1371/journal.pone.0089935] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2013] [Accepted: 01/24/2014] [Indexed: 01/22/2023] Open

Angermüller C, Biegert A, Söding J. Discriminative modelling of context-specific amino acid substitution probabilities. ACTA ACUST UNITED AC 2012;28:3240-7. [PMID: 23080114 DOI: 10.1093/bioinformatics/bts622] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022]

Söding J, Remmert M. Protein sequence comparison and fold recognition: progress and good-practice benchmarking. Curr Opin Struct Biol 2011;21:404-11. [PMID: 21458982 DOI: 10.1016/j.sbi.2011.03.005] [Citation(s) in RCA: 55] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2011] [Revised: 03/01/2011] [Accepted: 03/09/2011] [Indexed: 11/26/2022]

Hu Y, Dong X, Wu A, Cao Y, Tian L, Jiang T. Incorporation of local structural preference potential improves fold recognition. PLoS One 2011;6:e17215. [PMID: 21365008 PMCID: PMC3041821 DOI: 10.1371/journal.pone.0017215] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/01/2010] [Accepted: 01/25/2011] [Indexed: 11/19/2022] Open

Xu HS, Ren WK, Liu XH, Li XQ. Aligning protein sequence and analysing substitution pattern using a class-specific matrix. J Biosci 2011;35:295-314. [PMID: 20689185 DOI: 10.1007/s12038-010-0033-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Shen HD, Tam MF, Huang CH, Chou H, Tai HY, Chen YS, Sheu SY, Thomas WR. Homology modeling and monoclonal antibody binding of the Der f 7 dust mite allergen. Immunol Cell Biol 2010;89:225-30. [DOI: 10.1038/icb.2010.77] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Chen CC, Hwang JK, Yang JM. (PS)2-v2: template-based protein structure prediction server. BMC Bioinformatics 2009;10:366. [PMID: 19878598 PMCID: PMC2775752 DOI: 10.1186/1471-2105-10-366] [Citation(s) in RCA: 93] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/29/2009] [Accepted: 10/31/2009] [Indexed: 03/11/2024] Open

Abstract

Background

Template selection and target-template alignment are critical steps for template-based modeling (TBM) methods. To identify the template for the twilight zone of 15~25% sequence similarity between targets and templates is still difficulty for template-based protein structure prediction. This study presents the (PS)²-v2 server, based on our original server with numerous enhancements and modifications, to improve reliability and applicability.

Results

To detect homologous proteins with remote similarity, the (PS)²-v2 server utilizes the S2A2 matrix, which is a 60 × 60 substitution matrix using the secondary structure propensities of 20 amino acids, and the position-specific sequence profile (PSSM) generated by PSI-BLAST. In addition, our server uses multiple templates and multiple models to build and assess models. Our method was evaluated on the Lindahl benchmark for fold recognition and ProSup benchmark for sequence alignment. Evaluation results indicated that our method outperforms sequence-profile approaches, and had comparable performance to that of structure-based methods on these benchmarks. Finally, we tested our method using the 154 TBM targets of the CASP8 (Critical Assessment of Techniques for Protein Structure Prediction) dataset. Experimental results show that (PS)²-v2 is ranked 6^thamong 72 severs and is faster than the top-rank five serves, which utilize ab initio methods.

Conclusion

Experimental results demonstrate that (PS)²-v2 with the S2A2 matrix is useful for template selections and target-template alignments by blending the amino acid and structural propensities. The multiple-template and multiple-model strategies are able to significantly improve the accuracies for target-template alignments in the twilight zone. We believe that this server is useful in structure prediction and modeling, especially in detecting homologous templates with sequence similarity in the twilight zone.

Collapse

Tångrot JE, Kågström B, Sauer UH. Accurate domain identification with structure-anchored hidden Markov models, saHMMs. Proteins 2009;76:343-52. [PMID: 19173309 DOI: 10.1002/prot.22349] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

Abstract

The ever increasing speed of DNA sequencing widens the discrepancy between the number of known gene products, and the knowledge of their function and structure. Proper annotation of protein sequences is therefore crucial if the missing information is to be deduced from sequence-based similarity comparisons. These comparisons become exceedingly difficult as the pairwise identities drop to very low values. To improve the accuracy of domain identification, we exploit the fact that the three-dimensional structures of domains are much more conserved than their sequences. Based on structure-anchored multiple sequence alignments of low identity homologues we constructed 850 structure-anchored hidden Markov models (saHMMs), each representing one domain family. Since the saHMMs are highly family specific, they can be used to assign a domain to its correct family and clearly distinguish it from domains belonging to other families, even within the same superfamily. This task is not trivial and becomes particularly difficult if the unknown domain is distantly related to the rest of the domain sequences within the family. In a search with full length protein sequences, harbouring at least one domain as defined by the structural classification of proteins database (SCOP), version 1.71, versus the saHMM database based on SCOP version 1.69, we achieve an accuracy of 99.0%. All of the few hits outside the family fall within the correct superfamily. Compared to Pfam_ls HMMs, the saHMMs obtain about 11% higher coverage. A comparison with BLAST and PSI-BLAST demonstrates that the saHMMs have consistently fewer errors per query at a given coverage. Within our recommended E-value range, the same is true for a comparison with SUPERFAMILY. Furthermore, we are able to annotate 232 proteins with 530 nonoverlapping domains belonging to 102 different domain families among human proteins labelled "unknown" in the NCBI protein database. Our results demonstrate that the saHMM database represents a versatile and reliable tool for identification of domains in protein sequences. With the aid of saHMMs, homology on the family level can be assigned, even for distantly related sequences. Due to the construction of the saHMMs, the hits they provide are always associated with high quality crystal structures. The saHMM database can be accessed via the FISH server at http://babel.ucmp.umu.se/fish/.

Collapse

Song CM, Lim SJ, Tong JC. Recent advances in computer-aided drug design. Brief Bioinform 2009;10:579-91. [PMID: 19433475 DOI: 10.1093/bib/bbp023] [Citation(s) in RCA: 152] [Impact Index Per Article: 10.1] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/28/2023] Open

Xia X, Zhang S, Su Y, Sun Z. MICAlign: a sequence-to-structure alignment tool integrating multiple sources of information in conditional random fields. Bioinformatics 2009;25:1433-4. [DOI: 10.1093/bioinformatics/btp251] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Sequence context-specific profiles for homology searching. Proc Natl Acad Sci U S A 2009;106:3770-5. [PMID: 19234132 DOI: 10.1073/pnas.0810767106] [Citation(s) in RCA: 125] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open

PROSIGN: A method for protein secondary structure assignment based on three-dimensional coordinates of consecutive Cα atoms. Comput Biol Chem 2008;32:406-11. [DOI: 10.1016/j.compbiolchem.2008.07.027] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2007] [Revised: 07/15/2008] [Accepted: 07/24/2008] [Indexed: 11/18/2022]

Improved scoring function for comparative modeling using the M4T method. ACTA ACUST UNITED AC 2008;10:95-9. [PMID: 18985440 DOI: 10.1007/s10969-008-9044-9] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2008] [Accepted: 10/16/2008] [Indexed: 10/21/2022]

Gong S, Blundell TL. Discarding functional residues from the substitution table improves predictions of active sites within three-dimensional structures. PLoS Comput Biol 2008;4:e1000179. [PMID: 18833291 PMCID: PMC2527532 DOI: 10.1371/journal.pcbi.1000179] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/25/2008] [Accepted: 08/07/2008] [Indexed: 11/21/2022] Open

Abstract

Substitutions of individual amino acids in proteins may be under very different evolutionary restraints depending on their structural and functional roles. The Environment Specific Substitution Table (ESST) describes the pattern of substitutions in terms of amino acid location within elements of secondary structure, solvent accessibility, and the existence of hydrogen bonds between side chains and neighbouring amino acid residues. Clearly amino acids that have very different local environments in their functional state compared to those in the protein analysed will give rise to inconsistencies in the calculation of amino acid substitution tables. Here, we describe how the calculation of ESSTs can be improved by discarding the functional residues from the calculation of substitution tables. Four categories of functions are examined in this study: protein–protein interactions, protein–nucleic acid interactions, protein–ligand interactions, and catalytic activity of enzymes. Their contributions to residue conservation are measured and investigated. We test our new ESSTs using the program CRESCENDO, designed to predict functional residues by exploiting knowledge of amino acid substitutions, and compare the benchmark results with proteins whose functions have been defined experimentally. The new methodology increases the Z-score by 98% at the active site residues and finds 16% more active sites compared with the old ESST. We also find that discarding amino acids responsible for protein–protein interactions helps in the prediction of those residues although they are not as conserved as the residues of active sites. Our methodology can make the substitution tables better reflect and describe the substitution patterns of amino acids that are under structural restraints only.

Identification of residues responsible for a specific function of a protein can provide clues about the mechanism of action. Computational approaches to identifying functional residues have emerged as low-cost alternatives to experimental methods by providing fast and large-scale analyses. Moreover, the demand for such approaches is increasing as more sequences become available from genome sequencing projects. Here, we focus on the use of CRESCENDO to identify functional residues in proteins of known structure by comparing the amino acid substitutions observed in a family of proteins with those predicted on the basis of the protein structure. CRESCENDO uses Environment Specific Substitution Tables, or ESSTs, which define the way that accepted amino acid substitutions are influenced by the local structural environment. We describe how the calculation of ESSTs can be improved by using only amino acids that are not involved in catalytic activity, metal or ligand binding, nucleic acid or protein interactions, and other molecular functions. Our new substitution table can better describe the degree of amino acids substitutions that are under structural restraints. It should be of value in all applications of ESSTs, including their use in sequence–structure homology recognition, structure validation, and structure prediction in addition to their use in the identification of functional residues. These approaches should enhance the understanding of protein structure and function, which is critically important in the postgenomic era.

Collapse

Bernsel A, Viklund H, Elofsson A. Remote homology detection of integral membrane proteins using conserved sequence features. Proteins 2008;71:1387-99. [PMID: 18076048 DOI: 10.1002/prot.21825] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/05/2022]

Lee MM, Bundschuh R, Chan MK. Distant homology detection using a LEngth and STructure-based sequence Alignment Tool (LESTAT). Proteins 2007;71:1409-19. [DOI: 10.1002/prot.21830] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]

Goonesekere NCW, Lee B. Context-specific amino acid substitution matrices and their use in the detection of protein homologs. Proteins 2007;71:910-9. [DOI: 10.1002/prot.21775] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]

Smith RE, Lovell SC, Burke DF, Montalvao RW, Blundell TL. Andante: reducing side-chain rotamer search space during comparative modeling using environment-specific substitution probabilities. Bioinformatics 2007;23:1099-105. [PMID: 17341496 DOI: 10.1093/bioinformatics/btm073] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open

Baussand J, Deremble C, Carbone A. Periodic distributions of hydrophobic amino acids allows the definition of fundamental building blocks to align distantly related proteins. Proteins 2007;67:695-708. [PMID: 17299747 DOI: 10.1002/prot.21319] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Abstract

Several studies on large and small families of proteins proved in a general manner that hydrophobic amino acids are globally conserved even if they are subjected to high rate substitution. Statistical analysis of amino acids evolution within blocks of hydrophobic amino acids detected in sequences suggests their usage as a basic structural pattern to align pairs of proteins of less than 25% sequence identity, with no need of knowing their 3D structure. The authors present a new global alignment method and an automatic tool for Proteins with HYdrophobic Blocks ALignment (PHYBAL) based on the combinatorics of overlapping hydrophobic blocks. Two substitution matrices modeling a different selective pressure inside and outside hydrophobic blocks are constructed, the Inside Hydrophobic Blocks Matrix and the Outside Hydrophobic Blocks Matrix, and a 4D space of gap values is explored. PHYBAL performance is evaluated against Needleman and Wunsch algorithm run with Blosum 30, Blosum 45, Blosum 62, Gonnet, HSDM, PAM250, Johnson and Remote Homo matrices. PHYBAL behavior is analyzed on eight randomly selected pairs of proteins of >30% sequence identity that cover a large spectrum of structural properties. It is also validated on two large datasets, the 127 pairs of the Domingues dataset with >30% sequence identity, and 181 pairs issued from BAliBASE 2.0 and ranked by percentage of identity from 7 to 25%. Results confirm the importance of considering substitution matrices modeling hydrophobic contexts and a 4D space of gap values in aligning distantly related proteins. Two new notions of local and global stability are defined to assess the robustness of an alignment algorithm and the accuracy of PHYBAL. A new notion, the SAD-coefficient, to assess the difficulty of structural alignment is also introduced. PHYBAL has been compared with Hydrophobic Cluster Analysis and HMMSUM methods.

Collapse

Rai BK, Madrid-Aliste CJ, Fajardo JE, Fiser A. MMM: a sequence-to-structure alignment protocol. Bioinformatics 2006;22:2691-2. [PMID: 16928737 DOI: 10.1093/bioinformatics/btl449] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Rai BK, Fiser A. Multiple mapping method: a novel approach to the sequence-to-structure alignment problem in comparative protein structure modeling. Proteins 2006;63:644-61. [PMID: 16437570 DOI: 10.1002/prot.20835] [Citation(s) in RCA: 59] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]

Xiang Z. Advances in homology protein structure modeling. Curr Protein Pept Sci 2006;7:217-27. [PMID: 16787261 PMCID: PMC1839925 DOI: 10.2174/138920306777452312] [Citation(s) in RCA: 302] [Impact Index Per Article: 16.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022]

Congreve M, Murray CW, Blundell TL. Structural biology and drug discovery. Drug Discov Today 2006;10:895-907. [PMID: 15993809 DOI: 10.1016/s1359-6446(05)03484-7] [Citation(s) in RCA: 163] [Impact Index Per Article: 9.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/30/2022]

Dunbrack RL. Sequence comparison and protein structure prediction. Curr Opin Struct Biol 2006;16:374-84. [PMID: 16713709 DOI: 10.1016/j.sbi.2006.05.006] [Citation(s) in RCA: 119] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/23/2006] [Revised: 03/22/2006] [Accepted: 05/08/2006] [Indexed: 10/24/2022]

Blundell TL, Sibanda BL, Montalvão RW, Brewerton S, Chelliah V, Worth CL, Harmer NJ, Davies O, Burke D. Structural biology and bioinformatics in drug design: opportunities and challenges for target identification and lead discovery. Philos Trans R Soc Lond B Biol Sci 2006;361:413-23. [PMID: 16524830 PMCID: PMC1609333 DOI: 10.1098/rstb.2005.1800] [Citation(s) in RCA: 98] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/12/2022] Open

Litvinov II, Lobanov MY, Mironov AA, Finkelshtein AV, Roytberg MA. Information on the secondary structure improves the quality of protein sequence alignment. Mol Biol 2006. [DOI: 10.1134/s0026893306030149] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]

Fernandez-Fuentes N, Oliva B, Fiser A. A supersecondary structure library and search algorithm for modeling loops in protein structures. Nucleic Acids Res 2006;34:2085-97. [PMID: 16617149 PMCID: PMC1440879 DOI: 10.1093/nar/gkl156] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Abstract

We present a fragment-search based method for predicting loop conformations in protein models. A hierarchical and multidimensional database has been set up that currently classifies 105 950 loop fragments and loop flanking secondary structures. Besides the length of the loops and types of bracing secondary structures the database is organized along four internal coordinates, a distance and three types of angles characterizing the geometry of stem regions. Candidate fragments are selected from this library by matching the length, the types of bracing secondary structures of the query and satisfying the geometrical restraints of the stems and subsequently inserted in the query protein framework where their fit is assessed by the root mean square deviation (r.m.s.d.) of stem regions and by the number of rigid body clashes with the environment. In the final step remaining candidate loops are ranked by a Z-score that combines information on sequence similarity and fit of predicted and observed ϕ/ψ main chain dihedral angle propensities. Confidence Z-score cut-offs were determined for each loop length that identify those predicted fragments that outperform a competitive ab initio method. A web server implements the method, regularly updates the fragment library and performs prediction. Predicted segments are returned, or optionally, these can be completed with side chain reconstruction and subsequently annealed in the environment of the query protein by conjugate gradient minimization. The prediction method was tested on artificially prepared search datasets where all trivial sequence similarities on the SCOP superfamily level were removed. Under these conditions it is possible to predict loops of length 4, 8 and 12 with coverage of 98, 78 and 28% with at least of 0.22, 1.38 and 2.47 Å of r.m.s.d. accuracy, respectively. In a head-to-head comparison on loops extracted from freshly deposited new protein folds the current method outperformed in a ∼5:1 ratio an earlier developed database search method.

Collapse

Qiu J, Elber R. SSALN: an alignment algorithm using structure-dependent substitution matrices and gap penalties learned from structurally aligned protein pairs. Proteins 2006;62:881-91. [PMID: 16385554 DOI: 10.1002/prot.20854] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Chu CK, Feng LL, Wouters MA. Comparison of sequence and structure-based datasets for nonredundant structural data mining. Proteins 2006;60:577-83. [PMID: 16001417 DOI: 10.1002/prot.20505] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Skolnick J. In quest of an empirical potential for protein structure prediction. Curr Opin Struct Biol 2006;16:166-71. [PMID: 16524716 DOI: 10.1016/j.sbi.2006.02.004] [Citation(s) in RCA: 112] [Impact Index Per Article: 6.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/01/2005] [Revised: 02/10/2006] [Accepted: 02/23/2006] [Indexed: 11/19/2022]

Chen Y, Crippen GM. A novel approach to structural alignment using realistic structural and environmental information. Protein Sci 2005;14:2935-46. [PMID: 16260755 PMCID: PMC2253243 DOI: 10.1110/ps.051428205] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]

Johnston CR, Shields DC. A sequence sub-sampling algorithm increases the power to detect distant homologues. Nucleic Acids Res 2005;33:3772-8. [PMID: 16006623 PMCID: PMC1174907 DOI: 10.1093/nar/gki687] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

Sillitoe I, Dibley M, Bray J, Addou S, Orengo C. Assessing strategies for improved superfamily recognition. Protein Sci 2005;14:1800-10. [PMID: 15937274 PMCID: PMC2253352 DOI: 10.1110/ps.041056105] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/25/2022]

Hamelryck T. An amino acid has two sides: A new 2D measure provides a different view of solvent exposure. Proteins 2005;59:38-48. [DOI: 10.1002/prot.20379] [Citation(s) in RCA: 111] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

EvDTree: structure-dependent substitution profiles based on decision tree classification of 3D environments. BMC Bioinformatics 2005;6:4. [PMID: 15638949 PMCID: PMC545998 DOI: 10.1186/1471-2105-6-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/22/2004] [Accepted: 01/10/2005] [Indexed: 12/04/2022] Open

Abstract

Background

Structure-dependent substitution matrices increase the accuracy of sequence alignments when the 3D structure of one sequence is known, and are successful e.g. in fold recognition. We propose a new automated method, EvDTree, based on a decision tree algorithm, for automatic derivation of amino acid substitution probabilities from a set of sequence-structure alignments. The main advantage over other approaches is an unbiased automatic selection of the most informative structural descriptors and associated values or thresholds. This feature allows automatic derivation of structure-dependent substitution scores for any specific set of structures, without the need to empirically determine best descriptors and parameters.

Results

Decision trees for residue substitutions were constructed for each residue type from sequence-structure alignments extracted from the HOMSTRAD database. For each tree cluster, environment-dependent substitution profiles were derived. The resulting structure-dependent substitution scores were assessed using a criterion based on the mean ranking of observed substitution among all possible substitutions and in sequence-structure alignments. The automatically built EvDTree substitution scores provide significantly better results than conventional matrices and similar or slightly better results than other structure-dependent matrices. EvDTree has been applied to small disulfide-rich proteins as a test case to automatically derive specific substitutions scores providing better results than non-specific substitution scores. Analyses of the decision tree classifications provide useful information on the relative importance of different structural descriptors.

Conclusions

We propose a fully automatic method for the classification of structural environments and inference of structure-dependent substitution profiles. We show that this approach is more accurate than existing methods for various applications. The easy adaptation of EvDTree to any specific data set opens the way for class-specific structure-dependent substitution scores which can be used in threading-based remote homology searches.

Collapse

Pei J, Grishin NV. Combining evolutionary and structural information for local protein structure prediction. Proteins 2004;56:782-94. [PMID: 15281130 DOI: 10.1002/prot.20158] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

Abstract

We study the effects of various factors in representing and combining evolutionary and structural information for local protein structural prediction based on fragment selection. We prepare databases of fragments from a set of non-redundant protein domains. For each fragment, evolutionary information is derived from homologous sequences and represented as estimated effective counts and frequencies of amino acids (evolutionary frequencies) at each position. Position-specific amino acid preferences called structural frequencies are derived from statistical analysis of discrete local structural environments in database structures. Our method for local structure prediction is based on ranking and selecting database fragments that are most similar to a target fragment. Using secondary structure type as a local structural property, we test our method in a number of settings. The major findings are: (1) the COMPASS-type scoring function for fragment similarity comparison gives better prediction accuracy than three other tested scoring functions for profile-profile comparison. We show that the COMPASS-type scoring function can be derived both in the probabilistic framework and in the framework of statistical potentials. (2) Using the evolutionary frequencies of database fragments gives better prediction accuracy than using structural frequencies. (3) Finer definition of local environments, such as including more side-chain solvent accessibility classes and considering the backbone conformations of neighboring residues, gives increasingly better prediction accuracy using structural frequencies. (4) Combining evolutionary and structural frequencies of database fragments, either in a linear fashion or using a pseudocount mixture formula, results in improvement of prediction accuracy. Combination at the log-odds score level is not as effective as combination at the frequency level. This suggests that there might be better ways of combining sequence and structural information than the commonly used linear combination of log-odds scores. Our method of fragment selection and frequency combination gives reasonable results of secondary structure prediction tested on 56 CASP5 targets (average SOV score 0.77), suggesting that it is a valid method for local protein structure prediction. Mixture of predicted structural frequencies and evolutionary frequencies improve the quality of local profile-to-profile alignment by COMPASS.

Collapse

Massé N, Ainouze M, Néel B, Wild TF, Buckland R, Langedijk JPM. Measles virus (MV) hemagglutinin: evidence that attachment sites for MV receptors SLAM and CD46 overlap on the globular head. J Virol 2004;78:9051-63. [PMID: 15308701 PMCID: PMC506930 DOI: 10.1128/jvi.78.17.9051-9063.2004] [Citation(s) in RCA: 73] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/19/2004] [Accepted: 04/20/2004] [Indexed: 11/20/2022] Open

Sadreyev RI, Grishin NV. Estimates of statistical significance for comparison of individual positions in multiple sequence alignments. BMC Bioinformatics 2004;5:106. [PMID: 15296518 PMCID: PMC516024 DOI: 10.1186/1471-2105-5-106] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/01/2004] [Accepted: 08/05/2004] [Indexed: 11/17/2022] Open

Przybylski D, Rost B. Improving Fold Recognition Without Folds. J Mol Biol 2004;341:255-69. [PMID: 15312777 DOI: 10.1016/j.jmb.2004.05.041] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/03/2004] [Revised: 05/18/2004] [Accepted: 05/18/2004] [Indexed: 11/21/2022]

Zhang Z, Kochhar S, Grigorov M. Exploring the sequence-structure protein landscape in the glycosyltransferase family. Protein Sci 2004;12:2291-302. [PMID: 14500887 PMCID: PMC2366918 DOI: 10.1110/ps.03131303] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]

Cherkasov A, Jones SJM. An approach to large scale identification of non-obvious structural similarities between proteins. BMC Bioinformatics 2004;5:61. [PMID: 15147578 PMCID: PMC434491 DOI: 10.1186/1471-2105-5-61] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/02/2004] [Accepted: 05/17/2004] [Indexed: 11/13/2022] Open

Cherkasov A, Jones SJM. Structural characterization of genomes by large scale sequence-structure threading. BMC Bioinformatics 2004;5:37. [PMID: 15061866 PMCID: PMC419331 DOI: 10.1186/1471-2105-5-37] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2003] [Accepted: 04/03/2004] [Indexed: 12/02/2022] Open

Wallner B, Fang H, Ohlson T, Frey-Skött J, Elofsson A. Using evolutionary information for the query and target improves fold recognition. Proteins 2004;54:342-50. [PMID: 14696196 DOI: 10.1002/prot.10565] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]