Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Kundrotas PJ, Alexov EG. Predicting residue contacts using pragmatic correlated mutations method: reducing the false positives. BMC Bioinformatics 2006;7:503. [PMID: 17109752 DOI: 10.1186/1471-2105-7-503] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Received: 05/19/2006] [Accepted: 11/16/2006] [Indexed: 11/10/2022]

Cited by Other Article(s)

Subramanian K, Mitusińska K, Raedts J, Almourfi F, Joosten HJ, Hendriks S, Sedelnikova SE, Kengen SWM, Hagen WR, Góra A, Martins Dos Santos VAP, Baker PJ, van der Oost J, Schaap PJ. Distant Non-Obvious Mutations Influence the Activity of a Hyperthermophilic Pyrococcus furiosus Phosphoglucose Isomerase. Biomolecules 2019;9:biom9060212. [PMID: 31159273 PMCID: PMC6627849 DOI: 10.3390/biom9060212] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 03/01/2019] [Revised: 05/20/2019] [Accepted: 05/28/2019] [Indexed: 01/06/2023]

Affiliation(s)

Kalyanasundaram Subramanian: Laboratory of Systems and Synthetic Biology, Wageningen University, Stippeneng 4, 6708 WE Wageningen, The Netherlands.
Karolina Mitusińska: Biotechnology Center, Silesian University of Technology, ul. Krzywoustego 8, 44-100 Gliwice, Poland; Faculty of Chemistry, Silesian University of Technology, ul. Strzody 9, 44-100 Gliwice, Poland.
John Raedts: Laboratory of Microbiology, Wageningen University, Stippeneng 4, 6708 WE Wageningen, The Netherlands.
Feras Almourfi: Saudi Human Genome Project, National Center of Genome Technology, King Abdulaziz City for Science and Technology (KACST), Riyadh 11442, Saudi Arabia.
Henk-Jan Joosten: Bio-Prodict, Nieuwe Marktstraat 54E, 6511 AA Nijmegen, The Netherlands.
Sjon Hendriks: Laboratory of Microbiology, Wageningen University, Stippeneng 4, 6708 WE Wageningen, The Netherlands.
Svetlana E Sedelnikova: The Krebs Institute for Biomolecular Research, Department of Molecular Biology and Biotechnology, University of Sheffield, Sheffield S10 2TN, UK.
Servé W M Kengen: Laboratory of Microbiology, Wageningen University, Stippeneng 4, 6708 WE Wageningen, The Netherlands.
Wilfred R Hagen: Department of Biotechnology, Delft University of Technology, Van der Maasweg 9, 2629 HZ Delft, The Netherlands.
Artur Góra: Biotechnology Center, Silesian University of Technology, ul. Krzywoustego 8, 44-100 Gliwice, Poland.
Vitor A P Martins Dos Santos: Laboratory of Systems and Synthetic Biology, Wageningen University, Stippeneng 4, 6708 WE Wageningen, The Netherlands.
Patrick J Baker: The Krebs Institute for Biomolecular Research, Department of Molecular Biology and Biotechnology, University of Sheffield, Sheffield S10 2TN, UK.
John van der Oost: Laboratory of Microbiology, Wageningen University, Stippeneng 4, 6708 WE Wageningen, The Netherlands.
Peter J Schaap: Laboratory of Systems and Synthetic Biology, Wageningen University, Stippeneng 4, 6708 WE Wageningen, The Netherlands.

Bordner AJ, Mittelmann HD. A new formulation of protein evolutionary models that account for structural constraints. Mol Biol Evol 2013;31:736-49. [PMID: 24307688 DOI: 10.1093/molbev/mst240] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.5] [Indexed: 01/23/2023]

Abstract

Despite the importance of a thermodynamically stable structure with a conserved fold for protein function, almost all evolutionary models neglect site-site correlations that arise from physical interactions between neighboring amino acid sites. This is mainly due to the difficulty in formulating a computationally tractable model since rate matrices can no longer be used. Here, we introduce a general framework, based on factor graphs, for constructing probabilistic models of protein evolution with site interdependence. Conveniently, efficient approximate inference algorithms, such as Belief Propagation, can be used to calculate likelihoods for these models. We fit an amino acid substitution model of this type that accounts for both solvent accessibility and site-site correlations. Comparisons of the new model with rate matrix models and alternative structure-dependent models demonstrate that it better fits the sequence data. We also examine evolution within a family of homohexameric enzymes and find that site-site correlations between most contacting subunits contribute to a higher likelihood. In addition, we show that the new substitution model has a similar mathematical form to the one introduced in Rodrigue et al. (Rodrigue N, Lartillot N, Bryant D, Philippe H. 2005. Site interdependence attributed to tertiary structure in amino acid sequence evolution. Gene 347:207-217), although with different parameter interpretations and values. We also perform a statistical analysis of the effects of amino acids at neighboring sites on substitution probabilities and find a significant perturbation of most probabilities, further supporting the significant role of site-site interactions in protein evolution and motivating the development of new evolutionary models similar to the one described here. Finally, we discuss possible extensions and applications of the new substitution model.

Ding W, Xie J, Dai D, Zhang H, Xie H, Zhang W. CNNcon: improved protein contact maps prediction using cascaded neural networks. PLoS One 2013;8:e61533. [PMID: 23626696 PMCID: PMC3634008 DOI: 10.1371/journal.pone.0061533] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 12/17/2012] [Accepted: 03/11/2013] [Indexed: 11/18/2022]

Abstract

BACKGROUND

Despite continuing progress in X-ray crystallography and high-field NMR spectroscopy for determining three-dimensional protein structures, the number of unsolved and newly discovered sequences grows much faster than the number of determined structures. With advances in computational science, protein modeling methods may bridge this large sequence-structure gap. A grand challenge is to predict a protein's three-dimensional structure from its primary structure (residue sequence) alone. Predicting residue contact maps is a crucial and promising intermediate step towards that goal. Better predictions of local and non-local contacts between residues can turn protein sequence alignment into structure alignment, which in turn can greatly improve template-based three-dimensional protein structure predictors.

METHODS

This paper presents CNNcon, an improved contact map predictor built from multiple neural networks: six sub-networks and one final cascade-network. Both the sub-networks and the final cascade-network were trained and tested with their corresponding data sets. For testing, the target protein was first encoded and then passed to its corresponding sub-networks for prediction. The intermediate results were then passed to the cascade-network to produce the final prediction.

RESULTS

CNNcon correctly predicts, on average, 58.86% of contacts at a distance cutoff of 8 Å for proteins with lengths ranging from 51 to 450. The comparison results show that the present method performs better than the state-of-the-art predictors it was compared with. In particular, the prediction accuracy remains steady as protein sequence length increases. This indicates that CNNcon overcomes the thin density problem, with which other current predictors have trouble. This advantage makes the method valuable for predicting contacts in long proteins.
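
As an illustration of the contact definition mentioned in the RESULTS above, the following minimal Python sketch marks two residues as in contact when their representative atoms lie within 8 Å of each other. The function name `contact_map`, the `min_separation` parameter, and the use of a single coordinate per residue (for example Cα or Cβ) are assumptions made for illustration; the abstract does not state which atoms are used, and this is not the CNNcon code.

```python
import math


def contact_map(coords, cutoff=8.0, min_separation=1):
    """Return the set of residue index pairs (i, j), i < j, that are in contact.

    coords: one (x, y, z) tuple per residue, in angstroms (assumed to be a
            representative atom such as C-alpha or C-beta).
    cutoff: distance threshold in angstroms; pairs closer than this count.
    min_separation: minimum sequence separation j - i to consider
                    (hypothetical parameter, useful to skip trivial neighbours).
    """
    contacts = set()
    n = len(coords)
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) < cutoff:
                contacts.add((i, j))
    return contacts


# Toy example: four residues on a straight line, spaced 5 angstroms apart.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (15.0, 0.0, 0.0)]
print(sorted(contact_map(coords)))  # [(0, 1), (1, 2), (2, 3)]
```

Only sequence neighbours fall under the cutoff in this toy chain; raising `min_separation` restricts the map to non-local contacts, which are the ones that matter most for fold prediction.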

Statistical Analysis of Terminal Extensions of Protein β-Strand Pairs. Adv Bioinformatics 2013;2013:909436. [PMID: 23424587 PMCID: PMC3569888 DOI: 10.1155/2013/909436] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/15/2012] [Revised: 12/30/2012] [Accepted: 12/30/2012] [Indexed: 11/17/2022]

Zhang N, Feng Y, Gao S, Ruan J, Zhang T. New insights regarding protein folding as learned from beta-sheets. EXCLI Journal 2012;11:543-55. [PMID: 27540347 PMCID: PMC4983712] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2012] [Accepted: 08/22/2012] [Indexed: 10/28/2022]

Abstract

The folding of denatured proteins into their native conformations is called Anfinsen's dogma, and is the rationale for predicting protein structures based on primary sequences. Through the last 40 years of study, all available algorithms which either predict 3D or 2D protein structures, or predict the rate of protein folding based on the amino acid sequence alone, are limited in accuracy (80 %). This fact has led some researchers to look for the lost information, from mRNA to protein sequences, and it encourages us to rethink the rationale of Anfinsen's dogma. In this study, we focus on the relationship between the strand and its partners. We find two rules based on a non-redundant dataset taken from the PDB database. We refer to these two rules as the "first coming first pairing" rule and the "loveless" rule. The first coming first pairing rule indicates that a given strand prefers to pair with the next strand, if the connected region is flexible enough. The loveless rule means that the affinities between a given strand and another strand are comparable to the affinity between the given strand and its partner. Of course, the affinities between the given strand and a helix/coil peptide are significantly less than the affinity between the given strand and its partner. These two rules suggest that in protein folding, we have folding taking place during translation, and suggest also that a denatured protein is not the same as its primary sequence. Rechecking the original Anfinsen experiments, we find that the method used to denature protein in the experiment simply breaks the disulfide bonds, while the helices and sheets remain intact. In other words, denatured proteins still retain all helices and beta sheets, while the primary sequence does not. Although further verification via biological experiments is needed, our results as shown in this study may reveal a new insight for studying protein folding.

Biniashvili T, Schreiber E, Kliger Y. Improving Classical Substructure-Based Virtual Screening to Handle Extrapolation Challenges. J Chem Inf Model 2012;52:678-85. [DOI: 10.1021/ci200472s] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]

Livesay DR, Kreth KE, Fodor AA. A critical evaluation of correlated mutation algorithms and coevolution within allosteric mechanisms. Methods Mol Biol 2012;796:385-398. [PMID: 22052502 DOI: 10.1007/978-1-61779-334-9_21] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 05/31/2023]

Henriksen SB, Mortensen RJ, Geertz-Hansen HM, Neves-Petersen MT, Arnason O, Söring J, Petersen SB. Hyperdimensional analysis of amino acid pair distributions in proteins. PLoS One 2011;6:e25638. [PMID: 22174733 PMCID: PMC3235099 DOI: 10.1371/journal.pone.0025638] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/26/2011] [Accepted: 09/08/2011] [Indexed: 01/06/2023]

Abstract

Our manuscript presents a novel approach to protein structure analysis. We have organized an 8-dimensional data cube with protein 3D-structural information from 8706 high-resolution non-redundant protein chains with the aim of identifying packing rules at the amino acid pair level. The cube contains information about amino acid type, solvent accessibility, spatial and sequence distance, secondary structure and sequence length. We are able to pose structural queries to the data cube using the program ProPack. The response is a 1D, 2D or 3D graph. Although the response is of a statistical nature, the user can obtain an instant list of all PDB structures in which such a pair is found. The user may select a particular structure, which is displayed with the pair in question highlighted. The user may pose millions of different queries, and each one is answered in a few seconds. To demonstrate the capabilities of the data cube and the programs, we have selected well-known structural features, disulphide bridges and salt bridges, and illustrate how the queries are posed and how answers are given. Motifs involving cysteines, such as disulphide bridges, zinc-fingers and iron-sulfur clusters, are clearly identified and differentiated. ProPack also reveals that whereas pairs of Lys residues virtually never appear in close spatial proximity, pairs of Arg are abundant and appear at close spatial distance, which contrasts with the belief that electrostatic repulsion would prevent this juxtaposition and with the perception that Arg-Lys is a conservative mutation. The presented programs can find and visualize novel packing preferences in protein structures, allowing the user to unravel correlations between pairs of amino acids. The new tools allow the user to view statistical information and instantly visualize the structures that underpin it, which is far from trivial with most other software tools for protein structure analysis.

Wu S, Szilagyi A, Zhang Y. Improving protein structure prediction using multiple sequence-based contact predictions. Structure 2011;19:1182-91. [PMID: 21827953 DOI: 10.1016/j.str.2011.05.004] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Received: 02/05/2011] [Revised: 04/13/2011] [Accepted: 05/12/2011] [Indexed: 11/25/2022]

Eickholt J, Wang Z, Cheng J. A conformation ensemble approach to protein residue-residue contact. BMC Structural Biology 2011;11:38. [PMID: 21989082 PMCID: PMC3200154 DOI: 10.1186/1472-6807-11-38] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 06/30/2011] [Accepted: 10/12/2011] [Indexed: 11/20/2022]

Monastyrskyy B, Fidelis K, Tramontano A, Kryshtafovych A. Evaluation of residue-residue contact predictions in CASP9. Proteins 2011;79 Suppl 10:119-25. [PMID: 21928322 DOI: 10.1002/prot.23160] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Received: 05/06/2011] [Revised: 06/25/2011] [Accepted: 07/27/2011] [Indexed: 01/03/2023]

Use of mutual information arrays to predict coevolving sites in the full length HIV gp120 protein for subtypes B and C. Virol Sin 2011;26:95-104. [PMID: 21468932 DOI: 10.1007/s12250-011-3188-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/31/2011] [Accepted: 02/22/2011] [Indexed: 10/18/2022]

A Consensus Approach to Predicting Protein Contact Map via Logistic Regression. Bioinformatics Research and Applications 2011. [DOI: 10.1007/978-3-642-21260-4_16] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 01/08/2023]

Kowarsch A, Fuchs A, Frishman D, Pagel P. Correlated mutations: a hallmark of phenotypic amino acid substitutions. PLoS Comput Biol 2010;6. [PMID: 20862353 PMCID: PMC2940720 DOI: 10.1371/journal.pcbi.1000923] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.8] [Received: 09/25/2009] [Accepted: 08/09/2010] [Indexed: 11/18/2022]

Abstract

Point mutations resulting in the substitution of a single amino acid can cause severe functional consequences, but can also be completely harmless. Understanding what determines the phenotypic impact is important both for planning targeted mutation experiments in the laboratory and for analyzing naturally occurring mutations found in patients. Common wisdom suggests using the extent of evolutionary conservation of a residue or a sequence motif as an indicator of its functional importance and thus its vulnerability to mutation. In this work, we put forward the hypothesis that, in addition to conservation, co-evolution of residues in a protein influences the likelihood that a residue is functionally important and thus associated with disease. While the basic idea of a relation between co-evolution and functional sites has been explored before, we have conducted the first systematic and comprehensive analysis of point mutations causing disease in humans with respect to correlated mutations. We included 14,211 distinct positions with known disease-causing point mutations in 1,153 human proteins in our analysis. Our data show that (1) correlated positions are significantly more likely to be disease-associated than expected by chance, and that (2) this signal cannot be explained by conservation patterns of individual sequence positions. Although correlated residues have primarily been used to predict contact sites, our data are in agreement with previous observations that (3) many such correlations do not relate to physical contacts between amino acid residues. Access to our analysis results is provided at http://webclu.bio.wzw.tum.de/~pagel/supplements/correlated-positions/.

Point mutations (i.e., changes of a single sequence element) can have a severe impact on protein function. Many diseases are caused by such minute defects. On the other hand, the majority of such mutations do not lead to noticeable effects. Although previous research has revealed important aspects that influence or predict the chance of a mutation causing disease, much remains to be learned before we fully understand this complex problem. In our work, we use the observation that certain positions in a protein sometimes mutate in an apparently correlated fashion, and we analyze this correlation with respect to mutation vulnerability. Our results show that positions exhibiting evolutionary correlation are significantly more likely to be vulnerable to mutation than average positions. On the one hand, our data further support the concept (introduced by others) that correlated positions are associated not only with protein contacts but also with functional sites and/or disease positions. On the other hand, this could be useful for further improving the understanding and prediction of the consequences of mutations. Our work is the first to attempt a large-scale quantitation of this relationship.
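
To make the notion of positions that "mutate in an apparently correlated fashion" concrete, the sketch below scores a pair of alignment columns by their mutual information. This is one common correlated-mutation measure and is chosen here only as an assumption: the abstract above does not say which measure the authors used. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column(msa, i):
    """Return column i of a multiple sequence alignment given as equal-length strings."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    Frequencies are plain counts with no pseudocounts, gap handling or
    phylogenetic correction, so this is an illustration rather than a
    production-quality coevolution score.
    """
    n = len(msa)
    col_i, col_j = column(msa, i), column(msa, j)
    count_i, count_j = Counter(col_i), Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in count_ij.items():
        p_ab = count / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment: columns 0 and 1 always change together, column 2 is conserved.
msa = ["AKL", "AKL", "GEL", "GEL"]
print(mutual_information(msa, 0, 1))  # 1.0
print(mutual_information(msa, 0, 2))  # 0.0
```

A fully conserved column carries no mutual information with any other column, which is one reason correlation and conservation can be treated as separate signals.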

Fromer M, Linial M. Exposing the co-adaptive potential of protein-protein interfaces through computational sequence design. Bioinformatics 2010;26:2266-72. [PMID: 20679332 DOI: 10.1093/bioinformatics/btq412] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/27/2022]

Tress ML, Valencia A. Predicted residue-residue contacts can help the scoring of 3D models. Proteins 2010;78:1980-91. [PMID: 20408174 DOI: 10.1002/prot.22714] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Indexed: 11/06/2022]

Rajgaria R, Wei Y, Floudas CA. Contact prediction for beta and alpha-beta proteins using integer linear optimization and its impact on the first principles 3D structure prediction method ASTRO-FOLD. Proteins 2010;78:1825-46. [PMID: 20225257 PMCID: PMC2858251 DOI: 10.1002/prot.22696] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 12/24/2022]

Karakaş M, Woetzel N, Meiler J. BCL::contact-low confidence fold recognition hits boost protein contact prediction and de novo structure determination. J Comput Biol 2010;17:153-68. [PMID: 19772383 DOI: 10.1089/cmb.2009.0030] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022]

Ashkenazy H, Kliger Y. Reducing phylogenetic bias in correlated mutation analysis. Protein Eng Des Sel 2010;23:321-6. [PMID: 20067922 DOI: 10.1093/protein/gzp078] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Indexed: 11/14/2022]

Kuipers RKP, Joosten HJ, Verwiel E, Paans S, Akerboom J, van der Oost J, Leferink NGH, van Berkel WJH, Vriend G, Schaap PJ. Correlated mutation analyses on super-family alignments reveal functionally important residues. Proteins 2009;76:608-16. [DOI: 10.1002/prot.22374] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Indexed: 11/09/2022]

Xue B, Faraggi E, Zhou Y. Predicting residue-residue contact maps by a two-layer, integrated neural-network method. Proteins 2009;76:176-83. [PMID: 19137600 DOI: 10.1002/prot.22329] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.3] [Indexed: 11/07/2022]

Tegge AN, Wang Z, Eickholt J, Cheng J. NNcon: improved protein contact map prediction using 2D-recursive neural networks. Nucleic Acids Res 2009;37:W515-8. [PMID: 19420062 PMCID: PMC2703959 DOI: 10.1093/nar/gkp305] [Citation(s) in RCA: 110] [Impact Index Per Article: 7.3] [Received: 01/30/2009] [Revised: 04/13/2009] [Accepted: 04/16/2009] [Indexed: 11/13/2022]

Chakrabarti S, Panchenko AR. Coevolution in defining the functional specificity. Proteins 2009;75:231-40. [PMID: 18831050 DOI: 10.1002/prot.22239] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.6] [Indexed: 01/13/2023]

Gao X, Bu D, Xu J, Li M. Improving consensus contact prediction via server correlation reduction. BMC Structural Biology 2009;9:28. [PMID: 19419562 PMCID: PMC2689239 DOI: 10.1186/1472-6807-9-28] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Received: 11/25/2008] [Accepted: 05/06/2009] [Indexed: 11/10/2022]

Abstract

Background

Protein inter-residue contacts play a crucial role in the determination and prediction of protein structures. Previous studies on contact prediction indicate that although template-based consensus methods outperform sequence-based methods on targets with typical templates, such consensus methods perform poorly on new fold targets. However, we find that even for new fold targets, the models generated by threading programs can contain many true contacts. The challenge is how to identify them.

Results

In this paper, we develop an integer linear programming model for consensus contact prediction. In contrast to the simple majority voting method assuming that all the individual servers are equally important and independent, the newly developed method evaluates their correlation by using maximum likelihood estimation and extracts independent latent servers from them by using principal component analysis. An integer linear programming method is then applied to assign a weight to each latent server to maximize the difference between true contacts and false ones. The proposed method is tested on the CASP7 data set. If the top L/5 predicted contacts are evaluated where L is the protein size, the average accuracy is 73%, which is much higher than that of any previously reported study. Moreover, if only the 15 new fold CASP7 targets are considered, our method achieves an average accuracy of 37%, which is much better than that of the majority voting method, SVM-LOMETS, SVM-SEQ, and SAM-T06. These methods demonstrate an average accuracy of 13.0%, 10.8%, 25.8% and 21.2%, respectively.

Conclusion

Reducing server correlation and optimally combining independent latent servers yield a significant improvement over traditional consensus methods. This approach can hopefully provide a powerful tool for protein structure refinement and prediction.
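
The "top L/5" evaluation quoted in the Results above can be sketched as follows: rank the predicted residue pairs by score, keep the L/5 highest-scoring ones (L being the protein length), and report the fraction that are true contacts. The function name, the dictionary-based input format and the toy numbers are assumptions for illustration, and the sketch covers only the evaluation metric, not the integer linear programming consensus method itself.

```python
def top_l_over_k_accuracy(scores, true_contacts, length, k=5):
    """Accuracy of the top length // k predicted contacts.

    scores: dict mapping a residue pair (i, j) to a predicted contact score.
    true_contacts: set of residue pairs that are contacts in the native structure.
    length: protein length L.
    k: divisor, so k=5 evaluates the top L/5 predictions.
    """
    n_top = max(1, length // k)
    # Highest-scoring pairs first; keep only the top n_top of them.
    ranked = sorted(scores, key=scores.get, reverse=True)[:n_top]
    if not ranked:
        return 0.0
    hits = sum(1 for pair in ranked if pair in true_contacts)
    # Here accuracy is hits divided by the number of pairs actually evaluated.
    return hits / len(ranked)


# Toy example: a 10-residue protein, so the top L/5 = 2 predictions are scored.
scores = {(0, 5): 0.9, (1, 7): 0.8, (2, 9): 0.3}
true_contacts = {(0, 5), (2, 9)}
print(top_l_over_k_accuracy(scores, true_contacts, length=10))  # 0.5
```

Because only the highest-ranked pairs are scored, a low-ranked true contact such as (2, 9) in the toy data does not improve the result.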

Samsonov SA, Teyra J, Anders G, Pisabarro MT. Analysis of the impact of solvent on contacts prediction in proteins. BMC Structural Biology 2009;9:22. [PMID: 19368710 PMCID: PMC2676287 DOI: 10.1186/1472-6807-9-22] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 08/15/2008] [Accepted: 04/15/2009] [Indexed: 11/10/2022]

Björkholm P, Daniluk P, Kryshtafovych A, Fidelis K, Andersson R, Hvidsten TR. Using multi-data hidden Markov models trained on local neighborhoods of protein structure to predict residue-residue contacts. Bioinformatics 2009;25:1264-70. [PMID: 19289446 DOI: 10.1093/bioinformatics/btp149] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.3] [Indexed: 11/13/2022]

Samsonov S, Teyra J, Pisabarro MT. A molecular dynamics approach to study the importance of solvent in protein interactions. Proteins 2009;73:515-25. [PMID: 18452208 DOI: 10.1002/prot.22076] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Indexed: 11/09/2022]

Abstract

Water constitutes the cellular environment for biomolecules to interact. Solvent is important for protein folding and stability, and it is also known to actively participate in many catalytic processes in the cell. However, solvent is often ignored in molecular recognition and not taken into account in protein-protein interaction studies and rational design. Previously we developed SCOWLP, a database and its web application (http://www.scowlp.org), to perform studies on the contribution of solvent to protein interface definition in all protein complexes of the PDB. We introduced the concept of wet spots, interfacial residues interacting only through one water molecule, which were shown to considerably enrich protein interface descriptions. Analysis of interfacial solvent in a nonredundant dataset of protein complexes suggested the importance of including interfacial water molecules in protein interaction studies. In this work we use a molecular dynamics approach to gain deeper insights into solvent contribution to protein interfaces. We characterize the dynamic and energetic properties of water-mediated protein interactions by comparing different interfacial interaction types (direct, dual and wet spot) at residue and solvent level. For this purpose, we perform an analysis of 17 representative complexes from two protein families of different interface nature. Energetically, wet spots are quantitatively comparable to other residues in interfaces, and their mobility is shown to be lower than that of protein surface residues. The residence time of water molecules in wet-spot sites is longer than that of those on the surface of the protein. In terms of free energy, though wet-spot-forming water molecules are very heterogeneous, their contribution to the free energy of complex formation is considerable. We find that water molecules can play an important role in interaction conservation in protein interfaces by allowing sequence variability in the corresponding binding partner, and we discuss the important implications of our observations related to the use of the correlated mutations concept in protein interaction studies. The results obtained in this work help to deepen our understanding of the physico-chemical nature underlying protein-protein interactions and strengthen the idea of using the wet spots concept to qualitatively improve the accuracy of folding, docking and rational design algorithms.

Fatakia SN, Costanzi S, Chow CC. Computing highly correlated positions using mutual information and graph theory for G protein-coupled receptors. PLoS One 2009;4:e4681. [PMID: 19262747 PMCID: PMC2650788 DOI: 10.1371/journal.pone.0004681] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Received: 10/08/2008] [Accepted: 01/07/2009] [Indexed: 01/06/2023]

Rajgaria R, McAllister SR, Floudas CA. Towards accurate residue-residue hydrophobic contact prediction for alpha helical proteins via integer linear optimization. Proteins 2009;74:929-47. [PMID: 18767158 DOI: 10.1002/prot.22202] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 12/22/2022]

Fuchs A, Kirschner A, Frishman D. Prediction of helix-helix contacts and interacting helices in polytopic membrane proteins using neural networks. Proteins 2009;74:857-71. [PMID: 18704938 DOI: 10.1002/prot.22194] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.4] [Indexed: 11/07/2022]

Abstract

Despite rapidly increasing numbers of available 3D structures, membrane proteins still account for less than 1% of all structures in the Protein Data Bank. Recent high-resolution structures indicate a clearly broader structural diversity of membrane proteins than initially anticipated, motivating the development of reliable structure prediction methods specifically tailored for this class of molecules. One important prediction target capturing all major aspects of a protein's 3D structure is its contact map. Our analysis shows that computational methods trained to predict residue contacts in globular proteins perform poorly when applied to membrane proteins. We have recently published a method to identify interacting alpha-helices in membrane proteins based on the analysis of coevolving residues in predicted transmembrane regions. Here, we present a substantially improved algorithm for the same problem, which uses a newly developed neural network approach to predict helix-helix contacts. In addition to the input features commonly used for contact prediction of soluble proteins, such as windowed residue profiles and residue distance in the sequence, our network also incorporates features that apply to membrane proteins only, such as residue position within the transmembrane segment and its orientation toward the lipophilic environment. The obtained neural network can predict contacts between residues in transmembrane segments with nearly 26% accuracy. It is therefore the first published contact predictor developed specifically for membrane proteins performing with equal accuracy to state-of-the-art contact predictors available for soluble proteins. The predicted helix-helix contacts were employed in a second step to identify interacting helices. For our dataset consisting of 62 membrane proteins of solved structure, we gained an accuracy of 78.1%. Because the reliable prediction of helix interaction patterns is an important step in the classification and prediction of membrane protein folds, our method will be a helpful tool in compiling a structural census of membrane proteins.

Ashkenazy H, Unger R, Kliger Y. Optimal data collection for correlated mutation analysis. Proteins 2009;74:545-55. [PMID: 18655065 DOI: 10.1002/prot.22168] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Indexed: 11/08/2022]

Miteva MA, Alexov E, Villoutreix BO. Protein structure analysis online. Curr Protoc Protein Sci 2008;Chapter 2:Unit 2.13. [PMID: 18429316 DOI: 10.1002/0471140864.ps0213s50] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/11/2022]

Szilágyi A, Györffy D, Závodszky P. The twilight zone between protein order and disorder. Biophys J 2008;95:1612-26. [PMID: 18441033 DOI: 10.1529/biophysj.108.131151] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.8] [Indexed: 11/18/2022] Open

Wu S, Zhang Y. A comprehensive assessment of sequence-based and template-based methods for protein contact prediction. Bioinformatics 2008;24:924-31. [PMID: 18296462 DOI: 10.1093/bioinformatics/btn069] [Citation(s) in RCA: 151] [Impact Index Per Article: 9.4] [Indexed: 11/14/2022]

Abstract

MOTIVATION

Pair-wise residue-residue contacts in proteins can be predicted from both threading templates and sequence-based machine learning. However, most structure modeling approaches only use the template-based contact predictions in guiding the simulations; this is partly because the sequence-based contact predictions are usually considered to be less accurate than that by threading. With the rapid progress in sequence databases and machine-learning techniques, it is necessary to have a detailed and comprehensive assessment of the contact-prediction methods in different template conditions.

RESULTS

We develop two methods for protein-contact predictions: SVM-SEQ is a sequence-based machine learning approach which trains a variety of sequence-derived features on contact maps; SVM-LOMETS collects consensus contact predictions from multiple threading templates. We test both methods on the same set of 554 proteins which are categorized into 'Easy', 'Medium', 'Hard' and 'Very Hard' targets based on the evolutionary and structural distance between templates and targets. For the Easy and Medium targets, SVM-LOMETS obviously outperforms SVM-SEQ; but for the Hard and Very Hard targets, the accuracy of the SVM-SEQ predictions is higher than that of SVM-LOMETS by 12-25%. If we combine the SVM-SEQ and SVM-LOMETS predictions together, the total number of correctly predicted contacts in the Hard proteins will increase by more than 60% (or 70% for the long-range contact with a sequence separation ≥24), compared with SVM-LOMETS alone. The advantage of SVM-SEQ is also shown in the CASP7 free modeling targets where the SVM-SEQ is around four times more accurate than SVM-LOMETS in the long-range contact prediction. These data demonstrate that the state-of-the-art sequence-based contact prediction has reached a level which may be helpful in assisting tertiary structure modeling for the targets which do not have close structure templates. The maximum yield should be obtained by the combination of both sequence- and template-based predictions.

Fourty G, Callebaut I, Mornon JP. Characterization of non-trivial neighborhood fold constraints from protein sequences using generalized topohydrophobicity. Bioinform Biol Insights 2008;2:47-66. [PMID: 19812765 PMCID: PMC2735972 DOI: 10.4137/bbi.s426] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022] Open

Izarzugaza JMG, Graña O, Tress ML, Valencia A, Clarke ND. Assessment of intramolecular contact predictions for CASP7. Proteins 2008;69 Suppl 8:152-8. [PMID: 17671976 DOI: 10.1002/prot.21637] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.9] [Indexed: 11/07/2022]

Fuchs A, Martin-Galiano AJ, Kalman M, Fleishman S, Ben-Tal N, Frishman D. Co-evolving residues in membrane proteins. Bioinformatics 2007;23:3312-9. [DOI: 10.1093/bioinformatics/btm515] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.4] [Indexed: 11/13/2022] Open

Gouveia-Oliveira R, Pedersen AG. Finding coevolving amino acid residues using row and column weighting of mutual information and multi-dimensional amino acid representation. Algorithms Mol Biol 2007;2:12. [PMID: 17915013 PMCID: PMC2234412 DOI: 10.1186/1748-7188-2-12] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Received: 05/25/2007] [Accepted: 10/03/2007] [Indexed: 11/10/2022] Open

Eyal E, Pietrokovski S, Bahar I. Rapid assessment of correlated amino acids from pair-to-pair (P2P) substitution matrices. Bioinformatics 2007;23:1837-9. [PMID: 17496318 DOI: 10.1093/bioinformatics/btm256] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022] Open

Cheng J, Baldi P. Improved residue contact prediction using support vector machines and a large feature set. BMC Bioinformatics 2007;8:113. [PMID: 17407573 PMCID: PMC1852326 DOI: 10.1186/1471-2105-8-113] [Citation(s) in RCA: 142] [Impact Index Per Article: 8.4] [Received: 12/28/2006] [Accepted: 04/02/2007] [Indexed: 11/12/2022] Open