Reference Citation Analysis

For: Graña O, Baker D, MacCallum RM, Meiler J, Punta M, Rost B, Tress ML, Valencia A. CASP6 assessment of contact prediction. Proteins 2006;61 Suppl 7:214-224. [PMID: 16187364 DOI: 10.1002/prot.20739] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.1] [Indexed: 11/11/2022]

Cited by Other Article(s)

Tamborski J, Seong K, Liu F, Staskawicz BJ, Krasileva KV. Altering Specificity and Autoactivity of Plant Immune Receptors Sr33 and Sr50 Via a Rational Engineering Approach. Mol Plant Microbe Interact 2023;36:434-446. [PMID: 36867580 PMCID: PMC10561695 DOI: 10.1094/mpmi-07-22-0154-r] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Indexed: 08/15/2023]

Konecki DM, Hamrick S, Wang C, Agosto MA, Wensel TG, Lichtarge O. CovET: A covariation-evolutionary trace method that identifies protein structure-function modules. J Biol Chem 2023;299:104896. [PMID: 37290531 PMCID: PMC10338321 DOI: 10.1016/j.jbc.2023.104896] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/25/2023] [Revised: 06/01/2023] [Accepted: 06/02/2023] [Indexed: 06/10/2023]

Ruiz-Serra V, Pontes C, Milanetti E, Kryshtafovych A, Lepore R, Valencia A. Assessing the accuracy of contact and distance predictions in CASP14. Proteins 2021;89:1888-1900. [PMID: 34595772 DOI: 10.1002/prot.26248] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 06/11/2021] [Revised: 09/06/2021] [Accepted: 09/21/2021] [Indexed: 12/26/2022]

Wu T, Liu J, Guo Z, Hou J, Cheng J. MULTICOM2 open-source protein structure prediction system powered by deep learning and distance prediction. Sci Rep 2021;11:13155. [PMID: 34162922 PMCID: PMC8222248 DOI: 10.1038/s41598-021-92395-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/18/2021] [Accepted: 06/09/2021] [Indexed: 11/09/2022]

Abstract

Protein structure prediction is an important problem in bioinformatics and has been studied for decades. However, there are still few open-source comprehensive protein structure prediction packages publicly available in the field. In this paper, we present our latest open-source protein tertiary structure prediction system, MULTICOM2, an integration of template-based modeling (TBM) and template-free modeling (FM) methods. The template-based modeling uses sequence alignment tools with deep multiple sequence alignments to search for structural templates, which are much faster and more accurate than MULTICOM1. The template-free (ab initio or de novo) modeling uses the inter-residue distances predicted by DeepDist to reconstruct tertiary structure models without using any known structure as template. In the blind CASP14 experiment, the average TM-score of the models predicted by our server predictor based on the MULTICOM2 system is 0.720 for 58 TBM (regular) domains and 0.514 for 38 FM and FM/TBM (hard) domains, indicating that MULTICOM2 is capable of predicting good tertiary structures across the board. It can predict the correct fold for 76 CASP14 domains (95% regular domains and 55% hard domains) if only one prediction is made for a domain. The success rate is increased to 97% for both regular and hard domains if five predictions are made per domain. Moreover, the prediction accuracy of the pure template-free structure modeling method on both TBM and FM targets is very close to the combination of template-based and template-free modeling methods. This demonstrates that the distance-based template-free modeling method powered by deep learning can largely replace the traditional template-based modeling method even on TBM targets that TBM methods used to dominate and therefore provides a uniform structure modeling approach to any protein.
Finally, on the 38 CASP14 FM and FM/TBM hard domains, MULTICOM2 server predictors (MULTICOM-HYBRID, MULTICOM-DEEP, MULTICOM-DIST) were ranked among the top 20 automated server predictors in the CASP14 experiment. After combining multiple predictors from the same research group as one entry, MULTICOM-HYBRID was ranked no. 5. The source code of MULTICOM2 is freely available at https://github.com/multicom-toolbox/multicom/tree/multicom_v2.0.

Shrestha R, Fajardo E, Gil N, Fidelis K, Kryshtafovych A, Monastyrskyy B, Fiser A. Assessing the accuracy of contact predictions in CASP13. Proteins 2019;87:1058-1068. [PMID: 31587357 PMCID: PMC6851495 DOI: 10.1002/prot.25819] [Citation(s) in RCA: 48] [Impact Index Per Article: 9.6] [Received: 04/25/2019] [Revised: 09/17/2019] [Accepted: 09/17/2019] [Indexed: 01/07/2023]

Jing X, Dong Q, Lu R, Dong Q. Protein Inter-Residue Contacts Prediction: Methods, Performances and Applications. Curr Bioinform 2019. [DOI: 10.2174/1574893613666181109130430] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022]

Wuyun Q, Zheng W, Peng Z, Yang J. A large-scale comparative assessment of methods for residue-residue contact prediction. Brief Bioinform 2019;19:219-230. [PMID: 27802931 DOI: 10.1093/bib/bbw106] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/19/2016] [Indexed: 11/14/2022]

Jones DT, Kandathil SM. High precision in protein contact prediction using fully convolutional neural networks and minimal sequence features. Bioinformatics 2018;34:3308-3315. [PMID: 29718112 PMCID: PMC6157083 DOI: 10.1093/bioinformatics/bty341] [Citation(s) in RCA: 112] [Impact Index Per Article: 18.7] [Received: 10/12/2017] [Revised: 03/06/2018] [Accepted: 04/25/2018] [Indexed: 12/22/2022]

Abstract

Motivation

In addition to substitution frequency data from protein sequence alignments, many state-of-the-art methods for contact prediction rely on additional sources of information, or features, of protein sequences in order to predict residue-residue contacts, such as solvent accessibility, predicted secondary structure, and scores from other contact prediction methods. It is unclear how much of this information is needed to achieve state-of-the-art results. Here, we show that using deep neural network models, simple alignment statistics contain sufficient information to achieve state-of-the-art precision. Our prediction method, DeepCov, uses fully convolutional neural networks operating on amino-acid pair frequency or covariance data derived directly from sequence alignments, without using global statistical methods such as sparse inverse covariance or pseudolikelihood estimation.

Results

Comparisons against CCMpred and MetaPSICOV2 show that using pairwise covariance data calculated from raw alignments as input allows us to match or exceed the performance of both of these methods. Almost all of the achieved precision is obtained when considering relatively local windows (around 15 residues) around any member of a given residue pairing; larger window sizes have comparable performance. Assessment on a set of shallow sequence alignments (fewer than 160 effective sequences) indicates that the new method is substantially more precise than CCMpred and MetaPSICOV2 in this regime, suggesting that improved precision is attainable on smaller sequence families. Overall, the performance of DeepCov is competitive with the state of the art, and our results demonstrate that global models, which employ features from all parts of the input alignment when predicting individual contacts, are not strictly needed in order to attain precise contact predictions.

Availability and implementation

DeepCov is freely available at https://github.com/psipred/DeepCov.

Supplementary information

Supplementary data are available at Bioinformatics online.
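
The "amino-acid pair frequency or covariance data" mentioned in the abstract above can be illustrated with a minimal Python sketch. It computes, for two alignment columns i and j, the quantity f_ij(a,b) - f_i(a) f_j(b) for each observed residue pair. The function name `pair_covariance`, the toy alignment, and the use of unweighted counts are assumptions made for illustration only; the abstract does not specify sequence weighting, alphabet, or gap handling, so this should not be read as DeepCov's actual input pipeline.

```python
from collections import Counter
from itertools import product


def pair_covariance(alignment, i, j):
    """Covariance f_ij(a, b) - f_i(a) * f_j(b) for alignment columns i and j.

    `alignment` is a list of equal-length aligned sequences (hypothetical
    input format); counts are unweighted, which is a simplifying assumption.
    """
    n = len(alignment)
    f_i = Counter(seq[i] for seq in alignment)             # single-column counts
    f_j = Counter(seq[j] for seq in alignment)
    f_ij = Counter((seq[i], seq[j]) for seq in alignment)  # pair counts
    return {
        (a, b): f_ij[(a, b)] / n - (f_i[a] / n) * (f_j[b] / n)
        for a, b in product(f_i, f_j)
    }


# Toy alignment in which the two columns vary together perfectly.
toy_alignment = ["AC", "AC", "GT", "GT"]
cov = pair_covariance(toy_alignment, 0, 1)
```

In the toy alignment, pairs that always co-occur get a positive value and pairs that never co-occur get a negative one, which is the kind of signal a downstream model could learn from.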

Schaarschmidt J, Monastyrskyy B, Kryshtafovych A, Bonvin AM. Assessment of contact predictions in CASP12: Co-evolution and deep learning coming of age. Proteins 2018;86 Suppl 1:51-66. [PMID: 29071738 PMCID: PMC5820169 DOI: 10.1002/prot.25407] [Citation(s) in RCA: 126] [Impact Index Per Article: 21.0] [Received: 07/14/2017] [Revised: 10/06/2017] [Accepted: 10/24/2017] [Indexed: 12/20/2022]

Teixeira PL, Mendenhall JL, Heinze S, Weiner B, Skwark MJ, Meiler J. Membrane protein contact and structure prediction using co-evolution in conjunction with machine learning. PLoS One 2017;12:e0177866. [PMID: 28542325 PMCID: PMC5443516 DOI: 10.1371/journal.pone.0177866] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 06/26/2016] [Accepted: 05/04/2017] [Indexed: 11/18/2022]

Adhikari B, Nowotny J, Bhattacharya D, Hou J, Cheng J. ConEVA: a toolbox for comprehensive assessment of protein contacts. BMC Bioinformatics 2016;17:517. [PMID: 27923350 PMCID: PMC5142288 DOI: 10.1186/s12859-016-1404-z] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 07/01/2016] [Accepted: 12/01/2016] [Indexed: 12/31/2022]

Monastyrskyy B, D'Andrea D, Fidelis K, Tramontano A, Kryshtafovych A. New encouraging developments in contact prediction: Assessment of the CASP11 results. Proteins 2016;84 Suppl 1:131-44. [PMID: 26474083 PMCID: PMC4834069 DOI: 10.1002/prot.24943] [Citation(s) in RCA: 69] [Impact Index Per Article: 8.6] [Received: 05/18/2015] [Revised: 09/15/2015] [Accepted: 10/11/2015] [Indexed: 12/27/2022]

Liu Y, Yan Z, Lu X, Xiao D, Jiang H. Improving the catalytic activity of isopentenyl phosphate kinase through protein coevolution analysis. Sci Rep 2016;6:24117. [PMID: 27052337 PMCID: PMC4823809 DOI: 10.1038/srep24117] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.3] [Received: 12/11/2015] [Accepted: 03/21/2016] [Indexed: 11/20/2022]

Pietal MJ, Bujnicki JM, Kozlowski LP. GDFuzz3D: a method for protein 3D structure reconstruction from contact maps, based on a non-Euclidean distance function. Bioinformatics 2015;31:3499-505. [PMID: 26130575 DOI: 10.1093/bioinformatics/btv390] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 10/12/2014] [Accepted: 06/23/2015] [Indexed: 11/13/2022]

Banach M, Prudhomme N, Carpentier M, Duprat E, Papandreou N, Kalinowska B, Chomilier J, Roterman I. Contribution to the prediction of the fold code: application to immunoglobulin and flavodoxin cases. PLoS One 2015;10:e0125098. [PMID: 25915049 PMCID: PMC4411048 DOI: 10.1371/journal.pone.0125098] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 12/16/2014] [Accepted: 03/20/2015] [Indexed: 12/19/2022]

Abstract

Background

Folding nucleus of globular proteins formation starts by the mutual interaction of a group of hydrophobic amino acids whose close contacts allow subsequent formation and stability of the 3D structure. These early steps can be predicted by simulation of the folding process through a Monte Carlo (MC) coarse grain model in a discrete space. We previously defined MIRs (Most Interacting Residues), as the set of residues presenting a large number of non-covalent neighbour interactions during such simulation. MIRs are good candidates to define the minimal number of residues giving rise to a given fold instead of another one, although their proportion is rather high, typically [15-20]% of the sequences. Having in mind experiments with two sequences of very high levels of sequence identity (up to 90%) but different folds, we combined the MIR method, which takes sequence as single input, with the “fuzzy oil drop” (FOD) model that requires a 3D structure, in order to estimate the residues coding for the fold. FOD assumes that a globular protein follows an idealised 3D Gaussian distribution of hydrophobicity density, with the maximum in the centre and minima at the surface of the “drop”. If the actual local density of hydrophobicity around a given amino acid is as high as the ideal one, then this amino acid is assigned to the core of the globular protein, and it is assumed to follow the FOD model. Therefore one obtains a distribution of the amino acids of a protein according to their agreement or rejection with the FOD model.

Results

We compared and combined MIR and FOD methods to define the minimal nucleus, or keystone, of two populated folds: immunoglobulin-like (Ig) and flavodoxins (Flav). The combination of these two approaches defines some positions both predicted as a MIR and assigned as accordant with the FOD model. It is shown here that for these two folds, the intersection of the predicted sets of residues significantly differs from random selection. It reduces the number of selected residues by each individual method and allows a reasonable agreement with experimentally determined key residues coding for the particular fold. In addition, the intersection of the two methods significantly increases the specificity of the prediction, providing a robust set of residues that constitute the folding nucleus.

Kaján L, Hopf TA, Kalaš M, Marks DS, Rost B. FreeContact: fast and free software for protein contact prediction from residue co-evolution. BMC Bioinformatics 2014;15:85. [PMID: 24669753 PMCID: PMC3987048 DOI: 10.1186/1471-2105-15-85] [Citation(s) in RCA: 128] [Impact Index Per Article: 12.8] [Received: 09/30/2013] [Accepted: 03/18/2014] [Indexed: 11/10/2022]

Monastyrskyy B, D'Andrea D, Fidelis K, Tramontano A, Kryshtafovych A. Evaluation of residue-residue contact prediction in CASP10. Proteins 2013;82 Suppl 2:138-53. [PMID: 23760879 DOI: 10.1002/prot.24340] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.2] [Received: 04/18/2013] [Revised: 05/14/2013] [Accepted: 05/21/2013] [Indexed: 12/13/2022]

de Juan D, Pazos F, Valencia A. Emerging methods in protein co-evolution. Nat Rev Genet 2013;14:249-61. [PMID: 23458856 DOI: 10.1038/nrg3414] [Citation(s) in RCA: 412] [Impact Index Per Article: 37.5] [Indexed: 12/20/2022]

Statistical Analysis of Terminal Extensions of Protein β-Strand Pairs. Adv Bioinformatics 2013;2013:909436. [PMID: 23424587 PMCID: PMC3569888 DOI: 10.1155/2013/909436] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 07/15/2012] [Revised: 12/30/2012] [Accepted: 12/30/2012] [Indexed: 11/17/2022]

Heinke F, Schildbach S, Stockmann D, Labudde D. eProS--a database and toolbox for investigating protein sequence-structure-function relationships through energy profiles. Nucleic Acids Res 2012;41:D320-6. [PMID: 23161695 PMCID: PMC3531212 DOI: 10.1093/nar/gks1079] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/24/2022]

Karakaş M, Woetzel N, Staritzbichler R, Alexander N, Weiner BE, Meiler J. BCL::Fold--de novo prediction of complex and large protein topologies by assembly of secondary structure elements. PLoS One 2012;7:e49240. [PMID: 23173050 PMCID: PMC3500284 DOI: 10.1371/journal.pone.0049240] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.3] [Received: 03/27/2012] [Accepted: 10/07/2012] [Indexed: 01/10/2023]

Chitsaz M, Mayo SL. GRID: a high-resolution protein structure refinement algorithm. J Comput Chem 2012;34:445-50. [PMID: 23065773 DOI: 10.1002/jcc.23151] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 04/12/2012] [Revised: 07/31/2012] [Accepted: 08/27/2012] [Indexed: 12/27/2022]

Eickholt J, Cheng J. Predicting protein residue-residue contacts using deep networks and boosting. Bioinformatics 2012;28:3066-72. [PMID: 23047561 DOI: 10.1093/bioinformatics/bts598] [Citation(s) in RCA: 122] [Impact Index Per Article: 10.2] [Indexed: 02/04/2023]

Jones DT, Buchan DWA, Cozzetto D, Pontil M. PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments. Bioinformatics 2011;28:184-90. [PMID: 22101153 DOI: 10.1093/bioinformatics/btr638] [Citation(s) in RCA: 525] [Impact Index Per Article: 40.4] [Indexed: 11/14/2022]

Eickholt J, Wang Z, Cheng J. A conformation ensemble approach to protein residue-residue contact. BMC Struct Biol 2011;11:38. [PMID: 21989082 PMCID: PMC3200154 DOI: 10.1186/1472-6807-11-38] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 06/30/2011] [Accepted: 10/12/2011] [Indexed: 11/20/2022]

Wei Y, Floudas CA. Enhanced Inter-helical Residue Contact Prediction in Transmembrane Proteins. Chem Eng Sci 2011;66:4356-4369. [PMID: 21892227 PMCID: PMC3164537 DOI: 10.1016/j.ces.2011.04.033] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 12/31/2022]

Monastyrskyy B, Fidelis K, Tramontano A, Kryshtafovych A. Evaluation of residue-residue contact predictions in CASP9. Proteins 2011;79 Suppl 10:119-25. [PMID: 21928322 DOI: 10.1002/prot.23160] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Received: 05/06/2011] [Revised: 06/25/2011] [Accepted: 07/27/2011] [Indexed: 01/03/2023]

Ashkenazy H, Unger R, Kliger Y. Hidden conformations in protein structures. Bioinformatics 2011;27:1941-7. [DOI: 10.1093/bioinformatics/btr292] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022]

Nepomnyashchikh TS, Antonets DV, Lebedev LR, Gileva IP, Shchelkunov SN. 3D structure modeling of complexes formed by CrmB TNF-binding proteins of Variola and cowpox viruses with murine and human TNFs. Mol Biol 2010. [DOI: 10.1134/s0026893310060117] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/22/2022]

Gershoni M, Fuchs A, Shani N, Fridman Y, Corral-Debrinski M, Aharoni A, Frishman D, Mishmar D. Coevolution predicts direct interactions between mtDNA-encoded and nDNA-encoded subunits of oxidative phosphorylation complex I. J Mol Biol 2010;404:158-71. [PMID: 20868692 DOI: 10.1016/j.jmb.2010.09.029] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.1] [Received: 10/29/2009] [Revised: 09/05/2010] [Accepted: 09/13/2010] [Indexed: 10/19/2022]

Abstract

Despite years of research, the structure of the largest mammalian oxidative phosphorylation (OXPHOS) complex, NADH-ubiquinone oxidoreductase (complex I), and the interactions among its 45 subunits are not fully understood. Since complex I harbors subunits encoded by mitochondrial DNA (mtDNA) and nuclear DNA (nDNA) genomes, with the former evolving ∼10 times faster than the latter, tight cytonuclear coevolution is expected and observed. Recently, we identified three nDNA-encoded complex I subunits that underwent accelerated amino acid replacement, suggesting their adjustment to the elevated mtDNA rate of change. Hence, they constitute excellent candidates for binding mtDNA-encoded subunits. Here, we further disentangle the network of physical cytonuclear interactions within complex I by analyzing subunits coevolution. Firstly, relying on the bioinformatic analysis of 10 protein complexes possessing solved structures, we show that signals of coevolution identified physically interacting subunits with nearly 90% accuracy, thus lending support to our approach. When applying this approach to cytonuclear interaction within complex I, we predict that the 'rate-accelerated' nDNA-encoded subunits of complex I, NDUFC2 and NDUFA1, likely interact with the mtDNA-encoded subunits ND5/ND4 and ND5/ND4/ND1, respectively. Furthermore, we predicted interactions among mtDNA-encoded complex I subunits. Using the yeast two-hybrid system, we experimentally confirmed the predicted interactions of human NDUFC2 with ND4, the interactions of human NDUFA1 with ND1 and ND4, and the lack of interaction of NDUFC2 with ND3 and NDUFA1, thus providing a proof of concept for our approach. Our study shows, for the first time, evidence for direct interactions between nDNA-encoded and mtDNA-encoded subunits of human OXPHOS complex I and paves the path towards deciphering subunit interactions within complexes lacking three-dimensional structures. 
Our subunit-interactions-predicting method, ComplexCorr, is available at http://webclu.bio.wzw.tum.de/complexcorr.

Tress ML, Valencia A. Predicted residue-residue contacts can help the scoring of 3D models. Proteins 2010;78:1980-91. [PMID: 20408174 DOI: 10.1002/prot.22714] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Indexed: 11/06/2022]

Rajgaria R, Wei Y, Floudas CA. Contact prediction for beta and alpha-beta proteins using integer linear optimization and its impact on the first principles 3D structure prediction method ASTRO-FOLD. Proteins 2010;78:1825-46. [PMID: 20225257 PMCID: PMC2858251 DOI: 10.1002/prot.22696] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 12/24/2022]

Duarte JM, Sathyapriya R, Stehr H, Filippis I, Lappe M. Optimal contact definition for reconstruction of contact maps. BMC Bioinformatics 2010;11:283. [PMID: 20507547 PMCID: PMC3583236 DOI: 10.1186/1471-2105-11-283] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.2] [Received: 12/08/2009] [Accepted: 05/27/2010] [Indexed: 11/23/2022]

Abstract

Background

Contact maps have been extensively used as a simplified representation of protein structures. They capture most important features of a protein's fold, being preferred by a number of researchers for the description and study of protein structures. Inspired by the model's simplicity many groups have dedicated a considerable amount of effort towards contact prediction as a proxy for protein structure prediction. However a contact map's biological interest is subject to the availability of reliable methods for the 3-dimensional reconstruction of the structure.

Results

We use an implementation of the well-known distance geometry protocol to build realistic protein 3-dimensional models from contact maps, performing an extensive exploration of many of the parameters involved in the reconstruction process. We try to address the questions: a) to what accuracy does a contact map represent its corresponding 3D structure, b) what is the best contact map representation with regard to reconstructability and c) what is the effect of partial or inaccurate contact information on the 3D structure recovery. Our results suggest that contact maps derived from the application of a distance cutoff of 9 to 11 Å around the Cβ atoms constitute the most accurate representation of the 3D structure. The reconstruction process does not provide a single solution to the problem but rather an ensemble of conformations that are within 2 Å RMSD of the crystal structure and with lower values for the pairwise average ensemble RMSD. Interestingly it is still possible to recover a structure with partial contact information, although wrong contacts can lead to dramatic loss in reconstruction fidelity.

Conclusions

Thus contact maps represent a valid approximation to the structures with an accuracy comparable to that of experimental methods. The optimal contact definitions constitute key guidelines for methods based on contact maps such as structure prediction through contacts and structural alignments based on maximum contact map overlap.
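
The contact definition discussed in this abstract (a distance cutoff applied to Cβ atoms) can be sketched in a few lines of Python. The function name `contact_map`, the default cutoff of 9.0 Å (the lower end of the 9 to 11 Å range quoted above), the `min_separation` parameter, and the toy coordinates are all assumptions for illustration; this is not the authors' implementation and it does not perform the distance-geometry reconstruction itself.

```python
import math


def contact_map(cb_coords, cutoff=9.0, min_separation=1):
    """Symmetric boolean contact map from C-beta coordinates.

    Residues i and j are marked as in contact when the distance between
    their C-beta atoms is at most `cutoff` angstroms and they are at least
    `min_separation` positions apart in the sequence.
    """
    n = len(cb_coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(cb_coords[i], cb_coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = True
    return cmap


# Toy example: four pseudo-residues on a straight line, 4 angstroms apart.
toy_coords = [(0.0, 0.0, 0.0), (4.0, 0.0, 0.0), (8.0, 0.0, 0.0), (12.0, 0.0, 0.0)]
cmap = contact_map(toy_coords, cutoff=9.0)
```

With the toy coordinates, residues 0 and 2 (8 Å apart) are in contact, while residues 0 and 3 (12 Å apart) are not; raising `cutoff` toward 11 Å would not change that result here.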

Karakaş M, Woetzel N, Meiler J. BCL::contact-low confidence fold recognition hits boost protein contact prediction and de novo structure determination. J Comput Biol 2010;17:153-68. [PMID: 19772383 DOI: 10.1089/cmb.2009.0030] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022]

Prediction of protein long-range contacts using an ensemble of genetic algorithm classifiers with sequence profile centers. BMC Struct Biol 2010;10 Suppl 1:S2. [PMID: 20487509 PMCID: PMC2873825 DOI: 10.1186/1472-6807-10-s1-s2] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022]

Izarzugaza JMG, Redfern OC, Orengo CA, Valencia A. Cancer-associated mutations are preferentially distributed in protein kinase functional sites. Proteins 2010;77:892-903. [PMID: 19626714 DOI: 10.1002/prot.22512] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Indexed: 02/01/2023]

Ezkurdia I, Graña O, Izarzugaza JMG, Tress ML. Assessment of domain boundary predictions and the prediction of intramolecular contacts in CASP8. Proteins 2010;77 Suppl 9:196-209. [PMID: 19714769 DOI: 10.1002/prot.22554] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.1] [Indexed: 01/03/2023]

Michino M, Brooks CL. Predicting structurally conserved contacts for homologous proteins using sequence conservation filters. Proteins 2009;77:448-53. [PMID: 19475704 DOI: 10.1002/prot.22456] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/10/2022]

Liu T, Horst JA, Samudrala R. A novel method for predicting and using distance constraints of high accuracy for refining protein structure prediction. Proteins 2009;77:220-34. [PMID: 19422061 DOI: 10.1002/prot.22434] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/10/2022]

Gao X, Bu D, Xu J, Li M. Improving consensus contact prediction via server correlation reduction. BMC Struct Biol 2009;9:28. [PMID: 19419562 PMCID: PMC2689239 DOI: 10.1186/1472-6807-9-28] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.5] [Received: 11/25/2008] [Accepted: 05/06/2009] [Indexed: 11/10/2022]

Abstract

Background

Protein inter-residue contacts play a crucial role in the determination and prediction of protein structures. Previous studies on contact prediction indicate that although template-based consensus methods outperform sequence-based methods on targets with typical templates, such consensus methods perform poorly on new fold targets. However, we find out that even for new fold targets, the models generated by threading programs can contain many true contacts. The challenge is how to identify them.

Results

In this paper, we develop an integer linear programming model for consensus contact prediction. In contrast to the simple majority voting method assuming that all the individual servers are equally important and independent, the newly developed method evaluates their correlation by using maximum likelihood estimation and extracts independent latent servers from them by using principal component analysis. An integer linear programming method is then applied to assign a weight to each latent server to maximize the difference between true contacts and false ones. The proposed method is tested on the CASP7 data set. If the top L/5 predicted contacts are evaluated where L is the protein size, the average accuracy is 73%, which is much higher than that of any previously reported study. Moreover, if only the 15 new fold CASP7 targets are considered, our method achieves an average accuracy of 37%, which is much better than that of the majority voting method, SVM-LOMETS, SVM-SEQ, and SAM-T06. These methods demonstrate an average accuracy of 13.0%, 10.8%, 25.8% and 21.2%, respectively.

Conclusion

Reducing server correlation and optimally combining independent latent servers show a significant improvement over the traditional consensus methods. This approach can hopefully provide a powerful tool for protein structure refinement and prediction.

Rajgaria R, McAllister SR, Floudas CA. Towards accurate residue-residue hydrophobic contact prediction for alpha helical proteins via integer linear optimization. Proteins 2009;74:929-47. [PMID: 18767158 DOI: 10.1002/prot.22202] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 12/22/2022]

Fuchs A, Kirschner A, Frishman D. Prediction of helix-helix contacts and interacting helices in polytopic membrane proteins using neural networks. Proteins 2009;74:857-71. [PMID: 18704938 DOI: 10.1002/prot.22194] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.4] [Indexed: 11/07/2022]

Abstract

Despite rapidly increasing numbers of available 3D structures, membrane proteins still account for less than 1% of all structures in the Protein Data Bank. Recent high-resolution structures indicate a clearly broader structural diversity of membrane proteins than initially anticipated, motivating the development of reliable structure prediction methods specifically tailored for this class of molecules. One important prediction target capturing all major aspects of a protein's 3D structure is its contact map. Our analysis shows that computational methods trained to predict residue contacts in globular proteins perform poorly when applied to membrane proteins. We have recently published a method to identify interacting alpha-helices in membrane proteins based on the analysis of coevolving residues in predicted transmembrane regions. Here, we present a substantially improved algorithm for the same problem, which uses a newly developed neural network approach to predict helix-helix contacts. In addition to the input features commonly used for contact prediction of soluble proteins, such as windowed residue profiles and residue distance in the sequence, our network also incorporates features that apply to membrane proteins only, such as residue position within the transmembrane segment and its orientation toward the lipophilic environment. The obtained neural network can predict contacts between residues in transmembrane segments with nearly 26% accuracy. It is therefore the first published contact predictor developed specifically for membrane proteins performing with equal accuracy to state-of-the-art contact predictors available for soluble proteins. The predicted helix-helix contacts were employed in a second step to identify interacting helices. For our dataset consisting of 62 membrane proteins of solved structure, we gained an accuracy of 78.1%. Because the reliable prediction of helix interaction patterns is an important step in the classification and prediction of membrane protein folds, our method will be a helpful tool in compiling a structural census of membrane proteins.

Lo A, Chiu YY, Rødland EA, Lyu PC, Sung TY, Hsu WL. Predicting helix-helix interactions from residue contacts in membrane proteins. Bioinformatics 2009;25:996-1003. [PMID: 19244388 PMCID: PMC2666818 DOI: 10.1093/bioinformatics/btp114] [Citation(s) in RCA: 50] [Impact Index Per Article: 3.3] [Indexed: 11/14/2022]

Abstract

Motivation: Helix–helix interactions play a critical role in the structure assembly, stability and function of membrane proteins. On the molecular level, the interactions are mediated by one or more residue contacts. Although previous studies focused on helix-packing patterns and sequence motifs, few of them developed methods specifically for contact prediction.

Results: We present a new hierarchical framework for contact prediction, with an application in membrane proteins. The hierarchical scheme consists of two levels: in the first level, contact residues are predicted from the sequence and their pairing relationships are further predicted in the second level. Statistical analyses on contact propensities are combined with other sequence and structural information for training the support vector machine classifiers. Evaluated on 52 protein chains using leave-one-out cross validation (LOOCV) and an independent test set of 14 protein chains, the two-level approach consistently improves the conventional direct approach in prediction accuracy, with 80% reduction of input for prediction. Furthermore, the predicted contacts are then used to infer interactions between pairs of helices. When at least three predicted contacts are required for an inferred interaction, the accuracy, sensitivity and specificity are 56%, 40% and 89%, respectively. Our results demonstrate that a hierarchical framework can be applied to eliminate false positives (FP) while reducing computational complexity in predicting contacts. Together with the estimated contact propensities, this method can be used to gain insights into helix-packing in membrane proteins.

Availability: http://bio-cluster.iis.sinica.edu.tw/TMhit/

Contact: tsung@iis.sinica.edu.tw

Supplementary information: Supplementary data are available at Bioinformatics online.

Ashkenazy H, Unger R, Kliger Y. Optimal data collection for correlated mutation analysis. Proteins 2009;74:545-55. [PMID: 18655065 DOI: 10.1002/prot.22168] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Indexed: 11/08/2022]

Waldispühl J, O'Donnell CW, Devadas S, Clote P, Berger B. Modeling ensembles of transmembrane beta-barrel proteins. Proteins 2008;71:1097-112. [PMID: 18004792 DOI: 10.1002/prot.21788] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Indexed: 12/13/2022]

Latek D, Kolinski A. Contact prediction in protein modeling: scoring, folding and refinement of coarse-grained models. BMC Structural Biology 2008;8:36. [PMID: 18694501 PMCID: PMC2527566 DOI: 10.1186/1472-6807-8-36] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Received: 02/16/2008] [Accepted: 08/11/2008] [Indexed: 11/10/2022]

Abstract

BACKGROUND

Several different methods for contact prediction succeeded within the Sixth Critical Assessment of Techniques for Protein Structure Prediction (CASP6). The most relevant were non-local contact predictions for targets from the most difficult categories: fold recognition-analogy and new fold. Such contacts could provide valuable structural information in case a template structure cannot be found in the PDB.

RESULTS

We described comprehensive tests of the effectiveness of contact data in various aspects of de novo modeling with CABS, an algorithm which was used successfully in CASP6 by the Kolinski-Bujnicki group. We used the predicted contacts in a simple scoring function for the post-simulation ranking of protein models and as a soft bias in the folding simulations and in the fold-refinement procedure. The latter approach turned out to be the most successful. The CABS force field used in the Replica Exchange Monte Carlo simulations cooperated with the true contacts and discriminated the false ones, which resulted in an improvement of the majority of Kolinski-Bujnicki's protein models. In the modeling we tested different sets of predicted contact data submitted to the CASP6 server. According to our results, the best performing were the contacts with the accuracy balanced with the coverage, obtained either from the best two predictors only or by a consensus from as many predictors as possible.

CONCLUSION

Our tests have shown that theoretically predicted contacts can be very beneficial for protein structure prediction. Depending on the protein modeling method, a contact data set applied should be prepared with differently balanced coverage and accuracy of predicted contacts. Namely, high coverage of contact data is important for the model ranking and high accuracy for the folding simulations.

Miller CS, Eisenberg D. Using inferred residue contacts to distinguish between correct and incorrect protein models. Bioinformatics 2008;24:1575-82. [PMID: 18511466 PMCID: PMC2638260 DOI: 10.1093/bioinformatics/btn248] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.7] [Indexed: 11/17/2022]

Faure G, Bornot A, de Brevern AG. Protein contacts, inter-residue interactions and side-chain modelling. Biochimie 2008;90:626-39. [DOI: 10.1016/j.biochi.2007.11.007] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Received: 06/14/2007] [Accepted: 11/22/2007] [Indexed: 10/22/2022]

Wu S, Zhang Y. A comprehensive assessment of sequence-based and template-based methods for protein contact prediction. Bioinformatics 2008;24:924-31. [PMID: 18296462 DOI: 10.1093/bioinformatics/btn069] [Citation(s) in RCA: 151] [Impact Index Per Article: 9.4] [Indexed: 11/14/2022]

Abstract

MOTIVATION

Pair-wise residue-residue contacts in proteins can be predicted from both threading templates and sequence-based machine learning. However, most structure modeling approaches only use the template-based contact predictions in guiding the simulations; this is partly because the sequence-based contact predictions are usually considered to be less accurate than those from threading. With the rapid progress in sequence databases and machine-learning techniques, it is necessary to have a detailed and comprehensive assessment of the contact-prediction methods in different template conditions.

RESULTS

We develop two methods for protein-contact predictions: SVM-SEQ is a sequence-based machine learning approach which trains a variety of sequence-derived features on contact maps; SVM-LOMETS collects consensus contact predictions from multiple threading templates. We test both methods on the same set of 554 proteins which are categorized into 'Easy', 'Medium', 'Hard' and 'Very Hard' targets based on the evolutionary and structural distance between templates and targets. For the Easy and Medium targets, SVM-LOMETS obviously outperforms SVM-SEQ; but for the Hard and Very Hard targets, the accuracy of the SVM-SEQ predictions is higher than that of SVM-LOMETS by 12-25%. If we combine the SVM-SEQ and SVM-LOMETS predictions together, the total number of correctly predicted contacts in the Hard proteins will increase by more than 60% (or 70% for the long-range contact with a sequence separation ≥ 24), compared with SVM-LOMETS alone. The advantage of SVM-SEQ is also shown in the CASP7 free modeling targets where the SVM-SEQ is around four times more accurate than SVM-LOMETS in the long-range contact prediction. These data demonstrate that the state-of-the-art sequence-based contact prediction has reached a level which may be helpful in assisting tertiary structure modeling for the targets which do not have close structure templates. The maximum yield should be obtained by the combination of both sequence- and template-based predictions.

Izarzugaza JMG, Graña O, Tress ML, Valencia A, Clarke ND. Assessment of intramolecular contact predictions for CASP7. Proteins 2008;69 Suppl 8:152-8. [PMID: 17671976 DOI: 10.1002/prot.21637] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.9] [Indexed: 11/07/2022]