1
Homology Modeling and Analysis of Vacuolar Aspartyl Protease from a Novel Yeast Expression Host Meyerozyma guilliermondii Strain SO. Arabian Journal for Science and Engineering 2022. [DOI: 10.1007/s13369-022-07153-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
2
Minervini G, Quaglia F, Tosatto SCE. Computational analysis of prolyl hydroxylase domain-containing protein 2 (PHD2) mutations promoting polycythemia insurgence in humans. Sci Rep 2016; 6:18716. [PMID: 26754054 PMCID: PMC4709589 DOI: 10.1038/srep18716] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 08/13/2015] [Accepted: 11/06/2015] [Indexed: 12/18/2022] Open
Abstract
Idiopathic erythrocytosis is a rare disease characterized by an increase in red blood cell mass due to mutations in proteins of the oxygen-sensing pathway, such as prolyl hydroxylase 2 (PHD2). Here, we present a bioinformatics investigation of the pathological effect of twelve PHD2 mutations related to polycythemia insurgence. We show that few mutations impair the PHD2 catalytic site, while most localize to non-enzymatic regions. We also found that most mutations do not overlap the substrate recognition site, suggesting a novel PHD2 binding interface. After a structural analysis of both binding partners, we suggest that this novel interface is responsible for PHD2 interaction with the LIMD1 tumor suppressor.
Affiliation(s)
- Giovanni Minervini
- Department of Biomedical Sciences and CRIBI Biotechnology Center, University of Padova, Viale G. Colombo 3, 35121, Padova, Italy
- Federica Quaglia
- Department of Biomedical Sciences and CRIBI Biotechnology Center, University of Padova, Viale G. Colombo 3, 35121, Padova, Italy
- Silvio C E Tosatto
- Department of Biomedical Sciences and CRIBI Biotechnology Center, University of Padova, Viale G. Colombo 3, 35121, Padova, Italy; CNR Institute of Neuroscience, Viale G. Colombo 3, 35121, Padova, Italy
3
Abstract
Motivation: Alignment errors are still the main bottleneck for current template-based protein modeling (TM) methods, including protein threading and homology modeling, especially when the sequence identity between two proteins under consideration is low (<30%). Results: We present a novel protein threading method, CNFpred, which achieves much more accurate sequence–template alignment by employing a probabilistic graphical model called a Conditional Neural Field (CNF), which aligns one protein sequence to its remote template using a non-linear scoring function. This scoring function accounts for correlation among a variety of protein sequence and structure features, makes use of information in the neighborhood of two residues to be aligned, and is thus much more sensitive than the widely used linear or profile-based scoring function. To train this CNF threading model, we employ a novel quality-sensitive method, instead of the standard maximum-likelihood method, to maximize directly the expected quality of the training set. Experimental results show that CNFpred generates significantly better alignments than the best profile-based and threading methods on several public (but small) benchmarks as well as our own large dataset. CNFpred outperforms others regardless of the lengths or classes of proteins, and works particularly well for proteins with sparse sequence profiles due to the effective utilization of structure information. Our methodology can also be adapted to protein sequence alignment. Contact: j3xu@ttic.edu Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jianzhu Ma
- Toyota Technological Institute at Chicago, IL 60637, USA
4
Ballarin L, Franchi N, Schiavon F, Tosatto SCE, Mičetić I, Kawamura K. Looking for putative phenoloxidases of compound ascidians: haemocyanin-like proteins in Polyandrocarpa misakiensis and Botryllus schlosseri. Developmental and Comparative Immunology 2012; 38:232-242. [PMID: 22698614 DOI: 10.1016/j.dci.2012.05.008] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 01/18/2012] [Revised: 05/03/2012] [Accepted: 05/08/2012] [Indexed: 06/01/2023]
Abstract
Phenoloxidases (POs) and haemocyanins constitute a family of copper-containing proteins widely distributed among invertebrates. Both are able, under appropriate conditions, to convert polyphenols to quinones and induce cytotoxicity through the production of reactive oxygen species, a fundamental event in many immune responses. In ascidians, PO activity has been described and studied in both solitary and colonial species, and the enzyme is involved in inflammatory and cytotoxic reactions against foreign cells or molecules, and in the formation of the cytotoxic foci which characterise the nonfusion reaction of botryllids. Expressed genes for two putative POs (CiPO1 and CiPO2) have recently been identified in C. intestinalis. In the present study, we determined the cDNA sequences of two haemocyanin-like proteins (HLPs) from two colonial ascidians: Botryllus schlosseri from the Mediterranean Sea and Polyandrocarpa misakiensis from Japan. Multiple sequence alignments showed the similarity between the above sequences and crustacean proPOs, whereas the analysis of the three-dimensional structure reveals high similarity with arthropod haemocyanins, which share common precursors with arthropod proPOs. Botryllus HLP grouped in the same cluster with Ciona POs, whereas Polyandrocarpa HLP clustered with arthropod haemocyanins; all of them share the full conservation of the six histidines at the two copper-binding sites as well as of other motifs, also found in arthropod haemocyanin subunits, involved in the regulation of enzyme activity. In situ hybridisation indicated that the genes are transcribed inside morula cells, a characteristic haemocyte type in ascidians where PO activity is located, at the beginning of their differentiation. These results represent a first attempt to identify candidate molecules responsible for the PO activity in compound ascidians.
5
Poleksic A, Fienup M, Danzer JF, Debe DA. A different look at the quality of modeled three-dimensional protein structures. J Bioinform Comput Biol 2011; 6:335-45. [DOI: 10.1142/s0219720008003424] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 08/18/2007] [Revised: 11/14/2007] [Accepted: 12/05/2007] [Indexed: 11/18/2022]
Abstract
Measuring the accuracy of protein three-dimensional structures is one of the most important problems in protein structure prediction. For structure-based drug design, the accuracy of the binding site is far more important than the accuracy of any other region of the protein. We have developed an automated method for assessing the quality of a protein model by focusing on the set of residues in the small molecule binding site. Small molecule binding sites typically involve multiple regions of the protein coming together in space, and their accuracy has been observed to be sensitive to even small alignment errors. In addition, ligand binding sites contain the critical information required for drug design, making their accuracy particularly important. We analyzed the accuracy of the binding sites on two sets of protein models: the predictions submitted by the top-performing CASP7 groups, and the models generated by four widely used homology modeling packages. The results of our CASP7 analysis significantly differ from the previous findings, implying that the binding site measure does not correlate with the traditional model quality measures used in the structure prediction benchmarks. For the modeling programs, the resolution of binding sites is extremely sensitive to the degree of sequence homology between the query and the template, even when the most accurate alignments are used in the homology modeling process.
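To make the idea of a binding-site-focused quality measure concrete, here is a minimal sketch that computes RMSD over a chosen subset of residues only. It assumes the model and the native structure are already superposed and reduced to one coordinate per residue. The function name, the dictionary layout, and the residue numbers are hypothetical and are not taken from the method described in the abstract.

```python
import math

def binding_site_rmsd(model, native, site_residues):
    """RMSD restricted to binding-site residues.

    model, native: dict mapping residue number -> (x, y, z); the two
    structures are assumed to be superposed already (hypothetical layout).
    site_residues: iterable of residue numbers that form the site.
    """
    # Only residues present in both structures can be compared.
    shared = [r for r in site_residues if r in model and r in native]
    if not shared:
        raise ValueError("no binding-site residues shared by both structures")
    sq_sum = 0.0
    for r in shared:
        sq_sum += sum((a - b) ** 2 for a, b in zip(model[r], native[r]))
    return math.sqrt(sq_sum / len(shared))

# Toy coordinates (invented): residue 50 lies outside the assumed site.
native = {10: (0.0, 0.0, 0.0), 11: (1.0, 0.0, 0.0), 50: (5.0, 5.0, 5.0)}
model = {10: (0.0, 0.0, 0.0), 11: (1.0, 2.0, 0.0), 50: (9.0, 5.0, 5.0)}
print(binding_site_rmsd(model, native, [10, 11]))
```

Restricting the sum to the site residues means that a large error elsewhere in the model (residue 50 in the toy data) does not affect the score, which is the point of evaluating the binding site separately from global measures.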
Affiliation(s)
- Aleksandar Poleksic
- Computer Science Department, University of Northern Iowa, Cedar Falls, IA 50614, USA
- Mark Fienup
- Computer Science Department, University of Northern Iowa, Cedar Falls, IA 50614, USA
- Joseph F. Danzer
- Eidogen-Sertanty Inc., 9381 Judicial Dr., San Diego, CA 92121, USA
- Derek A. Debe
- Global Pharmaceutical Research and Development, Abbott Laboratories, Abbott Park, IL 60064, USA
6
Immune roles of a rhamnose-binding lectin in the colonial ascidian Botryllus schlosseri. Immunobiology 2011; 216:725-36. [DOI: 10.1016/j.imbio.2010.10.011] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Received: 10/19/2010] [Accepted: 10/29/2010] [Indexed: 02/07/2023]
7
Leonardi E, Martella M, Tosatto SC, Murgia A. Identification and In Silico Analysis of Novel von Hippel-Lindau (VHL) Gene Variants from a Large Population. Ann Hum Genet 2011; 75:483-96. [DOI: 10.1111/j.1469-1809.2011.00647.x] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Indexed: 11/28/2022]
8
Leonardi E, Andreazza S, Vanin S, Busolin G, Nobile C, Tosatto SCE. A computational model of the LGI1 protein suggests a common binding site for ADAM proteins. PLoS One 2011; 6:e18142. [PMID: 21479274 PMCID: PMC3066209 DOI: 10.1371/journal.pone.0018142] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Received: 11/12/2010] [Accepted: 02/23/2011] [Indexed: 01/06/2023] Open
Abstract
Mutations of the human leucine-rich glioma inactivated (LGI1) gene encoding the epitempin protein cause autosomal dominant temporal lateral epilepsy (ADTLE), a rare familial partial epileptic syndrome. The LGI1 gene seems to have a role in the transmission of neuronal messages, but the exact molecular mechanism remains unclear. In contrast to other genes involved in epileptic disorders, epitempin shows no homology with known ion channel genes but contains two domains, composed of repeated structural units, known to mediate protein-protein interactions. A three-dimensional in silico model of the two epitempin domains was built to predict the structure-function relationship and propose a functional model integrating previous experimental findings. Conserved and electrostatically charged regions of the model surface suggest a possible arrangement between the two domains and identify a possible ADAM protein binding site in the β-propeller domain and another protein binding site in the leucine-rich repeat domain. The functional model indicates that epitempin could mediate the interaction between proteins localized to different synaptic sides in a static way, by forming a dimer, or in a dynamic way, by binding proteins at different times. The model was also used to predict the effects of known disease-causing missense mutations. Most of the variants are predicted to alter protein folding, while several others map to functional surface regions. In agreement with experimental evidence, this suggests that non-secreted LGI1 mutants could be retained within the cell by quality control mechanisms or by altering interactions required for the secretion process.
Affiliation(s)
- Stefano Vanin
- Department of Biology, University of Padova, Padova, Italy
- School of Applied Science, University of Huddersfield, Huddersfield, United Kingdom
- Giorgia Busolin
- Institute of Neurosciences, Consiglio Nazionale delle Ricerche (CNR), Padova, Italy
- Carlo Nobile
- Institute of Neurosciences, Consiglio Nazionale delle Ricerche (CNR), Padova, Italy
9
Chen H, Kihara D. Effect of using suboptimal alignments in template-based protein structure prediction. Proteins 2011; 79:315-34. [PMID: 21058297 PMCID: PMC3058269 DOI: 10.1002/prot.22885] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Indexed: 12/28/2022]
Abstract
Computational protein structure prediction remains a challenging task in protein bioinformatics. In recent years, the importance of template-based structure prediction has been increasing because of the growing number of protein structures solved by the structural genomics projects. To capitalize on the significant efforts and investments made in the structural genomics projects, it is urgent to establish effective ways to use the solved structures as templates by developing methods for exploiting remotely related proteins that cannot be simply identified by homology. In this work, we examine the effect of using suboptimal alignments in template-based protein structure prediction. We show that suboptimal alignments are often more accurate than the optimal one, and that such accurate suboptimal alignments can occur even at a very low rank of the alignment score. Suboptimal alignments contain a significant number of correct amino acid residue contacts. Moreover, suboptimal alignments can improve template-based models when used as input to Modeller. Finally, we use suboptimal alignments for handling a contact potential in a probabilistic way in a threading program, SUPRB. The probabilistic contacts strategy outperforms the partly thawed approach, which only uses the optimal alignment in defining residue contacts, and also the re-ranking strategy, which uses the contact potential in re-ranking alignments. The comparison with existing methods in the template-recognition test shows that SUPRB is very competitive and outperforms existing methods.
Affiliation(s)
- Hao Chen
- Department of Biological Sciences, College of Science, Purdue University, West Lafayette, IN, 47907, USA
- Daisuke Kihara
- Department of Biological Sciences, College of Science, Purdue University, West Lafayette, IN, 47907, USA
- Department of Computer Science, College of Science, Purdue University, West Lafayette, IN, 47907, USA
- Markey Center for Structural Biology, College of Science, Purdue University, West Lafayette, IN, 47907, USA
10
Fogolari F, Tosatto SCE, Muraro L, Montecucco C. Electric dipole reorientation in the interaction of botulinum neurotoxins with neuronal membranes. FEBS Lett 2009; 583:2321-5. [PMID: 19576894 DOI: 10.1016/j.febslet.2009.06.046] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 05/22/2009] [Revised: 06/25/2009] [Accepted: 06/28/2009] [Indexed: 11/16/2022]
Abstract
Botulinum neurotoxins are highly potent toxins capable of rapid and specific interaction with the presynaptic membrane. We have hypothesised that: (1) these neurotoxins possess an electric dipole with the positive pole on the receptor binding domain Hc-C and that (2) on approaching the negatively charged presynaptic membrane, they reorient themselves and hit the membrane surface with Hc-C; this electrostatic effect would contribute to efficient binding. Electrostatic calculations confirm these hypotheses and strongly indicate that electrostatic effects can play an important role in the unique presynaptic membrane binding properties of these neurotoxins and, more generally, in the interaction of other plasma membrane protein ligands.
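For readers unfamiliar with the quantity involved, the sketch below computes the textbook electric dipole moment of a set of point charges, μ = Σ qᵢ (rᵢ − r_c), taken about the centre of geometry. It is only an illustration of the definition: the abstract does not describe how the electrostatic calculations were carried out, and the function name, charges, and coordinates are invented.

```python
def dipole_moment(charges, coords):
    """Dipole moment of point charges about their centre of geometry.

    charges: list of partial charges (e.g. in units of e).
    coords: list of (x, y, z) positions (e.g. in angstroms).
    For a molecule with non-zero net charge the dipole depends on the
    chosen origin, so the reference point is fixed to the centre of
    geometry here (an assumption of this sketch).
    """
    n = len(coords)
    if n == 0 or n != len(charges):
        raise ValueError("charges and coords must be non-empty and equal in length")
    center = tuple(sum(c[k] for c in coords) / n for k in range(3))
    return tuple(
        sum(q * (c[k] - center[k]) for q, c in zip(charges, coords))
        for k in range(3)
    )

# Two opposite unit charges 2 length units apart along x (toy data).
mu = dipole_moment([1.0, -1.0], [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)])
print(mu)
```

The resulting vector points from the negative toward the positive charge, so in the hypothesis above it would point toward the positively charged end of the toxin.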
Affiliation(s)
- Federico Fogolari
- Department of Biomedical Sciences and Technologies, University of Udine, Piazzale Kolbe 4, 33100 Udine, Italy
11
Benkert P, Schwede T, Tosatto SC. QMEANclust: estimation of protein model quality by combining a composite scoring function with structural density information. BMC Structural Biology 2009; 9:35. [PMID: 19457232 PMCID: PMC2709111 DOI: 10.1186/1472-6807-9-35] [Citation(s) in RCA: 112] [Impact Index Per Article: 7.5] [Received: 10/21/2008] [Accepted: 05/20/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND The selection of the most accurate protein model from a set of alternatives is a crucial step in protein structure prediction both in template-based and ab initio approaches. Scoring functions have been developed which can either return a quality estimate for a single model or derive a score from the information contained in the ensemble of models for a given sequence. Local structural features occurring more frequently in the ensemble have a greater probability of being correct. Within the context of the CASP experiment, these so-called consensus methods have been shown to perform considerably better in selecting good candidate models, but tend to fail if the best models are far from the dominant structural cluster. In this paper we show that model selection can be improved if both approaches are combined by pre-filtering the models used during the calculation of the structural consensus. RESULTS Our recently published QMEAN composite scoring function has been improved by including an all-atom interaction potential term. The preliminary model ranking based on the new QMEAN score is used to select a subset of reliable models against which the structural consensus score is calculated. This scoring function, called QMEANclust, achieves a correlation coefficient of predicted quality score and GDT_TS of 0.9 averaged over the 98 CASP7 targets and performs significantly better in selecting good models from the ensemble of server models than any other group participating in the quality estimation category of CASP7. Both scoring functions are also benchmarked on the MOULDER test set consisting of 20 target proteins, each with 300 alternative models generated by MODELLER. QMEAN outperforms all other tested scoring functions operating on individual models, while the consensus method QMEANclust only works properly on decoy sets containing a certain fraction of near-native conformations. We also present a local version of QMEAN for the per-residue estimation of model quality (QMEANlocal) and compare it to a new local consensus-based approach. CONCLUSION Improved model selection is obtained by using a composite scoring function operating on single models in order to enrich higher quality models which are subsequently used to calculate the structural consensus. The performance of consensus-based methods such as QMEANclust highly depends on the composition and quality of the model ensemble to be analysed. Therefore, performance estimates for consensus methods based on large meta-datasets (e.g. CASP) might overrate their applicability in more realistic modelling situations with smaller sets of models based on individual methods.
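The combination of single-model scoring and structural consensus described above can be sketched roughly as follows. This is a simplified illustration, not the published QMEANclust procedure: the function name, the `keep_fraction` parameter, and the toy similarity matrix are all assumptions, and any pairwise structural similarity measure in [0, 1] could stand in for the matrix entries.

```python
import math

def consensus_with_prefilter(single_scores, similarity, keep_fraction=0.5):
    """Consensus score computed against a pre-filtered reference subset.

    single_scores: per-model quality estimates (higher is better).
    similarity: square matrix of pairwise structural similarities.
    keep_fraction: share of top-ranked models kept as the reference
    set (hypothetical parameter of this sketch).
    """
    n = len(single_scores)
    if n == 0:
        raise ValueError("at least one model is required")
    n_keep = max(1, math.ceil(n * keep_fraction))
    # Rank models by their single-model score and keep the best ones.
    ranked = sorted(range(n), key=lambda i: single_scores[i], reverse=True)
    reference = ranked[:n_keep]
    scores = []
    for i in range(n):
        # A model is never compared with itself.
        others = [j for j in reference if j != i]
        if others:
            scores.append(sum(similarity[i][j] for j in others) / len(others))
        else:
            scores.append(0.0)
    return scores

# Toy ensemble (invented): models 2 and 3 resemble each other but
# receive poor single-model scores.
single = [0.9, 0.8, 0.2, 0.1]
sim = [
    [1.0, 0.8, 0.3, 0.2],
    [0.8, 1.0, 0.4, 0.1],
    [0.3, 0.4, 1.0, 0.9],
    [0.2, 0.1, 0.9, 1.0],
]
print(consensus_with_prefilter(single, sim))
print(consensus_with_prefilter(single, sim, keep_fraction=1.0))
```

With `keep_fraction=1.0` the function reduces to a plain consensus over the whole ensemble, in which the two mutually similar but poorly scored models support each other. Restricting the reference set to the top-ranked half removes that mutual support.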
Affiliation(s)
- Pascal Benkert
- Swiss Institute of Bioinformatics, Biozentrum, University of Basel, Klingelbergstrasse 50/70, 4056 Basel, Switzerland.
12
Bordoli L, Kiefer F, Arnold K, Benkert P, Battey J, Schwede T. Protein structure homology modeling using SWISS-MODEL workspace. Nat Protoc 2009; 4:1-13. [PMID: 19131951 DOI: 10.1038/nprot.2008.197] [Citation(s) in RCA: 912] [Impact Index Per Article: 60.8] [Indexed: 11/09/2022]
Abstract
Homology modeling aims to build three-dimensional protein structure models using experimentally determined structures of related family members as templates. SWISS-MODEL workspace is an integrated Web-based modeling expert system. For a given target protein, a library of experimental protein structures is searched to identify suitable templates. On the basis of a sequence alignment between the target protein and the template structure, a three-dimensional model for the target protein is generated. Model quality assessment tools are used to estimate the reliability of the resulting models. Homology modeling is currently the most accurate computational method to generate reliable structural models and is routinely used in many biological applications. Typically, the computational effort for a modeling project is less than 2 h. However, this does not include the time required for visualization and interpretation of the model, which may vary depending on personal experience working with protein structures.
Affiliation(s)
- Lorenza Bordoli
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH 4056 Basel, Switzerland
13
Ferretti M, Destro T, Tosatto SCE, La Rocca N, Rascio N, Masi A. Gamma-glutamyl transferase in the cell wall participates in extracellular glutathione salvage from the root apoplast. The New Phytologist 2009; 181:115-126. [PMID: 19076720 DOI: 10.1111/j.1469-8137.2008.02653.x] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 05/03/2023]
Abstract
The molecular properties and subcellular location of bound gamma-glutamyl transferase (GGT) were studied, and an experimental setup devised to assess its functions in barley roots. Enzyme histochemistry was used to detect GGT activity at tissue level; immunocytochemistry to localize the protein at subcellular level; and modelling studies to investigate its surface charge properties. GGT activity in vivo was measured for the first time. Functions were explored by applying chemical treatments with inhibitors and the thiol-oxidizing drug diamide, performing time-course chromatographic and spectrophotometric analyses on low-molecular-weight thiols. Gamma-glutamyl transferase activity was found to be high in the root apical region and the protein was anchored to root cell wall components, probably by basic amino acid residues. The results show that GGT is essential to the recovery of apoplastic glutathione provided exogenously or extruded by oxidative treatment. It is demonstrated that GGT activity helps to salvage extracellular glutathione and may contribute to redox control of the extracellular environment, thus providing evidence of a functional role for gamma-glutamyl cycle in roots.
Affiliation(s)
- M Ferretti, T Destro, S C E Tosatto, N La Rocca, N Rascio, A Masi
- Department of Agricultural Biotechnology, University of Padova, Viale dell'Università 16, I-35020 Legnaro (PD), Italy; Department of Biology, University of Padova, Via Trieste 75, I-35100 Padova, Italy; CRIBI Biotech Centre, University of Padova, Via Trieste 75, I-35100 Padova, Italy
14
Toppo S, Vanin S, Bosello V, Tosatto SCE. Evolutionary and structural insights into the multifaceted glutathione peroxidase (Gpx) superfamily. Antioxid Redox Signal 2008; 10:1501-14. [PMID: 18498225 DOI: 10.1089/ars.2008.2057] [Citation(s) in RCA: 163] [Impact Index Per Article: 10.2] [Indexed: 01/23/2023]
Abstract
Glutathione peroxidase (GPx) is a widespread protein superfamily found in many organisms throughout all kingdoms of life. Although it was initially thought to use only glutathione as reductant, recent evidence suggests that the majority of GPxs have specificity for thioredoxin. We present a thorough in silico analysis performed on 724 sequences and 12 structures aimed at clarifying the evolutionary, structural, and sequence determinants of GPx specificity. Structural variability was found to be limited to only two regions, termed oligomerization loop and functional helix, which modulate both reduced substrate specificity and oligomerization state. We show that mammalian GPx-1, the canonical selenocysteine-based tetrameric glutathione peroxidase, is a recent "invention" during evolution. Contrary to common belief, cysteine-based thioredoxin-specific GPxs, which we propose to call TGPx, are both more common and more ancient. This raises interesting evolutionary considerations regarding oligomerization and the use of an active-site selenocysteine residue. In addition, phylogenetic analysis has revealed the presence of a novel member belonging to the GPx superfamily in Mammalia and Amphibia, for which we propose the name GPx-8, following the present numeric order of the mammalian GPxs.
Affiliation(s)
- Stefano Toppo
- Department of Biological Chemistry, University of Padova, Italy.
15
Chen H, Kihara D. Estimating quality of template-based protein models by alignment stability. Proteins 2008; 71:1255-74. [PMID: 18041762 DOI: 10.1002/prot.21819] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Indexed: 12/26/2022]
Abstract
Error in protein tertiary structure prediction is unavoidable, but it is not explicitly shown by most current prediction algorithms. The estimated error of a predicted structure is crucial information for experimental biologists who use the prediction model for design and interpretation of experiments. Here, we propose a method to estimate errors in predicted structures based on the stability of the optimal target-template alignment when compared with a set of suboptimal alignments. The stability of the optimal alignment is quantified by an index named the SuboPtimal Alignment Diversity (SPAD). We implemented SPAD in a profile-based threading algorithm and investigated how well SPAD can indicate errors in threading models using a large benchmark dataset of 5232 alignments. SPAD shows a very good correlation not only to alignment shift errors but also to structure-level errors, namely the root mean square deviation (RMSD) of predicted structure models to the native structures (i.e. global errors), and local errors at each residue position. We have further compared SPAD with seven other quality measures, six sequence alignment-based measures and one atomic statistical potential, discrete optimized protein energy (DOPE), in terms of the correlation coefficient to the global and local structure-level errors. In terms of the correlation to the RMSD of structure models, when a target and a template are in the same SCOP family, the sequence identity showed the best correlation to the RMSD; at the superfamily level, SPAD was the best; and at the fold level, DOPE was best. However, in a head-to-head comparison, SPAD wins over the other measures. Next, SPAD is compared with three other measures of local errors. In this comparison, SPAD was best at all of the family, superfamily and fold levels. Using the discovered correlation, we have also predicted the global and local error of our predicted structures of CASP7 targets with SPAD. Finally, we propose a sausage representation of predicted tertiary structures which intuitively indicates the predicted structure and the estimated error range of the structure simultaneously.
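The abstract does not give the formula for SPAD, so the following is only a loose illustration of the underlying idea: measuring how much a set of suboptimal alignments departs from the optimal one. Alignments are represented here as lists of (target position, template position) pairs. That representation, the function name, and the averaging scheme are assumptions of this sketch rather than the published index.

```python
def alignment_diversity(optimal, suboptimals):
    """Average fraction of optimal aligned pairs missing from each
    suboptimal alignment (0 = fully stable, 1 = nothing shared).

    optimal: list of (target_pos, template_pos) pairs.
    suboptimals: list of alignments in the same representation.
    This is an illustrative stand-in, not the SPAD definition.
    """
    opt = set(optimal)
    if not opt or not suboptimals:
        raise ValueError("need a non-empty optimal alignment and at least one suboptimal")
    fractions = [len(opt - set(sub)) / len(opt) for sub in suboptimals]
    return sum(fractions) / len(fractions)

# Toy alignments (invented): the second suboptimal alignment shifts
# the last two residues by one template position.
optimal = [(0, 0), (1, 1), (2, 2), (3, 3)]
suboptimals = [
    [(0, 0), (1, 1), (2, 2), (3, 3)],
    [(0, 0), (1, 1), (2, 3), (3, 4)],
]
print(alignment_diversity(optimal, suboptimals))
```

A low value indicates that near-optimal alternatives agree with the optimal alignment, which is the kind of stability the abstract relates to smaller model errors.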
Affiliation(s)
- Hao Chen
- Department of Biological Sciences, College of Science, Purdue University, West Lafayette, Indiana 47907, USA
16
Benkert P, Tosatto SCE, Schomburg D. QMEAN: A comprehensive scoring function for model quality assessment. Proteins 2008; 71:261-77. [PMID: 17932912 DOI: 10.1002/prot.21715] [Citation(s) in RCA: 733] [Impact Index Per Article: 45.8] [Indexed: 11/06/2022]
Abstract
In protein structure prediction, a considerable number of alternative models are usually produced, from which the final model subsequently has to be selected. Thus, a scoring function for the identification of the best model within an ensemble of alternative models is a key component of most protein structure prediction pipelines. QMEAN, which stands for Qualitative Model Energy ANalysis, is a composite scoring function describing the major geometrical aspects of protein structures. Five different structural descriptors are used. The local geometry is analyzed by a new kind of torsion angle potential over three consecutive amino acids. A secondary structure-specific distance-dependent pairwise residue-level potential is used to assess long-range interactions. A solvation potential describes the burial status of the residues. Two simple terms describing the agreement of predicted and calculated secondary structure and solvent accessibility, respectively, are also included. A variety of different implementations are investigated and several approaches to combine and optimize them are discussed. QMEAN was tested on several standard decoy sets, including a molecular dynamics simulation decoy set, as well as on a comprehensive data set of a total of 22,420 models from server predictions for the 95 targets of CASP7. In a comparison to five well-established model quality assessment programs, QMEAN shows a statistically significant improvement over nearly all quality measures describing the ability of the scoring function to identify the native structure and to discriminate good from bad models. The three-residue torsion angle potential turned out to be very effective in recognizing the native fold.
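One simple way to combine several structural descriptors into a single score is a weighted sum, sketched below. The abstract only says that several ways of combining the terms were investigated, so this should be read as an illustration rather than the published QMEAN combination. The term names paraphrase the five descriptors listed above, while the weights, the term values, and the "higher is better" normalisation are invented.

```python
# Term names follow the five descriptors in the abstract; all numbers
# below are made up for illustration.
TERMS = ("torsion", "pairwise", "solvation", "ss_agreement", "acc_agreement")

def composite_score(terms, weights):
    """Weighted sum of per-model term scores.

    terms, weights: dicts keyed by the names in TERMS. Term values are
    assumed to be normalised so that higher means better (an assumption
    of this sketch; raw statistical potentials are usually energy-like).
    """
    missing = [t for t in TERMS if t not in terms or t not in weights]
    if missing:
        raise KeyError(f"missing terms: {missing}")
    return sum(weights[t] * terms[t] for t in TERMS)

def best_model(models, weights):
    """Return the name of the model with the highest composite score."""
    return max(models, key=lambda name: composite_score(models[name], weights))

weights = {t: 1.0 for t in TERMS}
models = {
    "model_a": {"torsion": 0.5, "pairwise": 0.5, "solvation": 0.5,
                "ss_agreement": 0.5, "acc_agreement": 0.5},
    "model_b": {"torsion": 0.75, "pairwise": 0.25, "solvation": 0.5,
                "ss_agreement": 0.75, "acc_agreement": 0.5},
}
print(best_model(models, weights))
```

In practice the weights would be fitted against a quality measure on a training set of models; equal weights are used here only to keep the example easy to follow.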
Affiliation(s)
- Pascal Benkert
- Institute for Biochemistry, University of Cologne, 50674 Cologne, Germany