201
Ferrada E, Vergara IA, Melo F. A knowledge-based potential with an accurate description of local interactions improves discrimination between native and near-native protein conformations. Cell Biochem Biophys 2007; 49:111-24. [PMID: 17906366 DOI: 10.1007/s12013-007-0050-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 03/07/2007] [Revised: 11/30/1999] [Accepted: 07/16/2007] [Indexed: 10/22/2022]
Abstract
The correct discrimination between native and near-native protein conformations is essential for achieving accurate computer-based protein structure prediction. However, this has proven to be a difficult task, since currently available physical energy functions, empirical potentials and statistical scoring functions are still limited in achieving this goal consistently. In this work, we assess and compare the ability of different full atom knowledge-based potentials to discriminate between native protein structures and near-native protein conformations generated by comparative modeling. Using a benchmark of 152 near-native protein models and their corresponding native structures that encompass several different folds, we demonstrate that the incorporation of close non-bonded pairwise atom terms improves the discriminating power of the empirical potentials. Since the direct and unbiased derivation of close non-bonded terms from current experimental data is not possible, we obtained and used those terms from the corresponding pseudo-energy functions of a non-local knowledge-based potential. It is shown that this methodology significantly improves the discrimination between native and near-native protein conformations, suggesting that a proper description of close non-bonded terms is important to achieve a more complete and accurate description of native protein conformations. Some external knowledge-based energy functions that are widely used in model assessment performed poorly, indicating that the benchmark of models and the specific discrimination task tested in this work constitutes a difficult challenge.
Affiliation(s)
- Evandro Ferrada
- Departamento de Genética Molecular y Microbiología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Alameda 340, Santiago, Chile

202
Fitzgerald JE, Jha AK, Colubri A, Sosnick TR, Freed KF. Reduced C(beta) statistical potentials can outperform all-atom potentials in decoy identification. Protein Sci 2007; 16:2123-39. [PMID: 17893359 PMCID: PMC2204143 DOI: 10.1110/ps.072939707] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Indexed: 10/22/2022]
Abstract
We developed a series of statistical potentials to recognize the native protein from decoys, particularly when using only a reduced representation in which each side chain is treated as a single C(beta) atom. Beginning with a highly successful all-atom statistical potential, the Discrete Optimized Protein Energy function (DOPE), we considered the implications of including additional information in the all-atom statistical potential and subsequently reducing to the C(beta) representation. One of the potentials includes interaction energies conditional on backbone geometries. A second potential separates sequence local from sequence nonlocal interactions and introduces a novel reference state for the sequence local interactions. The resultant potentials perform better than the original DOPE statistical potential in decoy identification. Moreover, even upon passing to a reduced C(beta) representation, these statistical potentials outscore the original (all-atom) DOPE potential in identifying native states for sets of decoys. Interestingly, the backbone-dependent statistical potential is shown to retain nearly all of the information content of the all-atom representation in the C(beta) representation. In addition, these new statistical potentials are combined with existing potentials to model hydrogen bonding, torsion energies, and solvation energies to produce even better performing potentials. The ability of the C(beta) statistical potentials to accurately represent protein interactions bodes well for computational efficiency in protein folding calculations using reduced backbone representations, while the extensions to DOPE illustrate general principles for improving knowledge-based potentials.
Affiliation(s)
- James E Fitzgerald
- Department of Physics, The University of Chicago, Chicago, Illinois 60637, USA

203
Abstract
Accurate and automated assessment of both geometrical errors and incompleteness of comparative protein structure models is necessary for an adequate use of the models. Here, we describe a composite score for discriminating between models with the correct and incorrect fold. To find an accurate composite score, we designed and applied a genetic algorithm method that searched for a most informative subset of 21 input model features as well as their optimized nonlinear transformation into the composite score. The 21 input features included various statistical potential scores, stereochemistry quality descriptors, sequence alignment scores, geometrical descriptors, and measures of protein packing. The optimized composite score was found to depend on (1) a statistical potential z-score for residue accessibilities and distances, (2) model compactness, and (3) percentage sequence identity of the alignment used to build the model. The accuracy of the composite score was compared with the accuracy of assessment by single and combined features as well as by other commonly used assessment methods. The testing set was representative of models produced by automated comparative modeling on a genomic scale. The composite score performed better than any other tested score in terms of the maximum correct classification rate (i.e., 3.3% false positives and 2.5% false negatives) as well as the sensitivity and specificity across the whole range of thresholds. The composite score was implemented in our program MODELLER-8 and was used to assess models in the MODBASE database that contains comparative models for domains in approximately 1.3 million protein sequences.
Affiliation(s)
- Francisco Melo
- Departamento de Genética Molecular y Microbiología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Santiago, Chile.

204
Parthiban V, Gromiha MM, Abhinandan M, Schomburg D. Computational modeling of protein mutant stability: analysis and optimization of statistical potentials and structural features reveal insights into prediction model development. BMC Struct Biol 2007; 7:54. [PMID: 17705837 PMCID: PMC2000882 DOI: 10.1186/1472-6807-7-54] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.7] [Received: 01/26/2007] [Accepted: 08/16/2007] [Indexed: 02/02/2023]
Abstract
Background Understanding and predicting protein stability upon point mutations has wide-spread importance in molecular biology. Several prediction models have been developed in the past with various algorithms. Statistical potentials are one of the widely used algorithms for the prediction of changes in stability upon point mutations. Although the methods provide flexibility and the capability to develop an accurate and reliable prediction model, it can be achieved only by the right selection of the structural factors and optimization of their parameters for the statistical potentials. In this work, we have selected five atom classification systems and compared their efficiency for the development of amino acid atom potentials. Additionally, torsion angle potentials have been optimized to include the orientation of amino acids in such a way that altered backbone conformation in different secondary structural regions can be included for the prediction model. This study also elaborates the importance of classifying the mutations according to their solvent accessibility and secondary structure specificity. The prediction efficiency has been calculated individually for the mutations in different secondary structural regions and compared. Results Results show that, in addition to using an advanced atom description, stepwise regression and selection of atoms are necessary to avoid the redundancy in atom distribution and improve the reliability of the prediction model validation. Comparing to other atom classification models, Melo-Feytmans model shows better prediction efficiency by giving a high correlation of 0.85 between experimental and theoretical ΔΔG with 84.06% of the mutations correctly predicted out of 1538 mutations. The theoretical ΔΔG values for the mutations in partially buried β-strands generated by the structural training dataset from PISCES gave a correlation of 0.84 without performing the Gaussian apodization of the torsion angle distribution. 
After Gaussian apodization, the correlation increased to 0.92 and the prediction accuracy increased from 80% to 88.89%. Conclusion These findings were useful for optimizing the Melo-Feytmans atom classification system and for implementing it in the development of the statistical potentials. It was also significant that the prediction efficiency for mutations in the partially buried β-strands improves with Gaussian apodization of the torsion angle distribution. All these comparisons and optimization techniques demonstrate their advantages as well as their restrictions for the development of the prediction model. These findings should be helpful not only for protein stability prediction, but also for various structure solutions in the future.
Affiliation(s)
- Vijaya Parthiban
- Cologne University Bioinformatics Center, International Max Planck Research School, Cologne, Germany
- M Michael Gromiha
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Japan
- Madenhalli Abhinandan
- Cologne University Bioinformatics Center, International Max Planck Research School, Cologne, Germany
- Dietmar Schomburg
- Cologne University Bioinformatics Center, International Max Planck Research School, Cologne, Germany
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Japan

205
Ferrada E, Melo F. Nonbonded terms extrapolated from nonlocal knowledge-based energy functions improve error detection in near-native protein structure models. Protein Sci 2007; 16:1410-21. [PMID: 17586774 PMCID: PMC2206707 DOI: 10.1110/ps.062735907] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 10/23/2022]
Abstract
The accurate assessment of structural errors plays a key role in protein structure prediction, constitutes the first step of protein structure refinement, and has a major impact on subsequent functional inference from structural data. In this study, we assess and compare the ability of different full atom knowledge-based potentials to detect small and localized errors in comparative protein structure models of known accuracy. We have evaluated the effect of incorporating close nonbonded pairwise atom terms on the task of classifying residue modeling accuracy. Since the direct and unbiased derivation of close nonbonded terms from current experimental data is not possible, we extrapolated those terms from the corresponding pseudo-energy functions of a nonlocal knowledge-based potential. It is shown that this methodology clearly improves the detection of errors in protein models, suggesting that a proper description of close nonbonded terms is important to achieve a more complete and accurate description of native protein conformations. The use of close nonbonded terms directly derived from experimental data exhibited a poor performance, demonstrating that these terms cannot be accurately obtained by using the current data and methodology. Some external knowledge-based energy functions that are widely used in model assessment also performed poorly, which suggests that the benchmark of models and the specific error detection task tested in this study constituted a difficult challenge. The methodology presented here could be useful to detect localized structural errors not only in high-quality protein models, but also in experimental protein structures.
Affiliation(s)
- Evandro Ferrada
- Departamento de Genética Molecular y Microbiología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Santiago, Chile

206
Wu Y, Lu M, Chen M, Li J, Ma J. OPUS-Ca: a knowledge-based potential function requiring only Calpha positions. Protein Sci 2007; 16:1449-63. [PMID: 17586777 PMCID: PMC2206690 DOI: 10.1110/ps.072796107] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Indexed: 10/23/2022]
Abstract
In this paper, we report a knowledge-based potential function, named the OPUS-Ca potential, that requires only Calpha positions as input. The contributions from other atomic positions were established from pseudo-positions artificially built from a Calpha trace for auxiliary purposes. The potential function is formed based on seven major representative molecular interactions in proteins: distance-dependent pairwise energy with orientational preference, hydrogen bonding energy, short-range energy, packing energy, tri-peptide packing energy, three-body energy, and solvation energy. From the testing of decoy recognition on a number of commonly used decoy sets, it is shown that the new potential function outperforms all known Calpha-based potentials and most other coarse-grained ones that require more information than Calpha positions. We hope that this potential function adds a new tool for protein structural modeling.
Affiliation(s)
- Yinghao Wu
- Department of Bioengineering, Rice University, Houston, TX 77005, USA

207
Rykunov D, Fiser A. Effects of amino acid composition, finite size of proteins, and sparse statistics on distance-dependent statistical pair potentials. Proteins 2007; 67:559-68. [PMID: 17335003 DOI: 10.1002/prot.21279] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Indexed: 11/06/2022]
Abstract
Statistical distance-dependent pair potentials are frequently used in a variety of folding, threading, and modeling studies of proteins. The applicability of these types of potentials is tightly connected to the reliability of statistical observations. We explored the possible origin and extent of false positive signals in statistical potentials by analyzing their distance dependence in a variety of randomized protein-like models. While on average potentials derived from such models are expected to equal zero at any distance, we demonstrate that systematic and significant distortions exist. These distortions originate from the limited statistical counts in local environments of proteins and from the limited size of protein structures at large distances. We suggest that these systematic errors in statistical potentials are connected to the dependence of amino acid composition on protein size and to variation in protein sizes. Additionally, atom-based potentials are dominated by a false positive signal that is due to correlation among distances measured from atoms of one residue to atoms of another residue. The significance of residue-based pairwise potentials at various spatial pair separations was assessed in this study and it was found that as few as approximately 50% of potential values were statistically significant at distances below 4 Å, and only at most approximately 80% of them were significant at larger pair separations. A new definition for reference state, free of the observed systematic errors, is suggested. It has been demonstrated to generate statistical potentials that compare favorably to other publicly available ones.
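For orientation, the kind of distance-dependent pair potential discussed in this abstract is commonly written as an inverse-Boltzmann pseudo-energy, E(r) = -kT ln(N_obs(r) / N_exp(r)). The sketch below is only an illustration of that general form, not the reference state proposed by the authors. The function name `pair_potential`, the pseudocount, the bin counts, and the value kT = 0.593 kcal/mol (roughly room temperature) are all assumptions made for the example.

```python
import math


def pair_potential(observed, expected, kT=0.593, pseudocount=1.0):
    """Inverse-Boltzmann pseudo-energy for each distance bin.

    observed: contact counts per distance bin from a structure set.
    expected: counts per bin predicted by some reference state.
    pseudocount: hypothetical smoothing term that softens the effect
        of sparsely populated bins (an illustrative choice only).
    """
    energies = []
    for n_obs, n_exp in zip(observed, expected):
        ratio = (n_obs + pseudocount) / (n_exp + pseudocount)
        energies.append(-kT * math.log(ratio))
    return energies


# Made-up counts for four distance bins of one residue pair type.
observed = [2, 30, 55, 40]
expected = [10.0, 30.0, 40.0, 40.0]
energies = pair_potential(observed, expected)
```

Bins where the observed count matches the reference yield a pseudo-energy of zero, which is the behavior the abstract says randomized models should show on average. Bins with very few counts are the ones most exposed to the sparse-statistics distortions the abstract describes.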
Affiliation(s)
- Dmitry Rykunov
- Department of Biochemistry, Seaver Center for Bioinformatics, Albert Einstein College of Medicine, Bronx, New York 10461, USA

208
Zhang Z, Chen H, Lai L. Identification of amyloid fibril-forming segments based on structure and residue-based statistical potential. Bioinformatics 2007; 23:2218-25. [PMID: 17599928 DOI: 10.1093/bioinformatics/btm325] [Citation(s) in RCA: 83] [Impact Index Per Article: 4.9] [Indexed: 12/22/2022]
Abstract
MOTIVATION Experimental evidence suggests that certain short protein segments have stronger amyloidogenic propensities than others. Identification of the fibril-forming segments of proteins is crucial for understanding diseases associated with protein misfolding and for finding favorable targets for therapeutic strategies. RESULT In this study, we used the microcrystal structure of the NNQQNY peptide from yeast prion protein and residue-based statistical potentials to establish an algorithm to identify the amyloid fibril-forming segment of proteins. Using the same sets of sequences, a comparable prediction performance was obtained from this study to that from 3D profile method based on the physical atomic-level potential ROSETTADESIGN. The predicted results are consistent with experiments for several representative proteins associated with amyloidosis, and also agree with the idea that peptides that can form fibrils may have strong sequence signatures. Application of the residue-based statistical potentials is computationally more efficient than using atomic-level potentials and can be applied in whole proteome analysis to investigate the evolutionary pressure effect or forecast other latent diseases related to amyloid deposits. AVAILABILITY The fibril prediction program is available at ftp://mdl.ipc.pku.edu.cn/pub/software/pre-amyl/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Zhuqing Zhang
- Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China

209
Fasnacht M, Zhu J, Honig B. Local quality assessment in homology models using statistical potentials and support vector machines. Protein Sci 2007; 16:1557-68. [PMID: 17600147 PMCID: PMC2203356 DOI: 10.1110/ps.072856307] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Indexed: 10/23/2022]
Abstract
In this study, we address the problem of local quality assessment in homology models. As a prerequisite for the evaluation of methods for predicting local model quality, we first examine the problem of measuring local structural similarities between a model and the corresponding native structure. Several local geometric similarity measures are evaluated. Two methods based on structural superposition are found to best reproduce local model quality assessments by human experts. We then examine the performance of state-of-the-art statistical potentials in predicting local model quality on three qualitatively distinct data sets. The best statistical potential, DFIRE, is shown to perform on par with the best current structure-based method in the literature, ProQres. A combination of different statistical potentials and structural features using support vector machines is shown to provide somewhat improved performance over published methods.
Affiliation(s)
- Marc Fasnacht
- Howard Hughes Medical Institute at Columbia University, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, New York, New York 10032, USA

210
Bockhorst J, Lu F, Janes JH, Keebler J, Gamain B, Awadalla P, Su XZ, Samudrala R, Jojic N, Smith JD. Structural polymorphism and diversifying selection on the pregnancy malaria vaccine candidate VAR2CSA. Mol Biochem Parasitol 2007; 155:103-12. [PMID: 17669514 DOI: 10.1016/j.molbiopara.2007.06.007] [Citation(s) in RCA: 94] [Impact Index Per Article: 5.5] [Received: 04/07/2007] [Revised: 06/11/2007] [Accepted: 06/12/2007] [Indexed: 01/08/2023]
Abstract
VAR2CSA is the main candidate for a pregnancy malaria vaccine, but vaccine development may be complicated by sequence polymorphism. Here, we obtained partial or full-length var2CSA sequences from 106 parasites and applied novel computational methods and three-dimensional modeling to investigate VAR2CSA geographic variation and selection pressure. Our analysis reveals structural patterns of VAR2CSA sequence variation in which polymorphic sites group into segments of limited diversity. Within these segments, two or three basic types characterize a substantial majority of the parasite samples. Comparison to the primate malaria Plasmodium reichenowi shows that these basic types have ancient origins. Globally, var2CSA genes are comprised of a mosaic of these ancestral polymorphic segments that have recombined extensively between var2CSA alleles. Three-dimensional modeling reveals that polymorphic segments concentrate in flexible loops at characteristic locations in the six VAR2CSA Duffy binding-like (DBL) adhesion domains. Individual DBL domain surfaces have distinct patterns of diversifying selection, suggesting that limited and differing portions of each DBL domain are targeted by host antibody. Since standard phylogenetic tree analysis is inadequate for highly recombining genes like var2CSA, we developed a novel phylogenetic approach that incorporates recombination and tracks new mutations in segment types. In the resulting tree, P. reichenowi is confirmed as an outlier and African and Asian P. falciparum isolates have slightly diverged. These findings validate a new approach to modeling protein evolution in the presence of frequent recombination and provide a clearer understanding of how var gene products function as immunoevasive binding ligands.
MESH Headings
- Animals
- Antigens, Protozoan/chemistry
- Antigens, Protozoan/genetics
- Antigens, Protozoan/immunology
- Computational Biology/methods
- DNA, Protozoan/chemistry
- DNA, Protozoan/genetics
- Female
- Geography
- Humans
- Malaria/immunology
- Malaria/parasitology
- Malaria Vaccines/immunology
- Models, Molecular
- Molecular Sequence Data
- Phylogeny
- Plasmodium falciparum/genetics
- Plasmodium falciparum/isolation & purification
- Polymorphism, Genetic
- Pregnancy
- Pregnancy Complications, Parasitic/immunology
- Pregnancy Complications, Parasitic/prevention & control
- Protein Structure, Tertiary
- Selection, Genetic
- Sequence Analysis, DNA
- Sequence Homology, Amino Acid

211
Yang Y, Liu H. Genetic algorithms for protein conformation sampling and optimization in a discrete backbone dihedral angle space. J Comput Chem 2007; 27:1593-602. [PMID: 16868993 DOI: 10.1002/jcc.20463] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/11/2022]
Abstract
We have investigated protein conformation sampling and optimization based on the genetic algorithm and discrete main chain dihedral state model. An efficient approach combining the genetic algorithm with local minimization and with a niche technique based on the sharing function is proposed. Using two different types of potential energy functions, a Go-type potential function and a knowledge-based pairwise potential energy function, and a test set containing small proteins of varying sizes and secondary structure compositions, we demonstrated the importance of local minimization and population diversity in protein conformation optimization with genetic algorithms. Some general properties of the sampled conformations such as their native-likeness and the influences of including side-chains are discussed.
Affiliation(s)
- Yuedong Yang
- Hefei National Laboratory for Physical Sciences, Key Laboratory of Structural Biology, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230026, People's Republic of China

212
Hartmann C, Antes I, Lengauer T. IRECS: a new algorithm for the selection of most probable ensembles of side-chain conformations in protein models. Protein Sci 2007; 16:1294-307. [PMID: 17567749 PMCID: PMC2206697 DOI: 10.1110/ps.062658307] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Indexed: 10/23/2022]
Abstract
We introduce a new algorithm, IRECS (Iterative REduction of Conformational Space), for identifying ensembles of most probable side-chain conformations for homology modeling. On the basis of a given rotamer library, IRECS ranks all side-chain rotamers of a protein according to the probability with which each side chain adopts the respective rotamer conformation. This ranking enables the user to select small rotamer sets that are most likely to contain a near-native rotamer for each side chain. IRECS can therefore act as a fast heuristic alternative to the Dead-End-Elimination algorithm (DEE). In contrast to DEE, IRECS allows for the selection of rotamer subsets of arbitrary size, thus being able to define structure ensembles for a protein. We show that the selection of more than one rotamer per side chain is generally meaningful, since the selected rotamers represent the conformational space of flexible side chains. A knowledge-based statistical potential ROTA was constructed for the IRECS algorithm. The potential was optimized to discriminate between side-chain conformations of native and rotameric decoys of protein structures. By restricting the number of rotamers per side chain to one, IRECS can optimize side chains for a single conformation model. The average accuracy of IRECS for the chi1 and chi1+2 dihedral angles amounts to 84.7% and 71.6%, respectively, using a 40 degrees cutoff. When we compared IRECS with SCWRL and SCAP, the performance of IRECS was comparable to that of both methods. IRECS and the ROTA potential are available for download from the URL http://irecs.bioinf.mpi-inf.mpg.de.
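The chi1 and chi1+2 accuracies quoted in this abstract are the fractions of side chains whose predicted dihedral angles fall within a cutoff (here 40 degrees) of the native values. A minimal sketch of that bookkeeping follows; it is not the IRECS evaluation code. The helper names `angle_diff` and `chi_accuracy` and the sample angles are hypothetical, and the sketch ignores the symmetry handling that terminal chi angles of residues such as Phe, Tyr, and Asp usually need.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def chi_accuracy(predicted, native, cutoff=40.0):
    """Return (chi1 accuracy, chi1+2 accuracy) as fractions.

    predicted, native: lists of (chi1, chi2) tuples, one per residue;
    chi2 is None for side chains that have no second dihedral.
    A residue counts toward chi1+2 only if both angles are within the cutoff.
    """
    chi1_ok = 0
    chi12_ok = 0
    chi12_total = 0
    for (p1, p2), (n1, n2) in zip(predicted, native):
        ok1 = angle_diff(p1, n1) <= cutoff
        chi1_ok += ok1
        if p2 is not None and n2 is not None:
            chi12_total += 1
            chi12_ok += ok1 and angle_diff(p2, n2) <= cutoff
    chi1_acc = chi1_ok / len(native)
    chi12_acc = chi12_ok / chi12_total if chi12_total else float("nan")
    return chi1_acc, chi12_acc


# Made-up angles for four residues; the third has no chi2.
predicted = [(-60.0, 180.0), (175.0, 60.0), (65.0, None), (-170.0, -80.0)]
native = [(-70.0, 170.0), (-178.0, 120.0), (-60.0, None), (-175.0, -65.0)]
chi1_acc, chi12_acc = chi_accuracy(predicted, native)
```

The periodic difference matters: 175 degrees and -178 degrees are only 7 degrees apart, so a naive subtraction would wrongly count that rotamer as a miss.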

213
Berube PM, Samudrala R, Stahl DA. Transcription of all amoC copies is associated with recovery of Nitrosomonas europaea from ammonia starvation. J Bacteriol 2007; 189:3935-44. [PMID: 17384196 PMCID: PMC1913382 DOI: 10.1128/jb.01861-06] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.3] [Received: 12/11/2006] [Accepted: 03/14/2007] [Indexed: 11/20/2022]
Abstract
The chemolithotrophic ammonia-oxidizing bacterium Nitrosomonas europaea is known to be highly resistant to starvation conditions. The transcriptional response of N. europaea to ammonia addition following short- and long-term starvation was examined by primer extension and S1 nuclease protection analyses of genes encoding enzymes for ammonia oxidation (amoCAB operons) and CO(2) fixation (cbbLS), a third, lone copy of amoC (amoC(3)), and two representative housekeeping genes (glyA and rpsJ). Primer extension analysis of RNA isolated from growing, starved, and recovering cells revealed two differentially regulated promoters upstream of the two amoCAB operons. The distal sigma(70) type amoCAB promoter was constitutively active in the presence of ammonia, but the proximal promoter was only active when cells were recovering from ammonia starvation. The lone, divergent copy of amoC (amoC(3)) was expressed only during recovery. Both the proximal amoC(1,2) promoter and the amoC(3) promoter are similar to gram-negative sigma(E) promoters, thus implicating sigma(E) in the regulation of the recovery response. Although modeling of subunit interactions suggested that a nonconservative proline substitution in AmoC(3) may modify the activity of the holoenzyme, characterization of a DeltaamoC(3) strain showed no significant difference in starvation recovery under conditions evaluated. In contrast to the amo transcripts, a delayed appearance of transcripts for a gene required for CO(2) fixation (cbbL) suggested that its transcription is retarded until sufficient energy is available. Overall, these data revealed a programmed exit from starvation likely involving regulation by sigma(E) and the coordinated regulation of catabolic and anabolic genes.
Affiliation(s)
- Paul M Berube
- Department of Microbiology, University of Washington, Seattle, WA 98195-2700, USA

214
Montero-Morán GM, Li M, Rendòn-Huerta E, Jourdan F, Lowe DJ, Stumpff-Kane AW, Feig M, Scazzocchio C, Hausinger RP. Purification and characterization of the FeII- and alpha-ketoglutarate-dependent xanthine hydroxylase from Aspergillus nidulans. Biochemistry 2007; 46:5293-304. [PMID: 17429948 PMCID: PMC2525507 DOI: 10.1021/bi700065h] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Indexed: 12/12/2022]
Abstract
His6-tagged xanthine/α-ketoglutarate (αKG) dioxygenase (XanA) of Aspergillus nidulans was purified from both the fungal mycelium and recombinant Escherichia coli cells, and the properties of the two forms of the protein were compared. Evidence was obtained for both N- and O-linked glycosylation on the fungus-derived XanA, which aggregates into an apparent dodecamer, while bacterium-derived XanA is free of glycosylation and behaves as a monomer. Immunological methods identify phosphothreonine in both forms of XanA, with phosphoserine also detected in the bacterium-derived protein. Mass spectrometric analysis confirms glycosylation and phosphorylation of the fungus-derived sample, which also undergoes extensive truncation at its amino terminus. Despite the major differences in the properties of these proteins, their kinetic parameters are similar (kcat = 30-70 s⁻¹, Km of αKG = 31-50 μM, Km of xanthine approximately 45 μM, and pH optima at 7.0-7.4). The enzyme exhibits no significant isotope effect when [8-²H]xanthine is used; however, it demonstrates a 2-fold solvent deuterium isotope effect. CuII and ZnII potently inhibit the FeII-specific enzyme, whereas CoII, MnII, and NiII are weaker inhibitors. NaCl decreases the kcat and increases the Km of both αKG and xanthine. The αKG cosubstrate can be substituted with α-ketoadipate (9-fold decrease in kcat and 5-fold increase in the Km compared to those of the normal α-keto acid), while the αKG analogue N-oxalylglycine is a competitive inhibitor (Ki = 0.12 μM). No alternative purines effectively substitute for xanthine as a substrate, and only one purine analogue (6,8-dihydroxypurine, DHP) results in significant inhibition. Quenching of the endogenous fluorescence of the two enzyme forms by xanthine, αKG, and DHP was used to characterize their binding properties. A XanA homology model was generated on the basis of the structure of the related enzyme TauD (PDB entry 1OS7) and provided insights into the sites of posttranslational modification and substrate binding. These studies represent the first biochemical characterization of purified xanthine/αKG dioxygenase.
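The kinetic constants quoted in this abstract can be read through the standard Michaelis-Menten rate law with a competitive inhibitor. The following is a minimal sketch, not taken from the paper: the function name is hypothetical, and the default kcat and Km are illustrative picks from within the reported ranges for the alpha-ketoglutarate cosubstrate.

```python
def competitive_rate(s_um, i_um, kcat=50.0, km_um=40.0, ki_um=0.12):
    """Turnover per enzyme (s^-1) under competitive inhibition.

    Textbook form: v/[E] = kcat * S / (Km * (1 + I/Ki) + S).
    Defaults are illustrative values chosen inside the ranges quoted in
    the abstract (kcat 30-70 s^-1, Km 31-50 uM, Ki 0.12 uM); they are
    assumptions, not fitted parameters.
    """
    apparent_km = km_um * (1.0 + i_um / ki_um)
    return kcat * s_um / (apparent_km + s_um)


# With S = Km and no inhibitor, the rate is half of kcat.
uninhibited = competitive_rate(s_um=40.0, i_um=0.0)
# With I = Ki, the apparent Km doubles and the rate drops to kcat / 3.
inhibited = competitive_rate(s_um=40.0, i_um=0.12)
```

In this model a competitive inhibitor raises only the apparent Km, so a sufficiently high substrate concentration would still approach kcat.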
Affiliation(s)
- Gabriela M Montero-Morán
- Institut de Génétique et de Microbiologie, Université Paris-Sud, Bâtiment 409, UMR 8621 CNRS, 91405 Orsay Cedex, France
|
215
|
Rakhmanov SV, Makeev VJ. Atomic hydration potentials using a Monte Carlo Reference State (MCRS) for protein solvation modeling. BMC STRUCTURAL BIOLOGY 2007; 7:19. [PMID: 17397537 PMCID: PMC1852318 DOI: 10.1186/1472-6807-7-19] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/15/2006] [Accepted: 03/30/2007] [Indexed: 11/10/2022]
Abstract
Background: Accurate description of protein interaction with aqueous solvent is crucial for modeling of protein folding, protein-protein interaction, and drug design. Efforts to build a working description of solvation, both by continuous models and by molecular dynamics, yield controversial results. Specifically constructed knowledge-based potentials appear to be promising for accounting for the solvation at the molecular level, yet have not been used for this purpose. Results: We developed original knowledge-based potentials to study protein hydration at the level of atom contacts. The potentials were obtained using a new Monte Carlo reference state (MCRS), which simulates the expected probability density of atom-atom contacts via exhaustive sampling of structure space with random probes. Using the MCRS allowed us to calculate the expected atom contact densities with high resolution over a broad distance range including very short distances. Knowledge-based potentials for hydration of protein atoms of different types were obtained based on frequencies of their contacts at different distances with protein-bound water molecules, in a non-redundant training database of 1776 proteins with known 3D structures. Protein hydration sites were predicted in a test set of 12 proteins with experimentally determined water locations. The MCRS greatly improves prediction of water locations over existing methods. In addition, the contribution of the energy of macromolecular solvation to the total folding free energy was estimated, and tested in fold recognition experiments. The correct folds were preferred over all the misfolded decoys for the majority of proteins from the improved Rosetta decoy set based on the structure hydration energy alone. Conclusion: MCRS atomic hydration potentials provide a detailed distance-dependent description of hydropathies of individual protein atoms. This allows placement of water molecules on the surface of proteins and in protein interfaces with much higher precision. The potentials provide a means to estimate the total solvation energy for a protein structure, in many cases achieving a successful fold recognition. Possible applications of atomic hydration potentials to structure verification, protein folding and stability, and protein-protein interactions are discussed.
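Knowledge-based potentials of this kind are commonly obtained by comparing observed contact frequencies with those expected under a reference state, using the inverse-Boltzmann relation E(r) = -kT ln(P_obs(r) / P_ref(r)). The abstract does not give the paper's exact formula, so the sketch below shows only the generic relation; the function name, the pseudocount, and the toy histograms are assumptions for illustration.

```python
import math


def contact_potential(observed, reference, kT=1.0, pseudocount=1e-6):
    """Per-bin pseudo-energy E(r) = -kT * ln(P_obs(r) / P_ref(r)).

    observed:  contact counts per distance bin from known structures.
    reference: counts per bin from a reference state (a stand-in here
               for what random-probe sampling would produce).
    The pseudocount (an assumption) avoids log(0) for empty bins.
    """
    obs_total = sum(observed)
    ref_total = sum(reference)
    energies = []
    for obs, ref in zip(observed, reference):
        p_obs = obs / obs_total + pseudocount
        p_ref = ref / ref_total + pseudocount
        energies.append(-kT * math.log(p_obs / p_ref))
    return energies


# Toy histogram: the middle bin is enriched relative to the reference,
# so it receives a negative (favourable) pseudo-energy.
energies = contact_potential(observed=[10, 60, 30], reference=[20, 40, 40])
```

The quality of the reference counts determines the quality of the potential, which is why the choice of reference state is the central design decision in this family of methods.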
Affiliation(s)
- Sergei V Rakhmanov
- Institute of Genetics and Selection of Industrial Microorganisms, State Research Centre GosNIIgenetika, 1Dorozhny proezd, 1, Moscow, Russia
- Vsevolod J Makeev
- Institute of Genetics and Selection of Industrial Microorganisms, State Research Centre GosNIIgenetika, 1Dorozhny proezd, 1, Moscow, Russia
- Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilova str. 32, Moscow, Russia
|
216
|
Fogolari F, Pieri L, Dovier A, Bortolussi L, Giugliarelli G, Corazza A, Esposito G, Viglino P. Scoring predictive models using a reduced representation of proteins: model and energy definition. BMC STRUCTURAL BIOLOGY 2007; 7:15. [PMID: 17378941 PMCID: PMC1854906 DOI: 10.1186/1472-6807-7-15] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 09/28/2006] [Accepted: 03/23/2007] [Indexed: 11/25/2022]
Abstract
Background: Reduced representations of proteins have been playing a key role in the study of protein folding. Many such models are available, with different representation detail. Although the usefulness of many such models for structural bioinformatics applications has been demonstrated in recent years, there are few intermediate resolution models endowed with an energy model capable, for instance, of detecting native or native-like structures among decoy sets. The aim of the present work is to provide a discrete empirical potential for a reduced protein model termed here PC2CA, because it employs a PseudoCovalent structure with only 2 Centers of interactions per Amino acid, suitable for protein model quality assessment. Results: All protein structures in the set top500H have been converted to reduced form. The distributions of pseudobonds, pseudoangles, pseudodihedrals and distances between centers of interactions have been converted into potentials of mean force. A suitable reference distribution has been defined for non-bonded interactions which takes into account excluded volume effects and protein finite size. The correlation between adjacent main chain pseudodihedrals has been converted into an additional energetic term which is able to account for cooperative effects in secondary structure elements. Local energy surface exploration is performed in order to increase the robustness of the energy function. Conclusion: The model and the energy definition proposed have been tested on all the multiple decoy sets in the Decoys 'R' Us database. The energetic model is able to recognize, for almost all sets, native-like structures (RMSD less than 2.0 Å). These results and those obtained in the blind CASP7 quality assessment experiment suggest that the model compares well with scoring potentials with finer granularity and could be useful for fast exploration of conformational space. Parameters are available at the url: .
Affiliation(s)
- Federico Fogolari
- Dipartimento di Scienze e Tecnologie Biomediche, Università di Udine, P.le Kolbe 4, 33100 Udine, Italy
- Lidia Pieri
- Dipartimento di Scienze e Tecnologie Biomediche, Università di Udine, P.le Kolbe 4, 33100 Udine, Italy
- INAF – Astronomical Observatory of Padova, Vicolo dell'Osservatorio 5, I-35122 Padova, Italy
- Agostino Dovier
- Dipartimento di Matematica e Informatica, Università di Udine, Via delle Scienze 206, 33100 Udine, Italy
- Luca Bortolussi
- Dipartimento di Matematica e Informatica, Università di Udine, Via delle Scienze 206, 33100 Udine, Italy
- Gilberto Giugliarelli
- Dipartimento di Fisica, Università di Udine, Via delle Scienze 206, 33100 Udine, Italy
- Alessandra Corazza
- Dipartimento di Scienze e Tecnologie Biomediche, Università di Udine, P.le Kolbe 4, 33100 Udine, Italy
- Gennaro Esposito
- Dipartimento di Scienze e Tecnologie Biomediche, Università di Udine, P.le Kolbe 4, 33100 Udine, Italy
- Paolo Viglino
- Dipartimento di Scienze e Tecnologie Biomediche, Università di Udine, P.le Kolbe 4, 33100 Udine, Italy
|
217
|
Protein structure prediction by all-atom free-energy refinement. BMC STRUCTURAL BIOLOGY 2007; 7:12. [PMID: 17371594 PMCID: PMC1832197 DOI: 10.1186/1472-6807-7-12] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 08/23/2006] [Accepted: 03/19/2007] [Indexed: 11/18/2022]
Abstract
Background: The reliable prediction of protein tertiary structure from the amino acid sequence remains challenging even for small proteins. We have developed an all-atom free-energy protein forcefield (PFF01) that we could use to fold several small proteins from completely extended conformations. Because the computational cost of de-novo folding studies rises steeply with system size, this approach is unsuitable for structure prediction purposes. We therefore investigate here a low-cost free-energy relaxation protocol for protein structure prediction that combines heuristic methods for model generation with all-atom free-energy relaxation in PFF01. Results: We use PFF01 to rank and cluster the conformations for 32 proteins generated by ROSETTA. For the 22 high-quality and 10 low-quality decoy sets we select near-native conformations with an average Cα root mean square deviation of 3.03 Å and 6.04 Å, respectively. The protocol incorporates an inherent reliability indicator that succeeds for 78% of the decoy sets. In over 90% of these cases near-native conformations are selected from the decoy set. This success rate is rationalized by the quality of the decoys and the selectivity of the PFF01 forcefield, which ranks near-native conformations an average of 3.06 standard deviations below the relaxed decoys (Z-score). Conclusion: All-atom free-energy relaxation with PFF01 emerges as a powerful low-cost approach toward generic de-novo protein structure prediction. The approach can be applied to large all-atom decoy sets of any origin and requires no preexisting structural information to identify the native conformation. The study provides evidence that a large class of proteins may be foldable by PFF01.
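The Z-score used here as a selectivity measure expresses how far a conformation's energy lies from the decoy-set mean, in units of the decoy standard deviation. A minimal sketch of that calculation follows; the function name and the toy energies are hypothetical, and the choice of the population standard deviation is an assumption, since the abstract does not specify one.

```python
import statistics


def energy_z_score(energy, decoy_energies):
    """Z-score of one energy relative to a decoy ensemble.

    Negative values mean the conformation scores below (better than)
    the decoy average. Uses the population standard deviation, which is
    an assumption; the abstract does not state the convention.
    """
    mean = statistics.mean(decoy_energies)
    spread = statistics.pstdev(decoy_energies)
    return (energy - mean) / spread


# Toy numbers in arbitrary energy units.
z = energy_z_score(-120.0, [-100.0, -95.0, -105.0, -90.0, -110.0])
```

For these toy values the candidate lies roughly 2.8 standard deviations below the decoy mean.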
|
218
|
Cheng J, Pei J, Lai L. A free-rotating and self-avoiding chain model for deriving statistical potentials based on protein structures. Biophys J 2007; 92:3868-77. [PMID: 17351015 PMCID: PMC1868969 DOI: 10.1529/biophysj.106.102152] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
Statistical potentials have been widely used in protein studies despite their much-debated theoretical basis. In this work, we have applied two physical reference states for deriving the statistical potentials based on protein structure features to achieve zero interaction and orthogonalization. The free-rotating chain-based potential applies a local free-rotating chain reference state, which could theoretically be described by the Gaussian distribution. The self-avoiding chain-based potential applies a reference state derived from a database of artificial self-avoiding backbones generated by Monte Carlo simulation. These physical reference states are independent of known protein structures and are based solely on the analytical formulation or simulation method. The new potentials performed better and yielded higher Z-scores and success rates compared to other statistical potentials. The end-to-end distance distribution produced by the self-avoiding chain model was similar to the distance distribution of protein atoms in the structure database. This fact may partly explain the basis of the reference states that depend on the atom pair frequency observed in the protein database. The current study showed that a more physical reference model improved the performance of statistical potentials in protein fold recognition, which could also be extended to other types of applications.
Affiliation(s)
- Ji Cheng
- State Key Laboratory for Structural Chemistry of Stable and Unstable Species, College of Chemistry and Molecular Engineering, and Center for Theoretical Biology, Peking University, Beijing, China
|
219
|
Adamczak R, Meller J. On the transferability of folding and threading potentials and sequence-independent filters for protein folding simulations. Mol Phys 2007. [DOI: 10.1080/00268970410001728636] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]
Affiliation(s)
- Rafal Adamczak
- Division of Biomedical Informatics, Children’s Hospital Research Foundation, 3333 Burnet Avenue, Cincinnati, OH 45229, USA
- Jaroslaw Meller
- Division of Biomedical Informatics, Children’s Hospital Research Foundation, 3333 Burnet Avenue, Cincinnati, OH 45229, USA
- Department of Informatics, Nicholas Copernicus University, 87-100 Toruń, Poland
|
220
|
Summa CM, Levitt M. Near-native structure refinement using in vacuo energy minimization. Proc Natl Acad Sci U S A 2007; 104:3177-82. [PMID: 17360625 PMCID: PMC1802011 DOI: 10.1073/pnas.0611593104] [Citation(s) in RCA: 124] [Impact Index Per Article: 7.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
One of the greatest shortcomings of macromolecular energy minimization and molecular dynamics techniques is that they generally do not preserve the native structure of proteins as observed by x-ray crystallography. This deformation of the native structure means that these methods are not generally used to refine structures produced by homology-modeling techniques. Here, we use a database of 75 proteins to test the ability of a variety of popular molecular mechanics force fields to maintain the native structure. Minimization from the native structure is a weak test of potential energy functions: It is complemented by a much stronger test in which the same methods are compared for their ability to attract a near-native decoy protein structure toward the native structure. We use a powerfully convergent energy-minimization method and show that, of the traditional molecular mechanics potentials tested, only one showed a modest net improvement over a large data set of structurally diverse proteins. A smooth, differentiable knowledge-based pairwise atomic potential performs better on this test than traditional potential functions. This work is expected to have important implications for protein structure refinement, homology modeling, and structure prediction.
Affiliation(s)
- Christopher M. Summa
- Department of Structural Biology, Stanford University School of Medicine, Stanford, CA 94305-5126
- Michael Levitt
- Department of Structural Biology, Stanford University School of Medicine, Stanford, CA 94305-5126
- To whom correspondence should be addressed at: Department of Structural Biology, Stanford University School of Medicine, D109 Fairchild Building, Stanford, CA 94305-5126. E-mail:
|
221
|
Shen MY, Sali A. Statistical potential for assessment and prediction of protein structures. Protein Sci 2007; 15:2507-24. [PMID: 17075131 PMCID: PMC2242414 DOI: 10.1110/ps.062416606] [Citation(s) in RCA: 1761] [Impact Index Per Article: 103.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/24/2022]
Abstract
Protein structures in the Protein Data Bank provide a wealth of data about the interactions that determine the native states of proteins. Using probability theory, we derive an atomic distance-dependent statistical potential from a sample of native structures that does not depend on any adjustable parameters (Discrete Optimized Protein Energy, or DOPE). DOPE is based on an improved reference state that corresponds to noninteracting atoms in a homogeneous sphere with the radius dependent on a sample native structure; it thus accounts for the finite and spherical shape of the native structures. The DOPE potential was extracted from a nonredundant set of 1472 crystallographic structures. We tested DOPE and five other scoring functions by the detection of the native state among six multiple target decoy sets, the correlation between the score and model error, and the identification of the most accurate non-native structure in the decoy set. For all decoy sets, DOPE is the best performing function in terms of all criteria, except for a tie in one criterion for one decoy set. To facilitate its use in various applications, such as model assessment, loop modeling, and fitting into cryo-electron microscopy mass density maps combined with comparative protein structure modeling, DOPE was incorporated into the modeling package MODELLER-8.
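The idea of a reference state built from noninteracting atoms in a homogeneous sphere can be illustrated with the textbook distance distribution for two points drawn uniformly inside a sphere of radius a: p(r) = (3r²/a³)(1 - 3r/(4a) + r³/(16a³)) for 0 ≤ r ≤ 2a. The sketch below evaluates that geometric result only; it is not DOPE's published derivation, and the function name and the radius value are assumptions.

```python
def sphere_pair_density(r, a):
    """Probability density of the distance r between two points drawn
    uniformly and independently inside a sphere of radius a.

    Textbook geometric result, nonzero only for 0 <= r <= 2a; shown as
    an illustration of a finite-size reference state.
    """
    if r < 0.0 or r > 2.0 * a:
        return 0.0
    x = r / a
    return (3.0 * x**2 / a) * (1.0 - 0.75 * x + x**3 / 16.0)


# Midpoint-rule check that the density integrates to ~1 over [0, 2a].
a = 20.0      # hypothetical sphere radius in angstroms
n = 20000     # number of integration steps
h = 2.0 * a / n
total = sum(sphere_pair_density((k + 0.5) * h, a) for k in range(n)) * h
```

Unlike an ideal-gas reference, where pair counts grow as r² without bound, this density falls back to zero at r = 2a, which is how a finite, roughly spherical shape enters the reference state.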
Affiliation(s)
- Min-Yi Shen
- Department of Biopharmaceutical Sciences, Department of Pharmaceutical Chemistry, University of California at San Francisco, San Francisco, California 94158, USA.
|
222
|
Zhao G, Lu H. Development of a Grid-based statistical potential for protein structure prediction. CONFERENCE PROCEEDINGS : ... ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL CONFERENCE 2007; 2005:6064-7. [PMID: 17281645 DOI: 10.1109/iembs.2005.1615875] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/13/2023]
Abstract
A key component in protein structure prediction is the development of potentials that can discriminate native or near-native structures from the wrong ones. Most previously developed statistical potentials are based on the preferred distances between any pair of residues. Here we explore the possible angle dependence between pairs of residues in addition to their distance dependence. For simplicity, we used a grid-based partition of the space and analyzed relative geometric propensity between protein residue pairs. One thousand and nine non-redundant protein structures are studied in this paper. We have attached a local coordinate system to each amino acid, and spatial distributions of its nearby residues are investigated. Within the same distance range, there are clear differences in various grids. We have further developed a grid-based statistical potential, which incorporates both the distance dependence and angle dependence using a quasi-chemical approximation. The potential is tested against 32 decoy sets, and in 25 cases the native structure has the best score. This performance is comparable to, and in two cases better than, the best performance of previously developed distance-dependent statistical potentials at the residue and atom levels.
Affiliation(s)
- Guijun Zhao
- Bioinformatics Program, Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607 USA
|
223
|
Rajgaria R, McAllister SR, Floudas CA. A novel high resolution Calpha--Calpha distance dependent force field based on a high quality decoy set. Proteins 2007; 65:726-41. [PMID: 16981202 DOI: 10.1002/prot.21149] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
This work presents a novel Cα-Cα distance-dependent force field which is successful in selecting native structures from an ensemble of high resolution near-native conformers. An enhanced and diverse protein set, along with an improved decoy generation technique, contributes to the effectiveness of this potential. High quality decoys were generated for 1489 nonhomologous proteins and used to train an optimization based linear programming formulation. The goal in developing a set of high resolution decoys was to develop a simple, distance-dependent force field that yields the native structure as the lowest energy structure and assigns higher energies to decoy structures that are quite similar as well as those that are less similar. The model also includes a set of physical constraints that were based on experimentally observed physical behavior of the amino acids. The force field was tested on two sets of test decoys not in the training set and was found to excel on all the metrics that are widely used to measure the effectiveness of a force field. The high resolution force field was successful in correctly identifying 113 native structures out of 150 test cases and the average rank obtained for this test was 1.87. All the high resolution structures (training and testing) used for this work are available online and can be downloaded from http://titan.princeton.edu/HRDecoys.
Affiliation(s)
- R Rajgaria
- Department of Chemical Engineering, Princeton University, Princeton, New Jersey 08544-5263, USA
|
225
|
Zhu J, Xie L, Honig B. Structural refinement of protein segments containing secondary structure elements: Local sampling, knowledge-based potentials, and clustering. Proteins 2006; 65:463-79. [PMID: 16927337 DOI: 10.1002/prot.21085] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022]
Abstract
In this article, we present an iterative, modular optimization (IMO) protocol for the local structure refinement of protein segments containing secondary structure elements (SSEs). The protocol is based on three modules: a torsion-space local sampling algorithm, a knowledge-based potential, and a conformational clustering algorithm. Alternative methods are tested for each module in the protocol. For each segment, random initial conformations were constructed by perturbing the native dihedral angles of loops (and SSEs) of the segment to be refined while keeping the protein body fixed. Two refinement procedures based on molecular mechanics force fields - using either energy minimization or molecular dynamics - were also tested but were found to be less successful than the IMO protocol. We found that DFIRE is a particularly effective knowledge-based potential and that clustering algorithms that are biased by the DFIRE energies improve the overall results. Results were further improved by adding an energy minimization step to the conformations generated with the IMO procedure, suggesting that hybrid strategies that combine both knowledge-based and physical effective energy functions may prove to be particularly effective in future applications.
Affiliation(s)
- Jiang Zhu
- Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics, Department of Biochemistry and Molecular Biophysics, Columbia University, 1130 St. Nicholas Avenue, Room 815, New York, New York 10032, USA
|
226
|
Conner AC, Simms J, Conner MT, Wootten DL, Wheatley M, Poyner DR. Diverse functional motifs within the three intracellular loops of the CGRP1 receptor. Biochemistry 2006; 45:12976-85. [PMID: 17059214 DOI: 10.1021/bi0615801] [Citation(s) in RCA: 28] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/29/2022]
Abstract
The CGRP1 receptor exists as a heterodimeric complex between a single-pass transmembrane accessory protein (RAMP1) and a family B G-protein-coupled receptor (GPCR) called the calcitonin receptor-like receptor (CLR). This study investigated the structural motifs found in the intracellular loops (ICLs) of this receptor. Molecular modeling was used to predict active and inactive conformations of each ICL. Conserved residues were altered to alanine by site-directed mutagenesis. cAMP accumulation, cell-surface expression, agonist affinity, and CGRP-stimulated receptor internalization were characterized. Within ICL1, L147 and particularly R151 were important for coupling to Gs. R151 may interact directly with the G-protein, accessing it following conformational changes involving ICL2 and ICL3. At the proximal end of ICL3, I290 and L294, probably lying on the same face of an alpha helix, formed a G-protein coupling motif. The largest effects on coupling were observed with I290A; additionally, it reduced CGRP affinity and impaired internalization. I290 may interact with TM6 to stabilize the conformation of ICL3, but it could also interact directly with Gs. R314, at the distal end of ICL3, impaired G-protein coupling and to a lesser extent reduced CGRP affinity; it may stabilize the TM6-ICL3 junction by interacting with the polar headgroups of membrane phospholipids. Y215 and L214 in ICL2 are required for cell-surface expression; they form a microdomain with H216 which has the same function. This study reveals similarities between the activation of CLR and other GPCRs in the role of TM6 and ICL3 but shows that other conserved motifs differ in their function.
Affiliation(s)
- Alex C Conner
- School of Life and Health Sciences, Aston University, Birmingham B4 7ET, UK
|
227
|
Miyazawa S, Jernigan RL. How effective for fold recognition is a potential of mean force that includes relative orientations between contacting residues in proteins? J Chem Phys 2006; 122:024901. [PMID: 15638624 DOI: 10.1063/1.1824012] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open
Abstract
We estimate the statistical distribution of relative orientations between contacting residues from a database of protein structures and evaluate the potential of mean force for relative orientations between contacting residues. Polar angles and Euler angles are used to specify two degrees of directional freedom and three degrees of rotational freedom for the orientation of one residue relative to another in contacting residues, respectively. A local coordinate system affixed to each residue based only on main chain atoms is defined for fold recognition. The number of contacting residue pairs in the database will severely limit the resolution of the statistical distribution of relative orientations, if it is estimated by dividing space into cells and counting samples observed in each cell. To overcome such problems and to evaluate the fully anisotropic distributions of relative orientations as a function of polar and Euler angles, we choose a method in which the observed distribution is represented as a sum of delta functions each of which represents the observed orientation of a contacting residue, and is evaluated as a series expansion of spherical harmonic functions. The sample size limits the frequencies of modes whose expansion coefficients can be reliably estimated. High frequency modes are statistically less reliable than low frequency modes. Each expansion coefficient is separately corrected for the sample size according to suggestions from a Bayesian statistical analysis. As a result, many expansion terms can be utilized to evaluate orientational distributions. Also, unlike other orientational potentials, the uniform distribution is used for a reference distribution in evaluating a potential of mean force for each type of contacting residue pair from its orientational distribution, so that residue-residue orientations can be fully evaluated. It is shown by using decoy sets that the discrimination power of the orientational potential in fold recognition increases by taking account of the Euler angle dependencies and becomes comparable to that of a simple contact potential, and that the total energy potential taken as a simple sum of contact, orientation, and (phi,psi) potentials performs well to identify the native folds.
Affiliation(s)
- Sanzo Miyazawa
- Faculty of Technology, Gunma University, Kiryu, Gunma 376-8515, Japan.
|
228
|
Robertson TA, Varani G. An all-atom, distance-dependent scoring function for the prediction of protein-DNA interactions from structure. Proteins 2006; 66:359-74. [PMID: 17078093 DOI: 10.1002/prot.21162] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
We have developed an all-atom statistical potential function for the prediction of protein-DNA interactions from their structures, and show that this method outperforms similar, lower-resolution statistical potentials in a series of decoy discrimination experiments. The all-atom formalism appears to capture details of atomic interactions that are missed by the lower-resolution methods, with the majority of the discriminatory power arising from its description of short-range atomic contacts. We show that, on average, the method is able to identify 90% of near-native docking decoys within the best-scoring 10% of structures in a given decoy set, and it compares favorably with an optimized physical potential function in a test of structure-based identification of DNA-binding sequences. These results demonstrate that all-atom statistical functions specific to protein-DNA interactions can achieve great discriminatory power despite the limited size of the structural database. They also suggest that the statistical scores may soon be able to achieve accuracy on par with more complex, physical potential functions.
Affiliation(s)
- Timothy A Robertson
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
|
229
|
Trovato A, Chiti F, Maritan A, Seno F. Insight into the structure of amyloid fibrils from the analysis of globular proteins. PLoS Comput Biol 2006; 2:e170. [PMID: 17173479 PMCID: PMC1698942 DOI: 10.1371/journal.pcbi.0020170] [Citation(s) in RCA: 172] [Impact Index Per Article: 9.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/27/2006] [Accepted: 10/30/2006] [Indexed: 11/19/2022] Open
Abstract
The conversion from soluble states into cross-β fibrillar aggregates is a property shared by many different proteins and peptides and was hence conjectured to be a generic feature of polypeptide chains. Increasing evidence is now accumulating that such fibrillar assemblies are generally characterized by a parallel in-register alignment of β-strands contributed by distinct protein molecules. Here we assume a universal mechanism is responsible for β-structure formation and deduce sequence-specific interaction energies between pairs of protein fragments from a statistical analysis of the native folds of globular proteins. The derived fragment–fragment interaction was implemented within a novel algorithm, prediction of amyloid structure aggregation (PASTA), to investigate the role of sequence heterogeneity in driving specific aggregation into ordered self-propagating cross-β structures. The algorithm predicts that the parallel in-register arrangement of sequence portions that participate in the fibril cross-β core is favoured in most cases. However, the antiparallel arrangement is correctly discriminated when present in fibrils formed by short peptides. The predictions of the most aggregation-prone portions of initially unfolded polypeptide chains are also in excellent agreement with available experimental observations. These results corroborate the recent hypothesis that the amyloid structure is stabilised by the same physicochemical determinants as those operating in folded proteins. They also suggest that side chain–side chain interaction across neighbouring β-strands is a key determinant of amyloid fibril formation and of their self-propagating ability.
In many fatal neurodegenerative diseases, including Alzheimer, Parkinson, and spongiform encephalopathies, proteins aggregate into specific fibrous structures to form insoluble plaques known as amyloid. The amyloid structure may also play a nonaberrant role in different organisms. Many globular proteins, folding to their biologically functional native structures in vivo, can be induced to aggregate into amyloid-like fibrils under suitable conditions in vitro. One hallmark of amyloid structure is a specific supramolecular architecture called cross-β structure, held together by hydrogen bonds extending repeatedly along the fibril axis, but intermolecular interactions are yet unknown at the amino-acid level except for very few cases. In this study, the authors present an algorithm, called prediction of amyloid structure aggregation (PASTA), to computationally predict which portions of a given protein or peptide sequence forming amyloid fibrils are stabilizing the corresponding cross-β structure and the specific intermolecular pattern of hydrogen-bonded amino acids. PASTA is based on the assumption that the same amino acid–specific interactions stabilizing hydrogen bond patterns in native structures of globular proteins are also employed by nature in amyloid structure. The successful comparison of the authors' prediction with available experimental data supports the existence of a unique framework to describe protein folding and aggregation.
Affiliation(s)
- Antonio Trovato
- Consorzio Nazionale Interuniversitario per le Scienze Fisiche della Materia, Unità di Padova, Padua, Italy.
|
230
|
Sommer I, Toppo S, Sander O, Lengauer T, Tosatto SCE. Improving the quality of protein structure models by selecting from alignment alternatives. BMC Bioinformatics 2006; 7:364. [PMID: 16872519 PMCID: PMC1579234 DOI: 10.1186/1471-2105-7-364] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 01/25/2006] [Accepted: 07/27/2006] [Indexed: 11/12/2022] Open
Abstract
Background In the area of protein structure prediction, recently a lot of effort has gone into the development of Model Quality Assessment Programs (MQAPs). MQAPs distinguish high quality protein structure models from inferior models. Here, we propose a new method to use an MQAP to improve the quality of models. With a given target sequence and template structure, we construct a number of different alignments and corresponding models for the sequence. The quality of these models is scored with an MQAP and used to choose the most promising model. An SVM-based selection scheme is suggested for combining MQAP partial potentials, in order to optimize for improved model selection. Results The approach has been tested on a representative set of proteins. The ability of the method to improve models was validated by comparing the MQAP-selected structures to the native structures with the model quality evaluation program TM-score. Using the SVM-based model selection, a significant increase in model quality is obtained (as shown with a Wilcoxon signed rank test yielding p-values below 10⁻¹⁵). The average increase in TM-score is 0.016, the maximum observed increase in TM-score is 0.29. Conclusion In template-based protein structure prediction alignment is known to be a bottleneck limiting the overall model quality. Here we show that a combination of systematic alignment variation and modern model scoring functions can significantly improve the quality of alignment-based models.
Affiliation(s)
- Ingolf Sommer
- Department of Computational Biology and Applied Algorithmics, Max-Planck-Institute for Informatics, Stuhlsatzenhausweg 85, D-66123 Saarbrücken, Germany
- Stefano Toppo
- Department of Biological Chemistry, University of Padova, via U. Bassi 58/b, I-35121 Padova, Italy
- Oliver Sander
- Department of Computational Biology and Applied Algorithmics, Max-Planck-Institute for Informatics, Stuhlsatzenhausweg 85, D-66123 Saarbrücken, Germany
- Thomas Lengauer
- Department of Computational Biology and Applied Algorithmics, Max-Planck-Institute for Informatics, Stuhlsatzenhausweg 85, D-66123 Saarbrücken, Germany
- Silvio CE Tosatto
- Department of Biology and CRIBI Biotechnology Centre, University of Padova, V.le G. Colombo 3, I-35131 Padova, Italy
|
231
|
Simms J, Hay DL, Wheatley M, Poyner DR. Characterization of the structure of RAMP1 by mutagenesis and molecular modeling. Biophys J 2006; 91:662-9. [PMID: 16632510 PMCID: PMC1483116 DOI: 10.1529/biophysj.106.084582] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Received: 03/07/2006] [Accepted: 03/30/2006] [Indexed: 11/18/2022] Open
Abstract
Receptor activity modifying proteins (RAMPs) are a family of single-pass transmembrane proteins that dimerize with G-protein-coupled receptors. They may alter the ligand recognition properties of the receptors (particularly for the calcitonin receptor-like receptor, CLR). Very little structural information is available about RAMPs. Here, an ab initio model has been generated for the extracellular domain of RAMP1. The disulfide bond arrangement (Cys27-Cys82, Cys40-Cys72, and Cys57-Cys104) was determined by site-directed mutagenesis. The secondary structure (alpha-helices from residues 29-51, 60-80, and 87-100) was established from a consensus of predictive routines. Using these constraints, an assemblage of 25,000 structures was constructed and these were ranked using an all-atom statistical potential. The best 1000 conformations were energy minimized. The lowest scoring model was refined by molecular dynamics simulation. To validate our strategy, the same methods were applied to three proteins of known structure: PDB:1HP8, PDB:1V54 chain H (residues 21-85), and PDB:1T0P. When compared to the crystal structures, the models had root mean-square deviations of 3.8 Å, 4.1 Å, and 4.0 Å, respectively. The model of RAMP1 suggested that Phe93, Tyr100, and Phe101 form a binding interface for CLR, whereas Trp74 and Phe92 may interact with ligands that bind to the CLR/RAMP1 heterodimer.
Affiliation(s)
- John Simms
- School of Life and Health Sciences, Aston University, Birmingham, United Kingdom
|
232
|
Dong Q, Wang X, Lin L. Novel knowledge-based mean force potential at the profile level. BMC Bioinformatics 2006; 7:324. [PMID: 16803615 PMCID: PMC1534065 DOI: 10.1186/1471-2105-7-324] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Received: 03/19/2006] [Accepted: 06/27/2006] [Indexed: 11/10/2022] Open
Abstract
Background The development and testing of functions for the modeling of protein energetics is an important part of current research aimed at understanding protein structure and function. Knowledge-based mean force potentials are derived from statistical analyses of interacting groups in experimentally determined protein structures. Current knowledge-based mean force potentials are developed at the atom or amino acid level. The evolutionary information contained in the profiles is not investigated. Based on these observations, a class of novel knowledge-based mean force potentials at the profile level has been presented, which uses the evolutionary information of profiles for developing more powerful statistical potentials. Results The frequency profiles are directly calculated from the multiple sequence alignments output by PSI-BLAST and converted into binary profiles with a probability threshold. As a result, the protein sequences are represented as sequences of binary profiles rather than sequences of amino acids. Similar to the knowledge-based potentials at the residue level, a class of novel potentials at the profile level is introduced. We develop four types of profile-level statistical potentials including distance-dependent, contact, Φ/Ψ dihedral angle and accessible surface statistical potentials. These potentials are first evaluated by the fold assessment between the correct and incorrect models generated by comparative modeling from our own and other groups. They are then used to recognize the native structures from well-constructed decoy sets. Experimental results show that all the knowledge-based mean force potentials at the profile level outperform those at the residue level. Significant improvements are obtained for the distance-dependent and accessible surface potentials (5–6%). The contact and Φ/Ψ dihedral angle potentials show only a slight improvement (1–2%). Decoy set evaluation results show that the distance-dependent profile-level potentials even outperform other atom-level potentials. We also demonstrate that profile-level statistical potentials can improve the performance of threading. Conclusion The knowledge-based mean force potentials at the profile level can provide better discriminatory ability than those at the residue level, so they will be useful for protein structure prediction and model refinement.
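The conversion of a frequency profile into a binary profile described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the strict "greater than" comparison, and the default threshold value are all assumptions chosen for the example.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, alphabetical by one-letter code

def to_binary_profile(freq_column, threshold=0.15):
    """Map one alignment column (residue -> frequency) to a tuple of 20 flags.

    A flag is 1 when the residue frequency exceeds the threshold.
    The threshold 0.15 is a placeholder, not a published value.
    """
    return tuple(1 if freq_column.get(aa, 0.0) > threshold else 0 for aa in AMINO_ACIDS)

def binary_profile_sequence(freq_profile, threshold=0.15):
    """Convert a whole frequency profile (one dict per position) to binary profiles."""
    return [to_binary_profile(column, threshold) for column in freq_profile]

def profile_label(binary_column):
    """Readable label listing the residues whose flag is set."""
    return "".join(aa for aa, bit in zip(AMINO_ACIDS, binary_column) if bit)
```

With a column such as `{"A": 0.5, "G": 0.3, "S": 0.2}` and a threshold of 0.25, the position would be represented by the binary profile labelled `AG` instead of by a single amino acid.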
Affiliation(s)
- Qiwen Dong
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, PR China
- Xiaolong Wang
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, PR China
- Lei Lin
- School of Computer Science and Technology, Harbin Institute of Technology, Harbin, PR China
|
233
|
Rastogi S, Reuter N, Liberles DA. Evaluation of models for the evolution of protein sequences and functions under structural constraint. Biophys Chem 2006; 124:134-44. [PMID: 16837122 DOI: 10.1016/j.bpc.2006.06.008] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Received: 05/18/2006] [Revised: 06/13/2006] [Accepted: 06/14/2006] [Indexed: 12/01/2022]
Abstract
In the field of evolutionary structural genomics, methods are needed to evaluate why genomes evolved to contain the fold distributions that are observed. In order to study the effects of population dynamics in the evolved genomes we need fast and accurate evolutionary models which can analyze the effects of selection, drift and fixation of a protein sequence in a population that are grounded by physical parameters governing the folding and binding properties of the sequence. In this study, various knowledge-based, force field, and statistical methods for protein folding have been evaluated with four different folds: SH2 domains, SH3 domains, Globin-like, and Flavodoxin-like, to evaluate the speed and accuracy of the energy functions. Similarly, knowledge-based and force field methods have been used to predict ligand binding specificity in SH2 domain. To demonstrate the applicability of these methods, the dynamics of evolution of new binding capabilities by an SH2 domain is demonstrated.
Affiliation(s)
- Shruti Rastogi
- Department of Molecular Biology, University of Wyoming, Laramie, WY 82071, USA
|
234
|
Abstract
A modeling method is described that avoids the need to consider the domain structure of the template used for modeling, and automatically extracts compact fragments of structure that would be of a suitable size to build the model. This aids automation as the size or nature of the template structure can be ignored and does not have to be broken into domain (or multi-domain) units beforehand. The approach leads to the generation of a large number of models each based on slightly differing domain definitions and this variation was further increased by considering alternative secondary structure predictions. Each model, of which there may be thousands, takes the form of a complete alpha-carbon trace and some methods (including residue burial) were investigated for their power to discriminate good models from bad models using decoys. The method is also compared to an earlier retroviral capsid modeling problem for which the X-ray structure is now known. Some potential extensions of the approach to more distant modeling problems are discussed.
Affiliation(s)
- William R Taylor
- Division of Mathematical Biology, National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, United Kingdom.
|
235
|
Qiu J, Elber R. Atomically detailed potentials to recognize native and approximate protein structures. Proteins 2006; 61:44-55. [PMID: 16080157 DOI: 10.1002/prot.20585] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.2] [Indexed: 11/11/2022]
Abstract
Atomically detailed potentials for recognition of protein folds are presented. The potentials consist of pair interactions between atoms. One or three distance steps are used to describe the range of interactions between a pair. Training is carried out with the mathematical programming approach on the decoy sets of Baker, Levitt, and some of our own design. Recognition is required not only for decoy-native structural pairs but also for pairs of decoy and homologous structures. Performance is tested on the targets of CASP5 using templates from the Protein Data Bank, on two test ab initio decoy sets from Skolnick's laboratory, and on decoy sets from Moult's laboratory. We conclude that the newly derived potentials have significant recognition capacity, comparable to the best models derived from other techniques. The new potentials require a significantly smaller number of parameters. The enhanced recognition capacity extends primarily to the identification of structures generated by ab initio simulation and less to the recognition of approximate shapes created by homology.
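As a rough illustration of a pair potential built from a small number of distance steps, the sketch below scores a structure by summing a piecewise-constant energy over all atom pairs. The atom types, step boundaries, and energy values are invented for the example and are not the parameters learned in the paper; the training by mathematical programming is not shown.

```python
import bisect
import math

# Hypothetical step potentials, keyed by a sorted pair of atom types.
# Each entry holds ascending upper distance bounds (angstroms) and one energy per step.
STEP_POTENTIALS = {
    ("C", "C"): ([4.0, 6.0, 8.0], [-0.5, -0.2, -0.1]),  # three distance steps
    ("C", "N"): ([5.0], [0.3]),                         # a single distance step
}

def step_energy(distance, edges, values):
    """Return the energy of the step containing `distance`; zero beyond the last bound."""
    idx = bisect.bisect_left(edges, distance)
    return values[idx] if idx < len(values) else 0.0

def pair_energy(atoms, potentials=STEP_POTENTIALS):
    """Sum step energies over all atom pairs; `atoms` is a list of (type, (x, y, z))."""
    total = 0.0
    for i in range(len(atoms)):
        for j in range(i + 1, len(atoms)):
            key = tuple(sorted((atoms[i][0], atoms[j][0])))
            if key not in potentials:
                continue  # pair types without parameters contribute nothing
            edges, values = potentials[key]
            total += step_energy(math.dist(atoms[i][1], atoms[j][1]), edges, values)
    return total
```

A native structure would be "recognized" under such a scheme when its summed energy is lower than that of every decoy in the set.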
Affiliation(s)
- Jian Qiu
- Department of Computer Science, Cornell University, Ithaca, New York 14853, USA
|
236
|
Ngan SC, Inouye MT, Samudrala R. A knowledge-based scoring function based on residue triplets for protein structure prediction. Protein Eng Des Sel 2006; 19:187-93. [PMID: 16533801 PMCID: PMC5441915 DOI: 10.1093/protein/gzj018] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 08/23/2005] [Revised: 12/30/2005] [Accepted: 01/09/2006] [Indexed: 11/29/2022] Open
Abstract
One of the general paradigms for ab initio protein structure prediction involves sampling the conformational space such that a large set of decoy (candidate) structures are generated and then selecting native-like conformations from those decoys using various scoring functions. In this study, based on a physical/geometric approach first suggested by Banavar and colleagues, we formulate a knowledge-based scoring function, which uses the radii of curvature formed among triplets of residues in a protein conformation. By analyzing its performance on various decoy sets, we determine a good set of parameters--the distance cutoff and the number of distance bins--to use for configuring such a function. Furthermore, we investigate the effect of using various approaches for compiling the prior distribution on the performance of the knowledge-based function. Possible extensions to the current form of the residue triplet scoring function are discussed.
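The geometric quantity at the centre of this scoring function, the radius of curvature defined by three residues, can be read as the radius of the circle passing through three points. The sketch below computes that circumradius; representing each residue by a single point (for example its Cα position) is an assumption made for illustration, and the binning and prior distribution discussed in the abstract are not shown.

```python
import math

def _sub(p, q):
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])

def _norm(v):
    return math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def triplet_radius(a, b, c):
    """Radius of the circle through three 3D points (infinite if they are collinear)."""
    ab, ac, bc = _sub(b, a), _sub(c, a), _sub(c, b)
    twice_area = _norm(_cross(ab, ac))  # |ab x ac| equals twice the triangle area
    if twice_area == 0.0:
        return math.inf
    # R = (product of side lengths) / (4 * area)
    return _norm(ab) * _norm(ac) * _norm(bc) / (2.0 * twice_area)
```

For a 3-4-5 right triangle the function returns 2.5, half the hypotenuse, which is a convenient sanity check.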
Affiliation(s)
- Shing-Chung Ngan
- Computational Genomics Group, Department of Microbiology, University of Washington School of Medicine, Seattle, WA 98195, USA
- Michael T. Inouye
- Computational Genomics Group, Department of Microbiology, University of Washington School of Medicine, Seattle, WA 98195, USA
- Ram Samudrala
- Computational Genomics Group, Department of Microbiology, University of Washington School of Medicine, Seattle, WA 98195, USA
|
237
|
Fang Q, Shortle D. Protein refolding in silico with atom-based statistical potentials and conformational search using a simple genetic algorithm. J Mol Biol 2006; 359:1456-67. [PMID: 16678202 DOI: 10.1016/j.jmb.2006.04.033] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 02/08/2006] [Revised: 04/10/2006] [Accepted: 04/12/2006] [Indexed: 11/23/2022]
Abstract
A distance-dependent atom-pair potential that treats long range and local interactions separately has been developed and optimized to distinguish native protein structures from sets of incorrect or decoy structures. Atoms are divided into 30 types based on chemical properties and relative position in the amino acid side-chains. Several parameters affecting the calculation and evaluation of this statistical potential, such as the reference state, the bin width, cutoff distances between pairs, and the number of residues separating the atom pairs, are adjusted to achieve the best discrimination. The native structure has the lowest energy for 39 of the 40 sets of original ROSETTA decoys (1000 structures per set) and 23 of the 25 improved decoys (approximately 1900 structures per set). Combined with the orientation-dependent backbone hydrogen bonding potential used by ROSETTA and a statistical solvation potential based on the solvent exclusion model of Lazaridis & Karplus, this potential is used as a scoring function for conformational search based on a genetic algorithm method. After unfolding the native structure by changing every phi and psi angle by either ±3, ±5 or ±7 degrees, five small proteins can be efficiently refolded, in some cases to within 0.5 Å Cα distance matrix error (DME) to the native state. Although no significant correlation is found between the total energy and structural similarity to the native state, a surprisingly strong correlation exists between the radius of gyration and the DME for low energy structures.
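The two structural measures compared at the end of this abstract, radius of gyration and distance matrix error, can be sketched as below. The formulas used are the common textbook definitions (unweighted radius of gyration, and the root mean square difference between corresponding inter-point distances); the abstract does not state the exact conventions, so treat them as assumptions.

```python
import math
from itertools import combinations

def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) points."""
    n = len(coords)
    centroid = tuple(sum(c[k] for c in coords) / n for k in range(3))
    return math.sqrt(sum(math.dist(c, centroid) ** 2 for c in coords) / n)

def distance_matrix_error(coords_a, coords_b):
    """Root mean square difference between the pairwise distances of two conformations."""
    if len(coords_a) != len(coords_b) or len(coords_a) < 2:
        raise ValueError("both conformations need the same number of points (at least 2)")
    squared = [
        (math.dist(coords_a[i], coords_a[j]) - math.dist(coords_b[i], coords_b[j])) ** 2
        for i, j in combinations(range(len(coords_a)), 2)
    ]
    return math.sqrt(sum(squared) / len(squared))
```

Because the DME compares internal distances only, it needs no prior superposition of the two structures, which makes it convenient inside a conformational search loop.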
Affiliation(s)
- Qiaojun Fang
- Department of Biological Chemistry, The Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
|
238
|
Liu T, Jenwitheesuk E, Teller DC, Samudrala R. Structural insights into the cellular retinaldehyde-binding protein (CRALBP). Proteins 2006; 61:412-22. [PMID: 16121400 DOI: 10.1002/prot.20621] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Indexed: 11/11/2022]
Abstract
Cellular retinaldehyde-binding protein (CRALBP) is an essential protein in the human visual cycle without a known three-dimensional structure. Previous studies associate retinal pathologies to specific mutations in the CRALBP protein. Here we use homology modeling and molecular dynamics methods to investigate the structural mechanisms by which CRALBP functions in the visual cycle. We have constructed two conformations of CRALBP representing two states in the process of ligand association and dissociation. Notably, our homology models map the pathology-associated mutations either directly in or adjacent to the putative ligand-binding cavity. Furthermore, six novel residues have been identified to be crucial for the hinge movement of the lipid-exchange loop in CRALBP. We conclude that the binding and release of retinoid involve large conformational changes in the lipid-exchange loop at the entrance of the ligand-binding cavity.
Affiliation(s)
- Tianyun Liu
- Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
|
239
|
Solis AD, Rackovsky S. Improvement of statistical potentials and threading score functions using information maximization. Proteins 2006; 62:892-908. [PMID: 16395676 DOI: 10.1002/prot.20501] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 11/10/2022]
Abstract
We show that statistical potentials and threading score functions, derived from finite data sets, are informatic functions, and that their performance depends on the manner in which data are classified and compressed. The choice of sequence and structural parameters affects estimates of the conditional probabilities P(C|S), the quantification of the effect of sequence S on conformation C, and determines the amount of information extracted from the data set, as measured by information gain. The mathematical link between information gain and mean conformational energy, established in this work using the local backbone potential as model, demonstrates that manipulation of descriptive parameters also alters the "energy" values assigned to native conformation and to decoy structures in the test pool, and consequently, the performance of such statistical potential functions in fold recognition exercises. We show that sequence and structural partitions that maximize information gain also minimize the mean energy of the ensemble of native conformations. Moreover, we establish an informatic basis for the placement of the native score within an energy spectrum given by the decoy pool in a threading exercise. We discover that, among all informatic quantities, information gain is the best predictor of threading success, even better than the standard Z-score. Consequently, the choices of sequence and structural descriptors, extent of compression, and levels of discretization that maximize information gain must also produce the best potential functions. Strategies to optimize these parameters with respect to information extraction are therefore relevant to building better statistical potentials. Last, we demonstrate that the backbone torsion potential, defined by the trimer sequence, can be an effective tool in greatly reducing the set of possible conformations from a vast decoy pool.
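To make the notion of information extracted from paired sequence and conformation data more concrete, the sketch below computes the mutual information between a sequence class S and a conformation class C from observed counts. Mutual information is used here as a stand-in; the abstract does not give the paper's formula for information gain, so the exact definition, the function name, and the class labels are assumptions.

```python
import math
from collections import Counter

def information_gain(pairs):
    """Mutual information I(S; C) in bits from (sequence_class, conformation_class) observations."""
    joint = Counter(pairs)
    total = sum(joint.values())
    seq_counts, conf_counts = Counter(), Counter()
    for (s, c), n in joint.items():
        seq_counts[s] += n
        conf_counts[c] += n
    gain = 0.0
    for (s, c), n in joint.items():
        # p(s, c) / (p(s) * p(c)) simplifies to n * total / (count_s * count_c)
        gain += (n / total) * math.log2(n * total / (seq_counts[s] * conf_counts[c]))
    return gain
```

A partition of sequences and conformations that makes C perfectly predictable from S with two equally likely classes yields 1 bit, while a partition in which S says nothing about C yields 0 bits.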
Affiliation(s)
- Armando D Solis
- Department of Pharmacology and Biological Chemistry, Mount Sinai School of Medicine, Box 1215, New York, New York 10029, USA
|
240
|
Qiu J, Elber R. SSALN: an alignment algorithm using structure-dependent substitution matrices and gap penalties learned from structurally aligned protein pairs. Proteins 2006; 62:881-91. [PMID: 16385554 DOI: 10.1002/prot.20854] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.8] [Indexed: 11/07/2022]
Abstract
In template-based modeling of protein structures, the generation of the alignment between the target and the template is a critical step that significantly affects the accuracy of the final model. This paper proposes an alignment algorithm SSALN that learns substitution matrices and position-specific gap penalties from a database of structurally aligned protein pairs. In addition to the amino acid sequence information, secondary structure and solvent accessibility information of a position are used to derive substitution scores and position-specific gap penalties. In a test set of CASP5 targets, SSALN outperforms sequence alignment methods such as a Smith-Waterman algorithm with BLOSUM50 and PSI_BLAST. SSALN also generates better alignments than PSI_BLAST in the CASP6 test set. LOOPP server prediction based on an SSALN alignment is ranked the best for target T0280_1 in CASP6. SSALN is also compared with several threading methods and sequence alignment methods on the ProSup benchmark. SSALN has the highest alignment accuracy among the methods compared. On Fischer's benchmark, SSALN performs better than CLUSTALW and GenTHREADER, and generates more alignments with accuracy >50%, >60% or >70% than FUGUE, but fewer alignments with accuracy >80% than FUGUE. All the supplemental materials can be found at http://www.cs.cornell.edu/~jianq/research.htm.
Affiliation(s)
- Jian Qiu
- Department of Computer Science, Cornell University, Ithaca, New York 14853, USA
|
241
|
Abstract
Scoring functions are widely used in the final step of model selection in protein structure prediction. This is of interest both for comparative modeling targets, where it is important to select the best model among a set of many good, "correct" ones, as well as for other (fold recognition or novel fold) targets, where the set may contain many incorrect models. A novel combination of four knowledge-based potentials recognizing different features of native protein structures is introduced and tested. The pairwise, solvation, hydrogen bond, and torsion angle potentials contain largely orthogonal information. Of these, the torsion angle potential is found to show the strongest correlation with model quality. Combining these features with a linear weighting function, it was possible to construct a robust energy function capable of discriminating native-like structures on several benchmarking sets. In a recent blind test (CAFASP-4 MQAP), the scoring function ranked consistently well and was able to reliably distinguish the correct template from an ensemble of high quality decoys in 52 of 70 cases (33 of 34 for comparative modeling). An executable version of the Victor/FRST function for Linux PCs is available for download from the URL http://protein.cribi.unipd.it/frst/.
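The idea of combining four potentials with a linear weighting function can be sketched as follows. The weights and term names below are placeholders invented for the example, not the published FRST weights, and the convention that lower energy means a better model is an assumption.

```python
# Hypothetical weights, for illustration only.
WEIGHTS = {"pairwise": 1.0, "solvation": 0.5, "hydrogen_bond": 0.8, "torsion": 1.5}

def combined_energy(terms, weights=WEIGHTS):
    """Weighted sum of the individual potential terms for one model."""
    return sum(weight * terms[name] for name, weight in weights.items())

def rank_models(models, weights=WEIGHTS):
    """Return model names sorted from lowest (best) to highest combined energy."""
    return sorted(models, key=lambda name: combined_energy(models[name], weights))
```

Given a dictionary mapping model names to their four term values, `rank_models` returns the candidates in the order a selection step would consider them.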
|
242
|
Abstract
We propose a novel and flexible derivation scheme of statistical, database-derived, potentials, which allows one to take simultaneously into account specific correlations between several sequence and structure descriptors. This scheme leads to the decomposition of the total folding free energy of a protein into a sum of lower order terms, thereby giving the possibility to analyze independently each contribution and clarify its significance and importance, to avoid overcounting certain contributions, and to deal more efficiently with the limited size of the database. In addition, this derivation scheme appears as quite general, for many previously developed potentials can be expressed as particular cases of our formalism. We use this formalism as a framework to generate different residue-based energy functions, whose performances are assessed on the basis of their ability to discriminate genuine proteins from decoy models. The optimal potential is generated as a combination of several coupling terms, measuring correlations between residue types, backbone torsion angles, solvent accessibilities, relative positions along the sequence, and interresidue distances. This potential outperforms all tested residue-based potentials, and even several atom-based potentials. Its incorporation in algorithms aiming at predicting protein structure and stability should therefore substantially improve their performances.
Affiliation(s)
- Y Dehouck
- Unité de Bioinformatique génomique et structurale, Université Libre de Bruxelles, 1050 Brussels, Belgium.
|
243
|
Szarecka A, Meirovitch H. Optimization of the GB/SA solvation model for predicting the structure of surface loops in proteins. J Phys Chem B 2006; 110:2869-80. [PMID: 16471897 PMCID: PMC1945207 DOI: 10.1021/jp055771+] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022]
Abstract
Implicit solvation models are commonly optimized with respect to experimental data or Poisson-Boltzmann (PB) results obtained for small molecules, where the force field is sometimes not considered. In previous studies, we have developed an optimization procedure for cyclic peptides and surface loops in proteins based on the entire system studied and the specific force field used. Thus, the loop has been modeled by the simplified solvation function $E_{\mathrm{tot}} = E_{\mathrm{FF}}(\epsilon = 2r) + \sum_i \sigma_i A_i$, where $E_{\mathrm{FF}}(\epsilon = nr)$ is the AMBER force field energy with a distance-dependent dielectric function, $\epsilon = nr$, $A_i$ is the solvent accessible surface area of atom $i$, and $\sigma_i$ is its atomic solvation parameter. During the optimization process, the loop is free to move while the protein template is held fixed in its X-ray structure. To improve on the results of this model, in the present work we apply our optimization procedure to the physically more rigorous solvation model, the generalized Born with surface area (GB/SA) (together with the all-atom AMBER force field) as suggested by Still and co-workers (J. Phys. Chem. A 1997, 101, 3005). The six parameters of the GB/SA model, namely, $P_1$-$P_5$ and the surface area parameter, $\sigma$ (programmed in the TINKER package) are reoptimized for a "training" group of nine loops, and a best-fit set is defined from the individual sets of optimized parameters. The best-fit set and Still's original set of parameters (where Lys, Arg, His, Glu, and Asp are charged or neutralized) were applied to the training group as well as to a "test" group of seven loops, and the energy gaps and the corresponding RMSD values were calculated. These GB/SA results based on the three sets of parameters have been found to be comparable; surprisingly, however, they are somewhat inferior (e.g., of larger energy gaps) to those obtained previously from the simplified model described above. We discuss recent results for loops obtained by other solvation models and potential directions for future studies.
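The simplified solvation function quoted in this abstract, a force-field energy plus a sum over atoms of a solvation parameter times the solvent accessible surface area, can be evaluated as in the sketch below. The force-field energy is taken as a precomputed number, and all numerical values in the usage are invented; computing the AMBER energy or the surface areas themselves is outside the scope of the example.

```python
def total_energy(e_ff, sigmas, areas):
    """Force-field energy plus the surface-area solvation term, sum of sigma_i * A_i.

    e_ff   : precomputed force-field energy (assumed to use the distance-dependent dielectric)
    sigmas : atomic solvation parameters, one per atom
    areas  : solvent accessible surface areas, one per atom
    """
    if len(sigmas) != len(areas):
        raise ValueError("sigmas and areas must have one entry per atom")
    return e_ff + sum(sigma * area for sigma, area in zip(sigmas, areas))
```

For instance, `total_energy(-10.0, [0.5, -1.0], [2.0, 3.0])` adds a solvation contribution of 1.0 - 3.0 = -2.0 to the force-field energy, giving -12.0 in whatever energy unit the inputs use.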
Affiliation(s)
- Agnieszka Szarecka
- Department of Computational Biology, University of Pittsburgh School of Medicine, Suite 3064, BST 3, 3501 Fifth Avenue, Pittsburgh, PA 15213
- Hagai Meirovitch
- Department of Computational Biology, University of Pittsburgh School of Medicine, Suite 3064, BST 3, 3501 Fifth Avenue, Pittsburgh, PA 15213
|
244
|
Floudas C, Fung H, McAllister S, Mönnigmann M, Rajgaria R. Advances in protein structure prediction and de novo protein design: A review. Chem Eng Sci 2006. [DOI: 10.1016/j.ces.2005.04.009] [Citation(s) in RCA: 175] [Impact Index Per Article: 9.7] [Indexed: 01/01/2023]
|
245
|
Fogolari F, Tosatto SCE, Colombo G. A decoy set for the thermostable subdomain from chicken villin headpiece, comparison of different free energy estimators. BMC Bioinformatics 2005; 6:301. [PMID: 16354298 PMCID: PMC1351271 DOI: 10.1186/1471-2105-6-301] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 06/16/2005] [Accepted: 12/14/2005] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Estimators of free energies are routinely used to judge the quality of protein structural models. As these estimators still present inaccuracies, they are frequently evaluated by discriminating native or native-like conformations from large ensembles of so-called decoy structures. RESULTS A decoy set is obtained from snapshots taken from 5 long (100 ns) molecular dynamics (MD) simulations of the thermostable subdomain from chicken villin headpiece. An evaluation of the energy of the decoys is given using: i) a residue based contact potential supplemented by a term for the quality of dihedral angles; ii) a recently introduced combination of four statistical scoring functions for model quality estimation (FRST); iii) molecular mechanics with solvation energy estimated either according to the generalized Born surface area (GBSA) or iv) the Poisson-Boltzmann surface area (PBSA) method. CONCLUSION The decoy set presented here has the following features which make it attractive for testing energy scoring functions: 1) it covers a broad range of RMSD values (from less than 2.0 Å to more than 12 Å); 2) it has been obtained from molecular dynamics trajectories, starting from different non-native-like conformations which have diverse behaviour, with secondary structure elements correctly or incorrectly formed, and in one case folding to a native-like structure. This allows not only for scoring of static structures, but also for studying, using free energy estimators, the kinetics of folding; 3) all structures have been obtained from accurate MD simulations in explicit solvent and after molecular mechanics (MM) energy minimization using an implicit solvent method. The quality of the covalent structure therefore does not suffer from steric or covalent problems. The statistical and physical effective energy functions tested on the set behave differently when native simulation snapshots are included or not in the set and when averaging over the trajectory is performed.
Affiliation(s)
- Federico Fogolari
- Dipartimento di Scienze e Tecnologie Biomediche, Università di Udine, P.le Kolbe 4, 33100 Udine, Italy
| | - Silvio CE Tosatto
- Dipartimento di Biologia and CRIBI Biotech Centre, Università di Padova, Viale G. Colombo 3, 35131 Padova, Italy
- Giorgio Colombo
- Istituto di Chimica del Riconoscimento Molecolare, CNR, Via Mario Bianco 9, 20131 Milano, Italy
|
246
|
Abstract
In recent years, there has been significant progress in the ability to predict the three-dimensional structure of proteins from their amino acid sequence. Progress has been due to new methods to extract the growing amount of information in sequence and structure databases and improved computational descriptions of protein energetics. This review summarizes recent advances in these areas and describes a number of novel biological applications made possible by structure prediction. Despite remaining challenges, protein structure prediction is becoming an extremely useful tool in understanding phenomena in modern molecular and cell biology.
Affiliation(s)
- Donald Petrey
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, Columbia University, New York, New York 10032, USA
|
247
|
Conner AC, Simms J, Howitt SG, Wheatley M, Poyner DR. The second intracellular loop of the calcitonin gene-related peptide receptor provides molecular determinants for signal transduction and cell surface expression. J Biol Chem 2005; 281:1644-51. [PMID: 16293613 DOI: 10.1074/jbc.m510064200] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open
Abstract
The calcitonin gene-related peptide (CGRP) receptor is a heterodimer of a family B G-protein-coupled receptor, calcitonin receptor-like receptor (CLR), and the accessory protein receptor activity modifying protein 1. It couples to G(s), but it is not known which intracellular loops mediate this. We have identified the boundaries of the second intracellular loop (ICL2) based on the relative position and length of the juxtamembrane transmembrane regions 3 and 4. The loop has been analyzed by systematic mutagenesis of all residues to alanine, measuring cAMP accumulation, CGRP affinity, and receptor expression. Unlike rhodopsin, ICL2 of the CGRP receptor plays a part in the conformational switch after agonist interaction. His-216 and Lys-227 were essential for a functional CGRP-induced cAMP response. The effect of (H216A)CLR is due to a disruption to the cell surface transport or surface stability of the mutant receptor. In contrast, (K227A)CLR had wild-type expression and agonist affinity, suggesting a direct disruption to the downstream signal transduction mechanism of the CGRP receptor. Modeling suggests that the loop undergoes a significant shift in position during receptor activation, exposing a potential G-protein binding pocket. Lys-227 changes position to point into the pocket, potentially allowing it to interact with bound G-proteins. His-216 occupies a position similar to that of Tyr-136 in bovine rhodopsin, part of the DRY motif of the latter receptor. This is the first comprehensive analysis of an entire intracellular loop within the calcitonin family of G-protein-coupled receptors. These data help to define the structural and functional characteristics of the CGRP receptor and of family B G-protein-coupled receptors in general.
Affiliation(s)
- Alex C Conner
- School of Life and Health Sciences, Aston University, Birmingham B4 7ET, United Kingdom
|
248
|
Jenwitheesuk E, Samudrala R. Heptad-Repeat-2 Mutations Enhance the Stability of the Enfuvirtide-Resistant HIV-1 gp41 Hairpin Structure. Antivir Ther 2005. [DOI: 10.1177/135965350501000804] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/15/2022]
Abstract
Enfuvirtide (T20) is a peptide-based fusion inhibitor derived from the heptad repeat 2 (HR2) region of HIV-1 glycoprotein 41 (gp41). The inhibitor binds to the gp41 heptad repeat 1 (HR1) region, thereby blocking viral HR1/HR2 association. Mutations in HR1 have been reported to cause enfuvirtide resistance and reduce viral fitness. In this study, we first showed that scores obtained by a residue-specific all-atom probability discriminatory function (RAPDF) may be used as a reliable predictor of structural stability of gp41 mutants by comparing them to experimentally determined melting temperatures, and as a reliable indicator of enfuvirtide resistance by comparing them to experimentally determined fusion inhibition and viral fitness levels. We then generated an initial set of 28 theoretical structures of the HR1/HR2 hairpin complex where each structure consists of one mutation on HR1 known to cause enfuvirtide resistance and a wild-type amino acid at the corresponding HR2 residue. Mutations were then introduced in the corresponding HR2 residue of each structure where the wild-type amino acid was changed to each of the other nineteen amino acids. The enfuvirtide-resistant HR1 mutants with compensatory mutations at the corresponding HR2 residues had better RAPDF scores than those HR1 mutants with wild-type HR2. This indicates that mutations in HR2 improve structural stability of the HR1/HR2 hairpin complex and may lead to enhanced enfuvirtide resistance when present with resistant HR1 mutations. Modification of the amino acid side chains that contribute to enfuvirtide resistance, using the RAPDF scores as a guide, may help in the design of a second generation of fusion inhibitors against the enfuvirtide-resistant strains.
Affiliation(s)
- Ekachai Jenwitheesuk
- Department of Microbiology, University of Washington School of Medicine, Seattle, WA 98195, USA
- Ram Samudrala
- Department of Microbiology, University of Washington School of Medicine, Seattle, WA 98195, USA
|
249
|
Summa CM, Levitt M, Degrado WF. An atomic environment potential for use in protein structure prediction. J Mol Biol 2005; 352:986-1001. [PMID: 16126228 DOI: 10.1016/j.jmb.2005.07.054] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2004] [Revised: 06/20/2005] [Accepted: 07/20/2005] [Indexed: 11/25/2022]
Abstract
We describe the derivation and testing of a knowledge-based atomic environment potential for the modeling of protein structural energetics. An analysis of the probabilities of atomic interactions in a dataset of high-resolution protein structures shows that the probabilities of non-bonded inter-atomic contacts are not statistically independent events, and that the multi-body contact frequencies are poorly predicted from pairwise contact potentials. A pseudo-energy function is defined that measures the preferences for protein atoms to be in a given microenvironment defined by the number of contacting atoms in the environment and its atomic composition. This functional form is tested for its ability to recognize native protein structures amongst an ensemble of decoy structures and a detailed relative performance comparison is made with a number of common functions used in protein structure prediction.
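The idea of a knowledge-based environment term can be illustrated with an inverse-Boltzmann estimate: the pseudo-energy of an atom type in a given environment is the negative log of how often that pairing is observed relative to what independence would predict. The sketch below is an assumption-laden simplification, not the authors' functional form: it describes an environment only by a contact count (ignoring atomic composition), uses a hypothetical pseudocount for unseen pairings, and reports energies in units of kT.

```python
import math
from collections import Counter


def environment_pseudo_energies(observations, pseudocount=1.0):
    """Derive pseudo-energies from (atom_type, n_contacts) observations.

    E(type, n) = -ln( P_obs(type, n) / (P(type) * P(n)) ), in units of kT.
    The pseudocount keeps unobserved pairings finite.
    """
    obs = list(observations)
    total = len(obs)
    joint = Counter(obs)
    by_type = Counter(t for t, _ in obs)
    by_count = Counter(n for _, n in obs)
    n_cells = len(by_type) * len(by_count)

    energies = {}
    for t in by_type:
        for n in by_count:
            p_joint = (joint[(t, n)] + pseudocount) / (total + pseudocount * n_cells)
            p_ref = (by_type[t] / total) * (by_count[n] / total)
            energies[(t, n)] = -math.log(p_joint / p_ref)
    return energies


def score_structure(atom_environments, energies):
    """Sum pseudo-energies over a structure; unknown environments contribute 0."""
    return sum(energies.get(env, 0.0) for env in atom_environments)


# Hypothetical training data: carbons mostly buried (8 contacts),
# oxygens mostly exposed (2 contacts).
training = [("C", 8)] * 50 + [("O", 2)] * 50
energies = environment_pseudo_energies(training)

native_like = score_structure([("C", 8), ("O", 2)], energies)
decoy_like = score_structure([("C", 2), ("O", 8)], energies)
```

With this toy training set, a structure whose atoms sit in their frequently observed environments receives a lower total pseudo-energy than one whose atoms sit in rarely observed environments, which is the property a decoy-discrimination test probes.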
Affiliation(s)
- Christopher M Summa
- Department of Biochemistry and Biophysics, The University of Pennsylvania Medical School, Philadelphia, PA 19104-6059, USA
|
250
|
Hung LH, Ngan SC, Liu T, Samudrala R. PROTINFO: new algorithms for enhanced protein structure predictions. Nucleic Acids Res 2005; 33:W77-80. [PMID: 15980581 PMCID: PMC1160164 DOI: 10.1093/nar/gki403] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/19/2022] Open
Abstract
We describe new algorithms and modules for protein structure prediction available as part of the PROTINFO web server. The modules, comparative and de novo modelling, have significantly improved back-end algorithms that were rigorously evaluated at the sixth meeting on the Critical Assessment of Protein Structure Prediction methods. We were one of four server groups invited to make an oral presentation (only the best performing groups are asked to do so). These two modules allow a user to submit a protein sequence and return atomic coordinates representing the tertiary structure of that protein. The PROTINFO server is available at .
Affiliation(s)
- Ram Samudrala
- To whom correspondence should be addressed. Tel: +1 206 732 6122; Fax: +1 206 732 6055;
|