101
Cheng J. A multi-template combination algorithm for protein comparative modeling. BMC Structural Biology 2008; 8:18. [PMID: 18366648 PMCID: PMC2311309 DOI: 10.1186/1472-6807-8-18] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.9] [Received: 01/08/2008] [Accepted: 03/17/2008] [Indexed: 11/26/2022]
Abstract
BACKGROUND Multiple protein templates are commonly used in manual protein structure prediction. However, few automated algorithms for selecting and combining multiple templates are available. RESULTS Here we develop an effective multi-template combination algorithm for protein comparative modeling. The algorithm selects templates according to the similarity significance of the alignments between template and target proteins. It combines the whole template-target alignments whose similarity significance score is close to that of the top template-target alignment within a threshold, whereas it only takes alignment fragments from a less similar template-target alignment that align with a sizable uncovered region of the target. We compare the algorithm with the traditional method of using a single top template on the 45 comparative modeling targets (i.e. easy template-based modeling targets) used in the seventh edition of Critical Assessment of Techniques for Protein Structure Prediction (CASP7). The multi-template combination algorithm improves the GDT-TS scores of predicted models by 6.8% on average. The statistical analysis shows that the improvement is significant (p-value < 10⁻⁴). Compared with the ideal approach that always uses the best template, the multi-template approach yields only slightly better performance. During the CASP7 experiment, the preliminary implementation of the multi-template combination algorithm (FOLDpro) was ranked second among 67 servers in the category of high-accuracy structure prediction in terms of GDT-TS measure. CONCLUSION We have developed a novel multi-template algorithm to improve protein comparative modeling.
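The selection rule described in this abstract can be illustrated with a minimal sketch. It is not the FOLDpro implementation: the `Alignment` type, the "higher score is more similar" convention, the `score_window` and `min_uncovered` parameters, and the use of a simple residue count (rather than a contiguous region) to decide what counts as "sizable" are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alignment:
    """Hypothetical template-target alignment record (not from the paper)."""
    template: str
    score: float          # assumed: higher means more significant similarity
    covered: frozenset    # target residue indices aligned to this template


def combine_templates(alignments, score_window=5.0, min_uncovered=10):
    """Return (template, target residues used) pairs.

    Alignments scoring within `score_window` of the top alignment are used
    whole. Less similar alignments contribute only the residues that are
    still uncovered, and only if at least `min_uncovered` of them exist.
    Both thresholds are illustrative assumptions.
    """
    ranked = sorted(alignments, key=lambda a: a.score, reverse=True)
    if not ranked:
        return []
    top_score = ranked[0].score
    selected = []
    covered = set()
    for aln in ranked:
        if top_score - aln.score <= score_window:
            # Close to the top alignment: take the whole alignment.
            selected.append((aln.template, sorted(aln.covered)))
            covered |= aln.covered
        else:
            # Less similar: take only the part that fills an uncovered region.
            new_residues = aln.covered - covered
            if len(new_residues) >= min_uncovered:
                selected.append((aln.template, sorted(new_residues)))
                covered |= new_residues
    return selected
```

A real implementation would likely require the uncovered residues to form a contiguous stretch and would pass the selected alignments to a model-building program; the sketch only shows the selection logic.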
Affiliation(s)
- Jianlin Cheng
- Department of Computer Science, Informatics Institute, University of Missouri, Columbia, MO 65211-2060, USA.
102
Abstract
Most newly sequenced proteins are likely to adopt a similar structure to one which has already been experimentally determined. For this reason, the most successful approaches to protein structure prediction have been template-based methods. Such prediction methods attempt to identify and model the folds of unknown structures by aligning the target sequences to a set of representative template structures within a fold library. In this chapter, I discuss the development of template-based approaches to fold prediction, from the traditional techniques to the recent state-of-the-art methods. I also discuss the recent development of structural annotation databases, which contain models built by aligning the sequences from entire proteomes against known structures. Finally, I run through a practical step-by-step guide for aligning target sequences to known structures and contemplate the future direction of template-based structure prediction.
103
A historical perspective of template-based protein structure prediction. Methods in Molecular Biology (Clifton, N.J.) 2008; 413:3-42. [PMID: 18075160 DOI: 10.1007/978-1-59745-574-9_1] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/12/2022]
Abstract
This chapter presents a broad historical overview of the problem of protein structure prediction. Different structure prediction methods, including homology modeling, fold recognition (FR)/protein threading, ab initio/de novo approaches, and hybrid techniques involving multiple types of approaches, are introduced in a historical context. The progress of the field as a whole, especially in the threading/FR area, as reflected by the CASP/CAFASP contests, is reviewed. At the end of the chapter, we discuss the challenging issues ahead in the field of protein structure prediction.
105
Abstract
We perform a systematic examination of the ability of several different high-resolution, atomic-detail scoring functions to discriminate native conformations of loops in membrane proteins from non-native but physically reasonable, or "decoy," conformations. Decoys constructed from changing a loop conformation while keeping the remainder of the protein fixed are a challenging test of energy function accuracy. Nevertheless, the best of the energy functions we examined recognized the native structure as lowest in energy around half the time, and consistently chose it as a low-energy structure. This suggests that the best of present energy functions, even without a representation of the lipid bilayer, are of sufficient accuracy to give reasonable confidence in predictions of membrane protein structure. We also constructed homology models for each structure, using other known structures in the same protein family as templates. Homology models were constructed using several scoring functions and modeling programs, but with a comparable sampling effort for each procedure. Our results indicate that the quality of sequence alignment is probably the most important factor in model accuracy for sequence identity from 20-40%; one can expect a reasonably accurate model for membrane proteins when sequence identity is greater than 30%, in agreement with previous studies. Most errors are localized in loop regions, which tend to be found outside the lipid bilayer. For the most discriminative energy functions, it appears that errors are most likely due to lack of sufficient sampling, although it should be stressed that present energy functions are still far from perfectly reliable.
Affiliation(s)
- Cen Gao
- Department of Chemistry, University of Rochester, Rochester, New York, USA
106
Yan A, Kloczkowski A, Hofmann H, Jernigan RL. Prediction of side chain orientations in proteins by statistical machine learning methods. J Biomol Struct Dyn 2007; 25:275-88. [PMID: 17937489 DOI: 10.1080/07391102.2007.10507176] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/28/2022]
Abstract
We develop ways to predict the side chain orientations of residues within a protein structure by using several different statistical machine learning methods. Here side chain orientation of a given residue i is measured by an angle Ω(i) between the vector pointing from the center of the protein structure to the Cα atom of residue i and the vector pointing from that Cα atom to the center of its side chain atoms. To predict the Ω(i) angles, we construct statistical models by using several different methods such as general linear regression, a regression tree and bagging, a neural network, and a support vector machine. The root mean square errors for the different models range only from 36.67 to 37.60 degrees and the correlation coefficients are all between 30% and 34%. The performances of different models in the test set are, thus, quite similar, and show the relative predictive power of these models to be significant in comparison with random side chain orientations.
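The orientation angle defined in this abstract can be computed directly from three points. The following is a minimal sketch under the assumption that the protein center, the Cα position, and the side chain center are already available as 3D coordinates; the function name and argument names are illustrative, not taken from the paper.

```python
import math


def side_chain_orientation(protein_center, ca, side_chain_center):
    """Angle in degrees between (protein center -> C-alpha) and
    (C-alpha -> side chain center), each point given as an (x, y, z) tuple."""
    to_ca = [c - p for c, p in zip(ca, protein_center)]
    to_side_chain = [s - c for s, c in zip(side_chain_center, ca)]
    norm = math.hypot(*to_ca) * math.hypot(*to_side_chain)
    if norm == 0.0:
        raise ValueError("vectors must have non-zero length")
    dot = sum(a * b for a, b in zip(to_ca, to_side_chain))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_omega = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_omega))
```

Under this definition an angle near 0 degrees means the side chain points away from the protein center, and an angle near 180 degrees means it points back toward the center.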
Affiliation(s)
- Aimin Yan
- Laurence H. Baker Center for Bioinformatics and Biological Statistics, Iowa State University, Ames, Iowa, USA
107
Gabdoulline RR, Stein M, Wade RC. qPIPSA: relating enzymatic kinetic parameters and interaction fields. BMC Bioinformatics 2007; 8:373. [PMID: 17919319 PMCID: PMC2174957 DOI: 10.1186/1471-2105-8-373] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.1] [Received: 05/24/2007] [Accepted: 10/05/2007] [Indexed: 11/29/2022] Open
Abstract
Background The simulation of metabolic networks in quantitative systems biology requires the assignment of enzymatic kinetic parameters. Experimentally determined values are often not available and therefore computational methods to estimate these parameters are needed. It is possible to use the three-dimensional structure of an enzyme to perform simulations of a reaction and derive kinetic parameters. However, this is computationally demanding and requires detailed knowledge of the enzyme mechanism. We have therefore sought to develop a general, simple and computationally efficient procedure to relate protein structural information to enzymatic kinetic parameters that allows consistency between the kinetic and structural information to be checked and estimation of kinetic constants for structurally and mechanistically similar enzymes. Results We describe qPIPSA: quantitative Protein Interaction Property Similarity Analysis. In this analysis, molecular interaction fields, for example, electrostatic potentials, are computed from the enzyme structures. Differences in molecular interaction fields between enzymes are then related to the ratios of their kinetic parameters. This procedure can be used to estimate unknown kinetic parameters when enzyme structural information is available and kinetic parameters have been measured for related enzymes or were obtained under different conditions. The detailed interaction of the enzyme with substrate or cofactors is not modeled and is assumed to be similar for all the proteins compared. The protein structure modeling protocol employed ensures that differences between models reflect genuine differences between the protein sequences, rather than random fluctuations in protein structure. Conclusion Provided that the experimental conditions and the protein structural models refer to the same protein state or conformation, correlations between interaction fields and kinetic parameters can be established for sets of related enzymes. Outliers may arise due to variation in the importance of different contributions to the kinetic parameters, such as protein stability and conformational changes. The qPIPSA approach can assist in the validation as well as estimation of kinetic parameters, and provide insights into enzyme mechanism.
Affiliation(s)
- Razif R Gabdoulline
- Molecular and Cellular Modeling Group, EML Research gGmbH, Schloss Wolfsbrunnenweg 33, Heidelberg, 69118, Germany.
108
Buonocore F, Randelli E, Casani D, Costantini S, Facchiano A, Scapigliati G, Stet RJM. Molecular cloning, differential expression and 3D structural analysis of the MHC class-II beta chain from sea bass (Dicentrarchus labrax L.). Fish & Shellfish Immunology 2007; 23:853-66. [PMID: 17493833 DOI: 10.1016/j.fsi.2007.03.013] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.4] [Received: 01/15/2007] [Revised: 03/15/2007] [Accepted: 03/15/2007] [Indexed: 05/15/2023]
Abstract
The major histocompatibility complex class I and II molecules (MHC-I and MHC-II) play a pivotal role in vertebrate immune response to antigenic peptides. In this paper we report the cloning and sequencing of the MHC class II beta chain from sea bass (Dicentrarchus labrax L.). The six obtained cDNA sequences (designated as Dila-DAB) code for 250 amino acids, with a predicted 21 amino acid signal peptide, and contain a 28 bp 5'-UTR and a 478 bp 3'-UTR. A multiple alignment of the predicted translation of the Dila-DAB sequences was assembled together with other fish and mammalian sequences and it showed the conservation of most amino acid residues characteristic of the MHC class II beta chain structure. The highest basal Dila-DAB expression was found in gills, followed by gut and thymus; lower mRNA levels were found in spleen, peripheral blood leucocytes (PBL) and liver. Stimulation of head kidney leukocytes with LPS for 4 h showed very little difference in the Dila-DAB expression, but after 24 h the Dila-DAB level decreased to a large extent and the difference was statistically significant. Stimulation of head kidney leukocytes with different concentrations of rIL-1β (ranging from 0 to 100 ng/ml) resulted in a dose-dependent reduction of the Dila-DAB expression. Moreover, two 3D Dila-DAB*0101 homology models were obtained based on crystallographic mouse MHC-II structures complexed with D10 T-cell antigen receptor or human CD4; features and differences between the models were evaluated and discussed. Taken together, these results are of interest because MHC-II structure and function, molecular polymorphism and differential gene expression correlate with disease resistance to viruses and bacteria in teleost fish.
Affiliation(s)
- Francesco Buonocore
- Dipartimento di Scienze Ambientali, University of Tuscia, Largo dell'Università snc, I-01100, Viterbo, Italy.
109
Rossi KA, Weigelt CA, Nayeem A, Krystek SR. Loopholes and missing links in protein modeling. Protein Sci 2007; 16:1999-2012. [PMID: 17660258 PMCID: PMC2206982 DOI: 10.1110/ps.072887807] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Received: 03/21/2007] [Revised: 06/08/2007] [Accepted: 06/09/2007] [Indexed: 10/23/2022]
Abstract
This paper provides an unbiased comparison of four commercially available programs for loop sampling, Prime, Modeler, ICM, and Sybyl, each of which uses a different modeling protocol. The study assesses the quality of results and examines the relative strengths and weaknesses of each method. The set of loops to be modeled varied in length from 4-12 amino acids. The approaches used for loop modeling can be classified into two methodologies: ab initio loop generation (Modeler and Prime) and database searches (Sybyl and ICM). Comparison of the modeled loops to the native structures was used to determine the accuracy of each method. All of the protocols returned similar results for short loop lengths (four to six residues), but as loop length increased, the quality of the results varied among the programs. Prime generated loops with RMSDs <2.5 Å for loops up to 10 residues, while the other three methods met the 2.5 Å criterion at seven-residue loops. Additionally, the ability of the software to utilize disulfide bonds and X-ray crystal packing influenced the quality of the results. In the final analysis, the top-ranking loop from each program was rarely the loop with the lowest RMSD with respect to the native template, revealing a weakness in all programs to correctly rank the modeled loops.
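The accuracy measure and the ranking check mentioned in this abstract can be sketched as follows. This is an illustrative example only: it assumes that modeled and native loops are given as equal-length lists of (x, y, z) atom coordinates already in the same frame (no superposition is performed), and the function names are hypothetical rather than taken from any of the four programs.

```python
import math


def loop_rmsd(model, native):
    """Root mean square deviation between two equal-length coordinate lists."""
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (m - n) ** 2
        for atom_m, atom_n in zip(model, native)
        for m, n in zip(atom_m, atom_n)
    )
    return math.sqrt(squared / len(model))


def top_ranked_is_best(ranked_models, native):
    """Check whether the program's top-ranked loop has the lowest RMSD.

    `ranked_models` is ordered best-first by the program's own score.
    Returns (flag, list of RMSDs in ranking order).
    """
    rmsds = [loop_rmsd(model, native) for model in ranked_models]
    return rmsds[0] == min(rmsds), rmsds
```

Applied over a benchmark set, the fraction of cases where the flag is true gives one simple view of how well a program ranks its own loops.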
Affiliation(s)
- Karen A Rossi
- Computer-Assisted Drug Design, Pharmaceutical Research Institute, Bristol-Myers Squibb Company, Princeton, New Jersey 08543, USA.
110
Qi Y, Sadreyev RI, Wang Y, Kim BH, Grishin NV. A comprehensive system for evaluation of remote sequence similarity detection. BMC Bioinformatics 2007; 8:314. [PMID: 17725841 PMCID: PMC2031906 DOI: 10.1186/1471-2105-8-314] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 04/04/2007] [Accepted: 08/28/2007] [Indexed: 11/25/2022] Open
Abstract
Background Accurate and sensitive performance evaluation is crucial for both effective development of better structure prediction methods based on sequence similarity, and for the comparative analysis of existing methods. To date, there has been no satisfactory comprehensive evaluation method that (i) is based on a large and statistically unbiased set of proteins with clearly defined relationships; and (ii) covers all performance aspects of sequence-based structure predictors, such as sensitivity and specificity, alignment accuracy and coverage, and structure template quality. Results With the aim of designing such a method, we (i) select a statistically balanced set of divergent protein domains from SCOP, and define similarity relationships for the majority of these domains by complementing the best of information available in SCOP with a rigorous SVM-based algorithm; and (ii) develop protocols for the assessment of similarity detection and alignment quality from several complementary perspectives. The evaluation of similarity detection is based on ROC-like curves and includes several complementary approaches to the definition of true/false positives. Reference-dependent approaches use the 'gold standard' of pre-defined domain relationships and structure-based alignments. Reference-independent approaches assess the quality of structural match predicted by the sequence alignment, with respect to the whole domain length (global mode) or to the aligned region only (local mode). Similarly, the evaluation of alignment quality includes several reference-dependent and -independent measures, in global and local modes. As an illustration, we use our benchmark to compare the performance of several methods for the detection of remote sequence similarities, and show that different aspects of evaluation reveal different properties of the evaluated methods, highlighting their advantages, weaknesses, and potential for further development. Conclusion The presented benchmark provides a new tool for a statistically unbiased assessment of methods for remote sequence similarity detection, from various complementary perspectives. This tool should be useful both for users choosing the best method for a given purpose, and for developers designing new, more powerful methods. The benchmark set, reference alignments, and evaluation codes can be downloaded from .
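The ROC-style evaluation of similarity detection mentioned in this abstract can be illustrated with a generic sketch. It is not the benchmark's own evaluation code: the input format (a list of `(score, is_true_relationship)` pairs), the assumption that a higher score means a more confident hit, and the handling of tied scores (each hit is treated as its own step) are all simplifications chosen for illustration.

```python
def roc_points(hits):
    """Return ROC points (false positive rate, true positive rate).

    `hits` is an iterable of (score, is_true) pairs; hits are ranked by
    descending score and one point is emitted after each hit.
    """
    ranked = sorted(hits, key=lambda hit: hit[0], reverse=True)
    total_pos = sum(1 for _, is_true in ranked if is_true)
    total_neg = len(ranked) - total_pos
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    for _, is_true in ranked:
        if is_true:
            true_pos += 1
        else:
            false_pos += 1
        fpr = false_pos / total_neg if total_neg else 0.0
        tpr = true_pos / total_pos if total_pos else 0.0
        points.append((fpr, tpr))
    return points
```

Whether a hit counts as true depends on the chosen definition, for example a pre-defined domain relationship (reference-dependent) or a structural-match threshold (reference-independent), which is the distinction the abstract draws.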
Affiliation(s)
- Yuan Qi
- Department of Biochemistry, University of Texas Southwestern Medical Center, 5323, Harry Hines Blvd, Dallas, TX 75390-9050, USA
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Ruslan I Sadreyev
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, 5323, Harry Hines Blvd, Dallas, TX 75390-9050, USA
- Yong Wang
- Department of Biochemistry, University of Texas Southwestern Medical Center, 5323, Harry Hines Blvd, Dallas, TX 75390-9050, USA
- Bong-Hyun Kim
- Department of Biochemistry, University of Texas Southwestern Medical Center, 5323, Harry Hines Blvd, Dallas, TX 75390-9050, USA
- Nick V Grishin
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, 5323, Harry Hines Blvd, Dallas, TX 75390-9050, USA
- Department of Biochemistry, University of Texas Southwestern Medical Center, 5323, Harry Hines Blvd, Dallas, TX 75390-9050, USA
111
Lue NF, Li Z. Modeling and structure function analysis of the putative anchor site of yeast telomerase. Nucleic Acids Res 2007; 35:5213-22. [PMID: 17670795 PMCID: PMC1976438 DOI: 10.1093/nar/gkm531] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 01/08/2023] Open
Abstract
Telomerase is a ribonucleoprotein reverse transcriptase responsible for extending one strand of the telomere terminal repeats. Unique among reverse transcriptases, telomerase is thought to possess a DNA-binding domain (known as the anchor site) that allows the enzyme to add telomere repeats processively. Previous crosslinking and mutagenesis studies have mapped the anchor site to an N-terminal region of TERT, and the structure of this region of Tetrahymena TERT was recently determined at atomic resolution. Here we use a combination of homology modeling, electrostatic calculation and site-specific mutagenesis analysis to identify a positively charged, functionally important surface patch on yeast TERT. This patch is lined by both conserved and non-conserved residues, which, when mutated, caused loss of telomerase processivity in vitro and telomere shortening in vivo. In addition, we demonstrate that a point mutation in this domain of yeast TERT simultaneously enhanced the repeat addition processivity of telomerase and caused telomere elongation. Our data argue that the telomerase anchor site has evolved species-specific residues to interact with species-specific telomere repeats. The data also reinforce the importance of telomerase processivity in regulating telomere length.
Affiliation(s)
- Neal F Lue
- Department of Microbiology & Immunology, W. R. Hearst Microbiology Research Center, Weill Medical College of Cornell University, 1300 York Avenue, New York, NY 10021, USA.
112
Bhattacharya A, Wunderlich Z, Monleon D, Tejero R, Montelione GT. Assessing model accuracy using the homology modeling automatically software. Proteins 2007; 70:105-18. [PMID: 17640066 DOI: 10.1002/prot.21466] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Indexed: 11/08/2022]
Abstract
Homology modeling is a powerful technique that greatly increases the value of experimental structure determination by using the structural information of one protein to predict the structures of homologous proteins. We have previously described a method of homology modeling by satisfaction of spatial restraints (Li et al., Protein Sci 1997;6:956-970). The Homology Modeling Automatically (HOMA) web site, <http://www-nmr.cabm.rutgers.edu/HOMA>, is a new tool, using this method to predict 3D structure of a target protein based on the sequence alignment of the target protein to a template protein and the structure coordinates of the template. The user is presented with the resulting models, together with an extensive structure validation report providing critical assessments of the quality of the resulting homology models. The homology modeling method employed by HOMA was assessed and validated using twenty-four groups of homologous proteins. Using HOMA, homology models were generated for 510 proteins, including 264 proteins modeled with correct folds and 246 modeled with incorrect folds. Accuracies of these models were assessed by superimposition on the corresponding experimentally determined structures. A subset of these results was compared with parallel studies of modeling accuracy using several other automated homology modeling approaches. Overall, HOMA provides prediction accuracies similar to other state-of-the-art homology modeling methods. We also provide an evaluation of several structure quality validation tools in assessing the accuracy of homology models generated with HOMA. This study demonstrates that Verify3D (Luthy et al., Nature 1992;356:83-85) and ProsaII (Sippl, Proteins 1993;17:355-362) are most sensitive in distinguishing between homology models with correct or incorrect folds. For homology models that have the correct fold, the steric conformational energy (including primarily the van der Waals energy), MolProbity clashscore (Word et al., Protein Sci 2000;9:2251-2259), and the PROCHECK G-factors (Laskowski et al., J Biomol NMR 1996;8:477-486) provide sensitive and consistent methods for assessing accuracy and can distinguish between homology models of higher and lower accuracy. As demonstrated in the accompanying paper (Bhattacharya et al., accompanying paper), combinations of these scores for models generated with HOMA provide a basis for distinguishing low from high accuracy models.
Affiliation(s)
- Aneerban Bhattacharya
- Center for Advanced Biotechnology and Medicine (CABM), Rutgers University and Robert Wood Johnson Medical School (UMDNJ), Piscataway, New Jersey 08854, USA
113
Klebe G. Virtual ligand screening: strategies, perspectives and limitations. Drug Discov Today 2006; 11:580-94. [PMID: 16793526 PMCID: PMC7108249 DOI: 10.1016/j.drudis.2006.05.012] [Citation(s) in RCA: 448] [Impact Index Per Article: 26.4] [Received: 10/20/2005] [Revised: 02/13/2006] [Accepted: 05/16/2006] [Indexed: 11/28/2022]
Abstract
In contrast to high-throughput screening, in virtual ligand screening (VS), compounds are selected using computer programs to predict their binding to a target receptor. A key prerequisite is knowledge about the spatial and energetic criteria responsible for protein–ligand binding. The concepts and prerequisites to perform VS are summarized here, and explanations are sought for the enduring limitations of the technology. Target selection, analysis and preparation are discussed, as well as considerations about the compilation of candidate ligand libraries. The tools and strategies of a VS campaign, and the accuracy of scoring and ranking of the results, are also considered.
Affiliation(s)
- Gerhard Klebe
- Institute of Pharmaceutical Chemistry, University of Marburg, Marbacher Weg 6, D-35032 Marburg, Germany.
114
Abstract
The Pcons.net Meta Server (http://pcons.net) provides improved automated tools for protein structure prediction and analysis using consensus. It implements essentially all the steps necessary to produce a high quality model of a protein. The whole process is fully automated and a potential user only submits the protein sequence. For PSI-BLAST detectable targets, an accurate model is generated within minutes of submission. For more difficult targets the sequence is automatically submitted to publicly available fold-recognition servers that use more advanced approaches to find distant structural homologs. The results from these servers are analyzed and assessed for structural correctness using Pcons and ProQ, and the user is presented with a ranked list of possible models. In addition, if the protein sequence contains more than one domain, these are automatically parsed out and resubmitted to the server as individual queries.
Affiliation(s)
- Björn Wallner
- Department of Biochemistry, University of Washington, Box 357350, Seattle, WA 98195, USA.
115
Heath AP, Kavraki LE, Clementi C. From coarse-grain to all-atom: Toward multiscale analysis of protein landscapes. Proteins 2007; 68:646-61. [PMID: 17523187 DOI: 10.1002/prot.21371] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.9] [Indexed: 11/10/2022]
Abstract
Multiscale methods are becoming increasingly promising as a way to characterize the dynamics of large protein systems on biologically relevant time-scales. The underlying assumption in multiscale simulations is that it is possible to move reliably between different resolutions. We present a method that efficiently generates realistic all-atom protein structures starting from the Cα atom positions, as obtained for instance from extensive coarse-grain simulations. The method, a reconstruction algorithm for coarse-grain structures (RACOGS), is validated by reconstructing ensembles of coarse-grain structures obtained during folding simulations of the proteins src-SH3 and S6. The results show that RACOGS consistently produces low energy, all-atom structures. A comparison of the free energy landscapes calculated using the coarse-grain structures versus the all-atom structures shows good correspondence and little distortion in the protein folding landscape.
Affiliation(s)
- Allison P Heath
- Department of Computer Science, Rice University, Houston, Texas 77005, USA
116
Dalton JAR, Jackson RM. An evaluation of automated homology modelling methods at low target template sequence similarity. Bioinformatics 2007; 23:1901-8. [PMID: 17510171 DOI: 10.1093/bioinformatics/btm262] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.2] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION There are two main areas of difficulty in homology modelling that are particularly important when sequence identity between target and template falls below 50%: sequence alignment and loop building. These problems become magnified with automatic modelling processes, as there is no human input to correct mistakes. As such we have benchmarked several stand-alone strategies that could be implemented in a workflow for automated high-throughput homology modelling. These include three new sequence-structure alignment programs: 3D-Coffee, Staccato and SAlign, plus five homology modelling programs and their respective loop building methods: Builder, Nest, Modeller, SegMod/ENCAD and Swiss-Model. The SABmark database provided 123 targets with at least five templates from the same SCOP family and sequence identities ≤50%. RESULTS When using Modeller as the common modelling program, 3D-Coffee outperforms Staccato and SAlign using both multiple templates and the best single template, and across the sequence identity range 20-50%. The mean model RMSD generated from 3D-Coffee using multiple templates is 15 and 28% (or using single templates, 3 and 13%) better than those generated by Staccato and SAlign, respectively. 3D-Coffee gives equivalent modelling accuracy from multiple and single templates, but Staccato and SAlign are more successful with single templates, their quality deteriorating as additional lower sequence identity templates are added. Evaluating the different homology modelling programs, on average Modeller performs marginally better in overall modelling than the others tested. However, on average Nest produces the best loops with an 8% improvement by mean RMSD compared to the loops generated by Builder.
Affiliation(s)
- James A R Dalton
- Institute of Molecular and Cellular Biology, Faculty of Biological Sciences, University of Leeds, Leeds, UK
117
Chambery A, Pisante M, Di Maro A, Di Zazzo E, Ruvo M, Costantini S, Colonna G, Parente A. Invariant Ser211 is involved in the catalysis of PD-L4, type I RIP from Phytolacca dioica leaves. Proteins 2007; 67:209-18. [PMID: 17243169 DOI: 10.1002/prot.21271] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.7] [Indexed: 11/07/2022]
Abstract
Multiple sequence alignment analysis of ribosome inactivating proteins (RIPs) has revealed the occurrence of an invariant seryl residue in proximity to the catalytic tryptophan. The involvement of this seryl residue in the catalytic mechanism of RIPs was investigated by site-directed mutagenesis in PD-L4, a type 1 RIP isolated from Phytolacca dioica leaves. We show that the replacement of Ser211 with Ala apparently does not influence the N-β-glycosidase activity on ribosomes (determined as IC50 in a cell-free system), but it reduces the adenine polynucleotide glycosylase (APG) activity, assayed spectrophotometrically on other substrates such as DNA, rRNA, and poly(A). The ability of PD-L4 to deadenylate polynucleotides appears more sensitive to the Ser211Ala replacement when poly(A) is used as substrate, as only 33% activity is retained by the mutant, while with more complex and heterogeneous substrates such as DNA and rRNA, its APG activity is 73% and 66%, respectively. While the mutated protein shows a conserved secondary structure by CD, it also exhibits a remarkably enhanced tryptophan fluorescence. This indicates that, although the overall three-dimensional structure of the protein is maintained, removal of the hydroxyl group locally affects the environment of a Trp residue. Modelling and docking analyses confirm the interaction between Ser211 and Trp207, which is located within the active site, thus affecting RIP adenine polynucleotide glycosylase activity. Data accumulated so far confirm the potential involvement of Ser211 in the catalytic mechanism of type 1 RIP PD-L4 and a possible role in stabilizing the conformation of the Trp207 side chain, which participates actively in the protein enzymatic activity.
Affiliation(s)
- Angela Chambery
- Dipartimento di Scienze della Vita, Seconda Università di Napoli, Caserta, Italy
|
118
|
Rockwell NC, Lagarias JC. Flexible mapping of homology onto structure with homolmapper. BMC Bioinformatics 2007; 8:123. [PMID: 17428344 PMCID: PMC1955750 DOI: 10.1186/1471-2105-8-123] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Received: 01/19/2007] [Accepted: 04/11/2007] [Indexed: 12/19/2022] Open
Abstract
Background: Over the past decade, a number of tools have emerged for the examination of homology relationships among protein sequences in a structural context. Most recent software implementations for such analysis are tied to specific molecular viewing programs, which can be problematic for collaborations involving multiple viewing environments. Incorporation into larger packages also adds complications for users interested in adding their own scoring schemes or in analyzing proteins incorporating unusual amino acid residues such as selenocysteine. Results: We describe homolmapper, a command-line application for mapping information from a multiple protein sequence alignment onto a protein structure for analysis in the viewing software of the user's choice. Homolmapper is small (under 250 K for the application itself) and is written in Python to ensure portability. It is released for non-commercial use under a modified University of California BSD license. Homolmapper permits facile import of additional scoring schemes and can incorporate arbitrary additional amino acids to allow handling of residues such as selenocysteine or pyrrolysine. Homolmapper also provides tools for defining and analyzing subfamilies relative to a larger alignment, for mutual information analysis, and for rapidly visualizing the locations of mutations and multi-residue motifs. Conclusion: Homolmapper is a useful tool for analysis of homology relationships among proteins in a structural context. There is also extensive, example-driven documentation available. More information about homolmapper is available at .
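The mutual information analysis mentioned in this abstract can be illustrated with a short sketch. This is not homolmapper's own code or interface; the function names below are hypothetical, and the sketch assumes that gaps are treated as ordinary symbols and that probabilities are plain observed frequencies with no pseudocounts or sequence weighting.

```python
import math
from collections import Counter


def alignment_columns(sequences):
    """Split equal-length aligned sequences into per-position columns."""
    return ["".join(column) for column in zip(*sequences)]


def column_mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    Assumption: every symbol, including a gap character, counts as its
    own state, and frequencies are taken directly from the columns.
    """
    if len(col_a) != len(col_b):
        raise ValueError("columns must cover the same number of sequences")
    n = len(col_a)
    if n == 0:
        raise ValueError("columns must not be empty")
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p_ab / (p_a * p_b) simplifies to c * n / (count_a * count_b)
        mi += p_ab * math.log2(c * n / (count_a[a] * count_b[b]))
    return mi
```

Two columns that always change together (for example `AAGG` and `CCTT`) score 1 bit here, while a column paired with an invariant or independent column scores 0.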
Affiliation(s)
- Nathan C Rockwell
- Section of Molecular and Cellular Biology, University of California, Davis, California 95616, USA
- J Clark Lagarias
- Section of Molecular and Cellular Biology, University of California, Davis, California 95616, USA
|
119
|
Punta M, Forrest LR, Bigelow H, Kernytsky A, Liu J, Rost B. Membrane protein prediction methods. Methods 2007; 41:460-74. [PMID: 17367718 PMCID: PMC1934899 DOI: 10.1016/j.ymeth.2006.07.026] [Citation(s) in RCA: 70] [Impact Index Per Article: 4.1] [Received: 03/22/2006] [Accepted: 07/05/2006] [Indexed: 10/23/2022] Open
Abstract
We survey computational approaches that tackle membrane protein structure and function prediction. While describing the main ideas that have led to the development of the most relevant and novel methods, we also discuss pitfalls, provide practical hints and highlight the challenges that remain. The methods covered include: sequence alignment, motif search, functional residue identification, transmembrane segment and protein topology predictions, homology and ab initio modeling. In general, predictions of functional and structural features of membrane proteins are improving, although progress is hampered by the limited amount of high-resolution experimental information available. While predictions of transmembrane segments and protein topology rank among the most accurate methods in computational biology, more attention and effort will be required in the future to ameliorate database search, homology and ab initio modeling.
Affiliation(s)
- Marco Punta
- Department of Biochemistry and Molecular Biophysics, Columbia University, 1130 St. Nicholas Ave., New York, NY 10032, USA
|
120
|
Costantini S, Colonna G, Facchiano AM. Simulation of conformational changes occurring when a protein interacts with its receptor. Comput Biol Chem 2007; 31:196-206. [PMID: 17500035 DOI: 10.1016/j.compbiolchem.2007.03.010] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 09/05/2006] [Accepted: 03/26/2007] [Indexed: 11/20/2022]
Abstract
To simulate the conformational changes that occur when a protein interacts with its receptor, we first evaluated the structural differences between the experimental unbound and bound conformations of selected proteins and created theoretical complexes by replacing, in each experimental complex, the bound protein chain with the unbound one. The theoretical models were then subjected to additional modeling refinements to improve the side-chain geometry. Comparison of the theoretical and experimental complexes in terms of structural and energetic factors showed that the refined theoretical complexes became more similar to the experimental ones. We applied the same procedure within a homology modeling experiment, using as templates the experimental structures of human interleukin-1beta (IL-1beta), unbound and bound to its receptor, to build models of the homologous proteins from mouse and trout in unbound and bound conformations and to simulate the interaction with the related receptors. Our results suggest that homology modeling techniques are sensitive to differences between bound and unbound conformations, and that accurate modeling of the side chains in the complex improves the interaction and molecular recognition. Moreover, our refinement procedure could be used in protein-protein interaction studies and could also be applied in conjunction with rigid-body docking when the protein-bound conformation is not available.
Affiliation(s)
- S Costantini
- Laboratory of Bioinformatics and Computational Biology, Institute of Food Science, CNR, via Roma 52 A/C, 83100 Avellino, Italy
|
121
|
Abstract
In this perspective, we begin by describing the comparative protein structure modeling technique and the accuracy of the corresponding models. We then discuss the significant role that comparative prediction plays in drug discovery. We focus on virtual ligand screening against comparative models and illustrate the state of the art by a number of specific examples.
|
122
|
Andreeva ZI, Nesterenko VF, Fomkina MG, Ternovsky VI, Suzina NE, Bakulina AY, Solonin AS, Sineva EV. The properties of Bacillus cereus hemolysin II pores depend on environmental conditions. BIOCHIMICA ET BIOPHYSICA ACTA-BIOMEMBRANES 2006; 1768:253-63. [PMID: 17173854 DOI: 10.1016/j.bbamem.2006.11.004] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.2] [Received: 06/05/2006] [Revised: 11/03/2006] [Accepted: 11/03/2006] [Indexed: 11/29/2022]
Abstract
Hemolysin II (HlyII), one of several cytolytic proteins encoded by the opportunistic human pathogen Bacillus cereus, is a member of the family of oligomeric beta-barrel pore-forming toxins. This work has studied the pore-forming properties of HlyII using a number of biochemical and biophysical approaches. According to electron microscopy, HlyII protein interacts with liposomes to form ordered heptamer-like macromolecular assemblies with an inner pore diameter of 1.5-2 nm and an outer diameter of 6-8 nm. This is consistent with the inner pore diameter obtained from the osmotic protection assay. According to the 3D model obtained, seven HlyII monomers might form a pore, the outer size of which has been estimated to be slightly larger than by the other method, with an inner diameter changing from 1 to 4 nm along the channel length. The hemolysis rate has been found to be temperature-dependent, with an explicit lag at lower temperatures. Temperature jump experiments have indicated the pore structures formed at 37 °C and 4 °C to be different. The channels formed by HlyII are anion-selective in lipid bilayers and show a rising conductance as the salt concentration increases. The results presented show for the first time that at high salt concentration HlyII pores demonstrate voltage-induced gating observed at low negative potentials. Taken together, we have found that the membrane-binding properties of hemolysin II as well as the properties of its pores strongly depend on environmental conditions. The study of the properties together with structural modeling allows a better understanding of channel functioning.
Affiliation(s)
- Zhanna I Andreeva
- Institute of Biochemistry and Physiology of Microorganisms, Russian Academy of Sciences, Pushchino, Moscow Region, 142290, Russia
|
123
|
Facchiano AM, Costantini S, Di Maro A, Panichi D, Chambery A, Parente A, Di Gennaro S, Poerio E. Modeling the 3D structure of wheat subtilisin/chymotrypsin inhibitor (WSCI). Probing the reactive site with two susceptible proteinases by time-course analysis and molecular dynamics simulations. Biol Chem 2006; 387:931-40. [PMID: 16913843 DOI: 10.1515/bc.2006.117] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Indexed: 11/15/2022]
Abstract
Comparative modeling and time-course hydrolysis experiments have been applied to investigate two enzyme-inhibitor complexes formed between the wheat subtilisin-chymotrypsin inhibitor (WSCI) and two susceptible proteinases. WSCI represents the first case of a wheat protein inhibitor active against animal chymotrypsins and bacterial subtilisins. The model was created using as template structure that of the CI-2A inhibitor from barley (PDB code: 2CI2), which shares 87% sequence identity with WSCI. Under these conditions of high similarity, the comparative modeling approach can be successfully applied. We predicted the WSCI 3D model and used it to investigate enzyme-inhibitor complex systems. Experimental observations indicated that chymotrypsin, but not subtilisin, in addition to cleavage at the primary reactive site Met48-Glu49, is able to hydrolyze a second peptide bond between Phe58 and Val59. Here, we report on cleavage of the peptide bond at the inhibitor's reactive site (Met48-Glu49) determined using time-course hydrolysis experiments; the same event was investigated for both subtilisin/WSCI and chymotrypsin/WSCI complexes using molecular dynamics simulations. The molecular details of the initial inhibitor-enzyme interactions, as well as of the changes observed during the simulations, allow us to speculate on the different fates of the two WSCI-proteinase complexes.
|
124
|
Abstract
AMT (ammonium transporter)/Rh (Rhesus) ammonium transporters/channels are identified in all domains of life and fulfil contrasting functions related either to ammonium acquisition or excretion. Based on functional and crystallographic high-resolution structural data, it was recently proposed that the bacterial AmtB (ammonium transporter B) is a gas channel for NH3 [Khademi, O'Connell, III, Remis, Robles-Colmenares, Miercke and Stroud (2004) Science 305, 1587-1594; Zheng, Kostrewa, Berneche, Winkler and Li (2004) Proc. Natl. Acad. Sci. U.S.A. 101, 17090-17095]. Key residues, proposed to be crucial for NH3 conduction, and the hydrophobic, but obstructed, pore were conserved in a homology model of LeAMT1;1 from tomato. Transport by LeAMT1;1 was affected by mutations of residues that were predicted to constitute the aromatic recruitment site for NH4+ at the external pore entrance. Despite the structural similarities, LeAMT1;1 was shown to transport only the ion; each transported 14C-methylammonium molecule carried a single positive elementary charge. Similarly, NH4+ (or H+/NH3) was transported, but NH3 conduction was excluded. It is concluded that related proteins and a similar molecular architecture can apparently support contrasting transport mechanisms.
Affiliation(s)
- Maria Mayer
- Zentrum für Molekularbiologie der Pflanzen (ZMBP), Pflanzenphysiologie, Universität Tübingen, Auf der Morgenstelle 1, 72076 Tübingen, Germany
- Marek Dynowski
- Zentrum für Molekularbiologie der Pflanzen (ZMBP), Pflanzenphysiologie, Universität Tübingen, Auf der Morgenstelle 1, 72076 Tübingen, Germany
- Uwe Ludewig
- Zentrum für Molekularbiologie der Pflanzen (ZMBP), Pflanzenphysiologie, Universität Tübingen, Auf der Morgenstelle 1, 72076 Tübingen, Germany
- To whom correspondence should be addressed (email )
|
125
|
Abstract
Homology modeling plays a central role in determining protein structure in the structural genomics project. The importance of homology modeling has been steadily increasing because of the large gap that exists between the overwhelming number of available protein sequences and experimentally solved protein structures, and also, more importantly, because of the increasing reliability and accuracy of the method. In fact, a protein sequence with over 30% identity to a known structure can often be predicted with an accuracy equivalent to a low-resolution X-ray structure. The recent advances in homology modeling, especially in detecting distant homologues, aligning sequences with template structures, modeling of loops and side chains, as well as detecting errors in a model, have contributed to reliable prediction of protein structure, which was not possible even several years ago. The ongoing efforts in solving protein structures, which can be time-consuming and often difficult, will continue to spur the development of a host of new computational methods that can fill in the gap and further contribute to understanding the relationship between protein structure and function.
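The 30% identity figure quoted in this abstract depends on how identity is counted. A minimal sketch follows, assuming a pairwise alignment is already available as two gapped strings and that identity is measured over positions where both sequences have a residue; other conventions divide by the length of the shorter sequence or by the full alignment length. The function name is hypothetical.

```python
def percent_identity(aligned_target, aligned_template, gap="-"):
    """Percent identity over positions aligned in both sequences.

    Assumption: the denominator is the number of columns in which
    neither sequence has a gap; comparison is case-insensitive.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have the same length")
    pairs = [
        (a.upper(), b.upper())
        for a, b in zip(aligned_target, aligned_template)
        if a != gap and b != gap
    ]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)
```

For the toy alignment `ACDE-FG` against `ACDQWFG`, six columns are aligned in both sequences and five of them match, so the function returns about 83.3.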
Affiliation(s)
- Zhexin Xiang
- Center for Molecular Modeling, Center for Information Technology, National Institutes of Health, Building 12A Room 2051, 12 South Drive, Bethesda, Maryland 20892-5624, USA.
|
126
|
Abstract
The Sixth Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP6) held in December 2004 focused on the prediction of the structures of 90 protein domains from 64 targets. Thirty-eight of these were classified as "fold recognition," defined as being similar in fold to proteins of known structure at the time of submission of the predictions. Only the "first" predictions and those longer than 20 amino acids for each domain were assessed, resulting in 4527 predictions from 165 groups. The assessment was accomplished by the use of six structure alignment programs and three scoring measures based on these alignments. The use of a variety of measures resulted in scoring insensitive to the peculiarities of any one alignment method. The top-ranked methods in the prediction of structures that were clearly homologous to proteins in the Protein Data Bank primarily used servers and other programs based on achieving a consensus of many remote homology detection and fold recognition methods. The top-ranked methods in prediction of structures less clearly related or unrelated to proteins of known structures used fragment building methods in addition to the fold recognition meta methods.
Affiliation(s)
- Guoli Wang
- Institute for Cancer Research, Fox Chase Cancer Center, Philadelphia, Pennsylvania 19111, USA
|
127
|
Ferraro DM, Ferraro DJ, Ramaswamy S, Robertson AD. Structures of ubiquitin insertion mutants support site-specific reflex response to insertions hypothesis. J Mol Biol 2006; 359:390-402. [PMID: 16647719 DOI: 10.1016/j.jmb.2006.03.047] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 12/02/2005] [Revised: 03/17/2006] [Accepted: 03/21/2006] [Indexed: 11/29/2022]
Abstract
We previously concluded that, judging from NMR chemical shifts, the effects of insertions into ubiquitin on its conformation appear to depend primarily on the site of insertion rather than the sequence of the insertion. To obtain a more complete and atomic-resolution understanding of how these insertions modulate the conformation of ubiquitin, we have solved the crystal structures of four insertional mutants of ubiquitin. Insertions between residues 9 and 10 of ubiquitin are minimally perturbing to the remainder of the protein, while larger alterations occur when the insertion is between residues 35 and 36. Further, the alterations in response to insertions are very similar for each mutant at a given site. Two insertions, one at each site, were designed from structurally homologous proteins. Interestingly, the secondary structure within these five to seven amino acid residue insertions is conserved in the new protein. Overall, the crystal structures support the previous conclusion that the conformational effects of these insertions are determined largely by the site of insertion and only secondarily by the sequence of the insert.
Affiliation(s)
- Debra M Ferraro
- Department of Biochemistry, Roy J. and Lucille A. Carver College of Medicine, University of Iowa, Iowa City, IA 52242, USA
|
128
|
Misura KMS, Chivian D, Rohl CA, Kim DE, Baker D. Physically realistic homology models built with ROSETTA can be more accurate than their templates. Proc Natl Acad Sci U S A 2006; 103:5361-6. [PMID: 16567638 PMCID: PMC1459360 DOI: 10.1073/pnas.0509355103] [Citation(s) in RCA: 136] [Impact Index Per Article: 7.6] [Received: 10/26/2005] [Indexed: 11/18/2022] Open
Abstract
We have developed a method that combines the ROSETTA de novo protein folding and refinement protocol with distance constraints derived from homologous structures to build homology models that are frequently more accurate than their templates. We test this method by building complete-chain models for a benchmark set of 22 proteins, each with 1 or 2 candidate templates, for a total of 39 test cases. We use structure-based and sequence-based alignments for each of the test cases. All atoms, including hydrogens, are represented explicitly. The resulting models contain approximately the same number of atomic overlaps as experimentally determined crystal structures and maintain good stereochemistry. The most accurate models can be identified by their energies, and in 22 of 39 cases a model that is more accurate than the template over aligned regions is one of the 10 lowest-energy models.
Affiliation(s)
- Kira M. S. Misura
- Department of Biochemistry, University of Washington, Box 357350, J-567 Health Sciences, Seattle, WA 98195-7350
- Dylan Chivian
- Department of Biochemistry, University of Washington, Box 357350, J-567 Health Sciences, Seattle, WA 98195-7350
- Carol A. Rohl
- Department of Biochemistry, University of Washington, Box 357350, J-567 Health Sciences, Seattle, WA 98195-7350
- David E. Kim
- Department of Biochemistry, University of Washington, Box 357350, J-567 Health Sciences, Seattle, WA 98195-7350
- David Baker
- Department of Biochemistry, University of Washington, Box 357350, J-567 Health Sciences, Seattle, WA 98195-7350
|
129
|
Buonocore F, Randelli E, Bird S, Secombes CJ, Costantini S, Facchiano A, Mazzini M, Scapigliati G. The CD8alpha from sea bass (Dicentrarchus labrax L.): Cloning, expression and 3D modelling. FISH & SHELLFISH IMMUNOLOGY 2006; 20:637-46. [PMID: 16230027 DOI: 10.1016/j.fsi.2005.08.006] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.6] [Received: 07/11/2005] [Revised: 08/10/2005] [Accepted: 08/29/2005] [Indexed: 05/04/2023]
Abstract
In this paper we describe the cloning, expression and structural study by modelling techniques of the CD8alpha from sea bass (Dicentrarchus labrax L.). The sea bass CD8alpha cDNA comprises 1490 bp and is translated in one reading frame to give a protein of 217 amino acids, with a predicted 26-amino-acid signal peptide, an 88 bp 5'-UTR and a 748 bp 3'-UTR. A multiple alignment of CD8alpha from sea bass with other known CD8alpha sequences shows the conservation of most amino acid residues involved in the characteristic structural domains found within CD8alpha molecules. Cysteine residues that are involved in disulfide bonding to form the V domain are conserved. In contrast, an extra cysteine residue found in most mammals in this region is not present in sea bass. The transmembrane and cytoplasmic regions are the most conserved regions within the molecule in the alignment analysis. However, the motif (CXCP) that is thought to be responsible for binding p56lck is missing in the sea bass sequence. Phylogenetic analysis conducted using amino acid sequences showed that sea bass CD8alpha grouped with other known teleost sequences and that three different clusters were formed by the mammalian, avian and fish CD8alpha sequences. The thymus was the tissue with the highest CD8alpha expression, followed by gut, gills, peripheral blood leukocytes and spleen. Lower CD8alpha mRNA levels were found in head kidney, liver and brain. It was possible to create a partial 3D model using the human and mouse structures as templates. The CD8alpha 11-120 amino acid region was taken into consideration and the best 3D model obtained shows the presence of ten beta-strands, involving about 50% of the sequence. The global structure was defined as an immunoglobulin-like beta-sandwich made of two anti-parallel sheets. Two cysteines were present in this region and they were at a suitable distance to form an S-S bond as seen in the template human and mouse structures.
Affiliation(s)
- Francesco Buonocore
- Dipartimento di Scienze Ambientali, University of Tuscia, Largo dell'Università s.n.c., I-01100 Viterbo, Italy.
|
130
|
Reddy CS, Vijayasarathy K, Srinivas E, Sastry GM, Sastry GN. Homology modeling of membrane proteins: A critical assessment. Comput Biol Chem 2006; 30:120-6. [PMID: 16540373 DOI: 10.1016/j.compbiolchem.2005.12.002] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.8] [Received: 08/27/2005] [Revised: 11/10/2005] [Accepted: 12/14/2005] [Indexed: 11/22/2022]
Abstract
Evaluation and validation of homology modeling protocols are indispensable for membrane proteins, as experimental determination of their three-dimensional structure is an arduous task. The predictive ability of Modeller, MOE, InsightII-Homology and Swiss-PdbViewer (SPV) with different sequence alignment methods (CLUSTALW, BLAST and 3D-JIGSAW) has been assessed. The sequence identity of the target and template was chosen to be in the range of 25-35%. Validation protocols to assess the structure, fold and stereochemical quality are employed by comparing with experimental structures. Two different ranking schemes are suggested to evaluate the performance of each methodology based on the validation scores. While no unambiguous preference for any given procedure emerged, statistically Modeller and the sequence alignment technique 3D-JIGSAW gave the best results amongst the chosen protocols. The present study helps in selecting the right protocols when modeling membrane proteins, which form a major class of drug targets.
Affiliation(s)
- Ch Surendhar Reddy
- Molecular Modelling Group, Organic Chemical Sciences, Indian Institute of Chemical Technology, Tarnaka, Hyderabad 500007, India
|
131
|
Nayeem A, Sitkoff D, Krystek S. A comparative study of available software for high-accuracy homology modeling: from sequence alignments to structural models. Protein Sci 2006; 15:808-24. [PMID: 16600967 PMCID: PMC2242473 DOI: 10.1110/ps.051892906] [Citation(s) in RCA: 90] [Impact Index Per Article: 5.0] [Received: 10/10/2005] [Revised: 01/09/2006] [Accepted: 01/12/2006] [Indexed: 10/24/2022]
Abstract
An open question in protein homology modeling is, how well do current modeling packages satisfy the dual criteria of quality of results and practical ease of use? To address this question objectively, we examined homology-built models of a variety of therapeutically relevant proteins. The sequence identities across these proteins range from 19% to 76%. A novel metric, the difference alignment index (DAI), is developed to aid in quantifying the quality of local sequence alignments. The DAI is also used to construct the relative sequence alignment (RSA), a new representation of global sequence alignment that facilitates comparison of sequence alignments from different methods. Comparisons of the sequence alignments in terms of the RSA and alignment methodologies are made to better understand the advantages and caveats of each method. All sequence alignments and corresponding 3D models are compared to their respective structure-based alignments and crystal structures. A variety of protein modeling software was used. We find that at sequence identities >40%, all packages give similar (and satisfactory) results; at lower sequence identities (<25%), the sequence alignments generated by Profit and Prime, which incorporate structural information in their sequence alignment, stand out from the rest. Moreover, the model generated by Prime in this low sequence identity region is noted to be superior to the rest. Additionally, we note that DSModeler and MOE, which generate reasonable models for sequence identities >25%, are significantly more functional and easier to use when compared with the other structure-building software.
Affiliation(s)
- Akbar Nayeem
- Computer-Assisted Drug Design, Pharmaceutical Research Institute, Bristol-Myers Squibb, Princeton, New Jersey 08543, USA.
|
132
|
Topf M, Baker ML, Marti-Renom MA, Chiu W, Sali A. Refinement of Protein Structures by Iterative Comparative Modeling and CryoEM Density Fitting. J Mol Biol 2006; 357:1655-68. [PMID: 16490207 DOI: 10.1016/j.jmb.2006.01.062] [Citation(s) in RCA: 84] [Impact Index Per Article: 4.7] [Received: 10/13/2005] [Revised: 01/06/2006] [Accepted: 01/17/2006] [Indexed: 11/21/2022]
Abstract
We developed a method for structure characterization of assembly components by iterative comparative protein structure modeling and fitting into cryo-electron microscopy (cryoEM) density maps. Specifically, we calculate a comparative model of a given component by considering many alternative alignments between the target sequence and a related template structure while optimizing the fit of a model into the corresponding density map. The method relies on the previously developed Moulder protocol that iterates over alignment, model building, and model assessment. The protocol was benchmarked using 20 varied target-template pairs of known structures with less than 30% sequence identity and corresponding simulated density maps at resolutions from 5 Å to 25 Å. Relative to the models based on the best existing sequence profile alignment methods, the percentage of Cα atoms that are within 5 Å of the corresponding Cα atoms in the superposed native structure increases on average from 52% to 66%, which is half-way between the starting models and the models from the best possible alignments (82%). The test also reveals that despite the improvements in the accuracy of the fitness function, this function is still the bottleneck in reducing the remaining errors. To demonstrate the usefulness of the protocol, we applied it to the upper domain of the P8 capsid protein of rice dwarf virus that has been studied by cryoEM at 6.8 Å. The Cα root-mean-square deviation of the model based on the remotely related template, bluetongue virus VP7, improved from 8.7 Å to 6.0 Å, while the best possible model has a Cα RMSD value of 5.3 Å. Moreover, the resulting model fits better into the cryoEM density map than the initial template structure. The method is being implemented in our program MODELLER for protein structure modeling by satisfaction of spatial restraints and will be applicable to the rapidly increasing number of cryoEM density maps of macromolecular assemblies.
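The two accuracy measures quoted in this abstract, the percentage of Cα atoms within 5 Å of their native counterparts and the Cα RMSD, can be sketched as follows. This is an illustration only, not the authors' code or a MODELLER call: the function name is hypothetical, the coordinates are assumed to be paired residue by residue and already superposed, and "within" is assumed to include the cutoff itself.

```python
import math


def ca_deviation_stats(model_ca, native_ca, cutoff=5.0):
    """Return (rmsd, percent_within_cutoff) for paired C-alpha coordinates.

    Assumptions: both inputs are equal-length sequences of (x, y, z)
    tuples in angstroms, already superposed; no fitting is done here.
    """
    if len(model_ca) != len(native_ca):
        raise ValueError("model and native must have the same number of atoms")
    if not model_ca:
        raise ValueError("at least one atom pair is required")
    # Per-residue distance between each model atom and its native partner
    distances = [math.dist(m, n) for m, n in zip(model_ca, native_ca)]
    rmsd = math.sqrt(sum(d * d for d in distances) / len(distances))
    within = sum(1 for d in distances if d <= cutoff)
    return rmsd, 100.0 * within / len(distances)
```

With four atom pairs displaced by 0, 3, 8 and 1 Å, three of the four fall within the 5 Å cutoff (75%), and the RMSD is the square root of 18.5, roughly 4.3 Å. The example shows how a single large deviation raises the RMSD while leaving the cutoff-based percentage comparatively high.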
Affiliation(s)
- Maya Topf
- Department of Biopharmaceutical Sciences, California Institute for Quantitative Biomedical Research, QB3, 1700 4th Street, Suite 503B, University of California at San Francisco, San Francisco, CA 94143-2552, USA
|
133
|
Wallner B, Elofsson A. Identification of correct regions in protein models using structural, alignment, and consensus information. Protein Sci 2006; 15:900-13. [PMID: 16522791 PMCID: PMC2242478 DOI: 10.1110/ps.051799606] [Citation(s) in RCA: 122] [Impact Index Per Article: 6.8] [Indexed: 10/24/2022]
Abstract
In this study we present two methods to predict the local quality of a protein model: ProQres and ProQprof. ProQres is based on structural features that can be calculated from a model, while ProQprof uses alignment information and can only be used if the model is created from an alignment. In addition, we also propose a simple approach based on local consensus, Pcons-local. We show that all these methods perform better than state-of-the-art methodologies and that, when applicable, the consensus approach is by far the best approach to predict local structure quality. It was also found that ProQprof performed better than other methods for models based on distant relationships, while ProQres performed best for models based on closer relationship, i.e., a model has to be reasonably good to make a structural evaluation useful. Finally, we show that a combination of ProQprof and ProQres (ProQlocal) performed better than any other nonconsensus method for both high- and low-quality models. Additional information and Web servers are available at: http://www.sbc.su.se/~bjorn/ProQ/.
Affiliation(s)
- Björn Wallner
- Stockholm Bioinformatics Center, Stockholm University, SE-106 91 Stockholm, Sweden.
|
134
|
Pieper U, Eswar N, Davis FP, Braberg H, Madhusudhan MS, Rossi A, Marti-Renom M, Karchin R, Webb BM, Eramian D, Shen MY, Kelly L, Melo F, Sali A. MODBASE: a database of annotated comparative protein structure models and associated resources. Nucleic Acids Res 2006; 34:D291-5. [PMID: 16381869 PMCID: PMC1347422 DOI: 10.1093/nar/gkj059] [Citation(s) in RCA: 202] [Impact Index Per Article: 11.2] [Indexed: 11/24/2022] Open
Abstract
MODBASE is a database of annotated comparative protein structure models for all available protein sequences that can be matched to at least one known protein structure. The models are calculated by MODPIPE, an automated modeling pipeline that relies on MODELLER for fold assignment, sequence–structure alignment, model building and model assessment. MODBASE is updated regularly to reflect the growth in protein sequence and structure databases, and improvements in the software for calculating the models. MODBASE currently contains 3 094 524 reliable models for domains in 1 094 750 out of 1 817 889 unique protein sequences in the UniProt database (July 5, 2005); only models based on statistically significant alignments and models assessed to have the correct fold despite insignificant alignments are included. MODBASE also allows users to generate comparative models for proteins of interest with the automated modeling server MODWEB. Our other resources integrated with MODBASE include comprehensive databases of multiple protein structure alignments (DBAli), structurally defined ligand binding sites and structurally defined binary domain interfaces (PIBASE) as well as predictions of ligand binding sites, interactions between yeast proteins, and functional consequences of human nsSNPs (LS-SNP).
Affiliation(s)
- Ursula Pieper
- Department of Biopharmaceutical Sciences, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Department of Pharmaceutical Chemistry, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Narayanan Eswar
- Department of Biopharmaceutical Sciences, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Department of Pharmaceutical Chemistry, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Fred P. Davis
- Department of Biopharmaceutical Sciences, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Department of Pharmaceutical Chemistry, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Hannes Braberg
- Department of Biopharmaceutical Sciences, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Department of Pharmaceutical Chemistry, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- M. S. Madhusudhan
- Department of Biopharmaceutical Sciences, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Department of Pharmaceutical Chemistry, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Andrea Rossi
- Department of Biopharmaceutical Sciences, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Department of Pharmaceutical Chemistry, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Marc Marti-Renom
- Department of Biopharmaceutical Sciences, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Department of Pharmaceutical Chemistry, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Rachel Karchin
- Department of Biopharmaceutical Sciences, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Department of Pharmaceutical Chemistry, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Ben M. Webb
- Department of Biopharmaceutical Sciences, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Department of Pharmaceutical Chemistry, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- David Eramian
- Department of Biopharmaceutical Sciences, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Department of Pharmaceutical Chemistry, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Graduate Group in Biophysics, University of California, San Francisco, CA, USA
- Min-Yi Shen
- Department of Biopharmaceutical Sciences, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Department of Pharmaceutical Chemistry, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Libusha Kelly
- Department of Biopharmaceutical Sciences, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Department of Pharmaceutical Chemistry, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Graduate Group in Biological and Medical Informatics, University of California, San Francisco, CA, USA
- Francisco Melo
- Departamento de Genética Molecular y Microbiología, Facultad de Ciencias Biológicas, Pontificia Universidad Católica de Chile, Alameda 340, Santiago, Chile
- Andrej Sali
- Department of Biopharmaceutical Sciences, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- Department of Pharmaceutical Chemistry, California Institute for Quantitative Biomedical Research, QB3 at Mission Bay, Office 503B, University of California at San Francisco, 1700 4th Street, San Francisco, CA 94158, USA
- To whom correspondence should be addressed. Tel: +1 415 514 4227; Fax: +1 415 514 4231;
|
135
|
Abstract
In recent years, there has been significant progress in the ability to predict the three-dimensional structure of proteins from their amino acid sequence. Progress has been due to new methods to extract the growing amount of information in sequence and structure databases and improved computational descriptions of protein energetics. This review summarizes recent advances in these areas and describes a number of novel biological applications made possible by structure prediction. Despite remaining challenges, protein structure prediction is becoming an extremely useful tool in understanding phenomena in modern molecular and cell biology.
Affiliation(s)
- Donald Petrey
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, Columbia University, New York, New York 10032, USA
|
136
|
Holmes JB, Tsai J. Characterizing conserved structural contacts by pair-wise relative contacts and relative packing groups. J Mol Biol 2005; 354:706-21. [PMID: 16269154 DOI: 10.1016/j.jmb.2005.09.081] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 05/31/2005] [Revised: 09/06/2005] [Accepted: 09/26/2005] [Indexed: 10/25/2022]
Abstract
To adequately deal with the inherent complexity of interactions between protein side-chains, we develop and describe here a novel method for characterizing protein packing within a fold family. Instead of approaching side-chain interactions absolutely from one residue to another, we instead consider the relative interactions of contacting residue pairs. The basic element, the pair-wise relative contact, is constructed from a sequence alignment and contact analysis of a set of structures and consists of a cluster of similarly oriented, interacting, side-chain pairs. To demonstrate this construct's usefulness in analyzing protein structure, we used the pair-wise relative contacts to analyze two sets of protein structures as defined by SCOP: the diverse globin-like superfamily (126 structures) and the more uniform heme binding globin family (a 94 structure subset of the globin-like superfamily). The superfamily structure set produced 1266 unique pair-wise relative contacts, whereas the family structure subset gave 1001 unique pair-wise relative contacts. For both sets, we show that these constructs can be used to accurately and automatically differentiate between fold classes. Furthermore, these pair-wise relative contacts correlate well with sequence identity and thus provide a direct relationship between changes in sequence and changes in structure. To capture the complexity of protein packing, these pair-wise relative contacts can be superimposed around a single residue to create a multi-body construct called a relative packing group. Construction of convex hulls around the individual packing groups provides a measure of the variation in packing around a residue and defines an approximate volume of space occupied by the groups interacting with a residue. We find that these relative packing groups are useful in understanding the structural quality of sequence or structure alignments. Moreover, they provide context to calculate a value for structural randomness, which is important in properly assessing the quality of a structural alignment. The results of this study provide the framework for future analysis for correlating sequence changes to specific structure changes.
Affiliation(s)
- J Bradley Holmes
- Laboratory of Molecular Genetics NICHD-NIH, Bethesda, MD 20952, USA
|
137
|
Topf M, Sali A. Combining electron microscopy and comparative protein structure modeling. Curr Opin Struct Biol 2005; 15:578-85. [PMID: 16118050 DOI: 10.1016/j.sbi.2005.08.001] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.3] [Received: 05/24/2005] [Revised: 07/01/2005] [Accepted: 08/10/2005] [Indexed: 10/25/2022]
Abstract
Recently, advances have been made in methods and applications that integrate electron microscopy density maps and comparative modeling to produce atomic structures of macromolecular assemblies. Electron microscopy can benefit from comparative modeling through the fitting of comparative models into electron microscopy density maps. Also, comparative modeling can benefit from electron microscopy through the use of intermediate-resolution density maps in fold recognition, template selection and sequence-structure alignment.
Affiliation(s)
- Maya Topf
- Department of Biopharmaceutical Sciences, University of California San Francisco, San Francisco, CA 94143, USA
|