1
Prediction of Protein Tertiary Structure via Regularized Template Classification Techniques. Molecules 2020; 25:2467. [PMID: 32466409 PMCID: PMC7321371 DOI: 10.3390/molecules25112467] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/23/2020] [Revised: 05/21/2020] [Accepted: 05/22/2020] [Indexed: 11/24/2022] Open
Abstract
We discuss the use of regularized linear discriminant analysis (LDA) as a model reduction technique combined with particle swarm optimization (PSO) in protein tertiary structure prediction, followed by structure refinement based on singular value decomposition (SVD) and PSO. The algorithm presented in this paper belongs to the category of template-based modeling. The algorithm performs a preselection of protein templates before constructing a lower dimensional subspace via a regularized LDA. The protein coordinates in the reduced space are sampled using a highly explorative optimization algorithm, regressive–regressive PSO (RR-PSO). The obtained structure is then projected onto a reduced space via singular value decomposition and further optimized via RR-PSO to carry out a structure refinement. The final structures are similar to those predicted by the best structure prediction tools, such as the Rosetta and Zhang servers. The main advantage of our methodology is that it alleviates the ill-posed character of protein structure prediction problems related to high dimensional optimization. It is also capable of sampling a wide range of conformational space due to the application of a regularized linear discriminant analysis, which allows us to expand the differences over a reduced basis set.
2
Álvarez Ó, Fernández-Martínez JL, Corbeanu AC, Fernández-Muñiz Z, Kloczkowski A. Predicting protein tertiary structure and its uncertainty analysis via particle swarm sampling. J Mol Model 2019; 25:79. [PMID: 30810816 PMCID: PMC7586042 DOI: 10.1007/s00894-019-3956-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 05/16/2018] [Accepted: 02/05/2019] [Indexed: 10/27/2022]
Abstract
We discuss the relationship between the problem of protein tertiary structure prediction from the amino acid sequence and the uncertainty analysis. The algorithm presented in this paper belongs to the category of decoy-based modeling, where different known protein models are used to establish a low dimensional space via principal component analysis. The low dimensional space is utilized to perform an energy optimization via a family of very explorative particle swarm optimizers to find the global minimum. The aim of this procedure is to get a representative sample of the nonlinear equivalent region, that is, protein models that have their energy lower than a certain energy bound. The posterior analysis of this family provides very valuable information about the backbone structure of the native conformation and its possible alternate states. This methodology has the advantage of being simple and fast and can help refine the tertiary protein structure. We comprehensively illustrate the performance of our algorithm on one protein from the CASP-9 protein structure prediction experiment. We also provide a theoretical analysis of the energy landscape found in the tertiary structure protein inverse problem, explaining why model reduction techniques (principal component analysis in this case) serve to alleviate the ill-posed character of this high dimensional optimization problem. In addition, we expand the computational benchmark with a summary of other CASP-9 proteins in the Appendix.
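The energy optimization in a low dimensional space described in this abstract can be illustrated with a minimal sketch of a standard global-best particle swarm optimizer. This is a generic PSO, not the explorative PSO family used by the authors. The names `pso_minimize` and `toy_energy`, the parameter values `w`, `c1`, `c2`, and the quadratic toy objective standing in for a protein energy function are all assumptions made for illustration.

```python
import random


def pso_minimize(energy, dim, n_particles=30, n_iters=200,
                 bounds=(-5.0, 5.0), w=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic global-best PSO (illustrative, not the authors' RR-PSO)."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions in the reduced space, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [energy(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = energy(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_energy(a):
    """Hypothetical stand-in for an energy function of reduced coordinates."""
    return sum((x - 1.0) ** 2 for x in a)


best, best_val = pso_minimize(toy_energy, dim=3)
```

In the setting of the abstract, `energy` would evaluate a protein energy function on the structure reconstructed from the reduced coordinates, and the sampled particles with energy below a chosen bound would form the ensemble used for uncertainty analysis.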
Affiliation(s)
- Óscar Álvarez
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007 Oviedo, Spain
- Juan Luis Fernández-Martínez
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007 Oviedo, Spain
- Ana Cernea Corbeanu
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007 Oviedo, Spain
- Zulima Fernández-Muñiz
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007 Oviedo, Spain
- Andrzej Kloczkowski
  - Battelle Center for Mathematical Medicine, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University, Columbus, OH, USA
3
Cheung NJ, Yu W. De novo protein structure prediction using ultra-fast molecular dynamics simulation. PLoS One 2018; 13:e0205819. [PMID: 30458007 PMCID: PMC6245515 DOI: 10.1371/journal.pone.0205819] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 06/02/2018] [Accepted: 10/02/2018] [Indexed: 11/19/2022] Open
Abstract
Modern genomic sequencing techniques have provided a massive number of protein sequences, but experimental determination of protein structures lags far behind this vast and unexplored sequence space. Computational biology is therefore playing a more important role in protein structure prediction than ever. Here, we present a de novo predictor, termed NiDelta, built on a deep convolutional neural network and a statistical potential enabling molecular dynamics simulation for modeling protein tertiary structure. Combined with evolution-based residue contacts, the predictor can predict the tertiary structures of a number of target proteins with remarkable accuracy. The proposed approach is demonstrated by calculations on a set of eighteen large proteins from different fold classes. The results show that the ultra-fast molecular dynamics simulation could dramatically reduce the gap between the sequence and its structure at the atomic level, and that it could also achieve high efficiency in protein structure determination if sparse experimental data are available.
Affiliation(s)
- Ngaam J. Cheung
  - Department of Brain and Cognitive Science, DGIST, Daegu, South Korea
  - Cavendish Laboratory, Department of Physics, University of Cambridge, Cambridge, United Kingdom
- Wookyung Yu
  - Department of Brain and Cognitive Science, DGIST, Daegu, South Korea
  - Core Protein Resources Center, DGIST, Daegu, South Korea
4
Álvarez Ó, Fernández-Martínez JL, Fernández-Brillet C, Cernea A, Fernández-Muñiz Z, Kloczkowski A. Principal component analysis in protein tertiary structure prediction. J Bioinform Comput Biol 2018; 16:1850005. [PMID: 29566640 DOI: 10.1142/s0219720018500051] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/18/2022]
Abstract
We discuss applicability of principal component analysis (PCA) for protein tertiary structure prediction from amino acid sequence. The algorithm presented in this paper belongs to the category of protein refinement models and involves establishing a low-dimensional space where the sampling (and optimization) is carried out via particle swarm optimizer (PSO). The reduced space is found via PCA performed for a set of low-energy protein models previously found using different optimization techniques. A high frequency term is added into this expansion by projecting the best decoy into the PCA basis set and calculating the residual model. This term is aimed at providing high frequency details in the energy optimization. The goal of this research is to analyze how the dimensionality reduction affects the prediction capability of the PSO procedure. For that purpose, different proteins from the Critical Assessment of Techniques for Protein Structure Prediction experiments were modeled. In all the cases, both the energy of the best decoy and the distance to the native structure have decreased. Our analysis also shows how the predicted backbone structure of native conformation and of alternative low energy states varies with respect to the PCA dimensionality. Generally speaking, the reconstruction can be successfully achieved with 10 principal components and the high frequency term. We also provide a computational analysis of protein energy landscape for the inverse problem of reconstructing structure from the reduced number of principal components, showing that the dimensionality reduction alleviates the ill-posed character of this high-dimensional energy optimization problem. The procedure explained in this paper is very fast and allows testing different PCA expansions. Our results show that PSO improves the energy of the best decoy used in the PCA when the adequate number of PCA terms is considered.
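One way to write the expansion described in this abstract is sketched below. The notation is an assumption, not taken from the paper: $\boldsymbol{\mu}$ is the mean of the low-energy models, $\mathbf{v}_k$ are the first $q$ principal components (the abstract suggests $q = 10$ is generally sufficient), $a_k$ are the reduced coordinates sampled by PSO, and $\mathbf{r}$ is the high frequency term, taken here as the residual of the best decoy $\mathbf{m}_{\text{best}}$ after projection onto the PCA basis.

```latex
\mathbf{m}(\mathbf{a}) = \boldsymbol{\mu} + \sum_{k=1}^{q} a_k \mathbf{v}_k + \mathbf{r},
\qquad
\mathbf{r} = \left(\mathbf{m}_{\text{best}} - \boldsymbol{\mu}\right)
  - \sum_{k=1}^{q} \left[\mathbf{v}_k^{\top}\left(\mathbf{m}_{\text{best}} - \boldsymbol{\mu}\right)\right] \mathbf{v}_k
```

Under this reading, setting $a_k = \mathbf{v}_k^{\top}(\mathbf{m}_{\text{best}} - \boldsymbol{\mu})$ recovers the best decoy exactly, so the optimization starts from a search space that already contains it.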
Affiliation(s)
- Óscar Álvarez
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007 Oviedo, Spain
- Juan Luis Fernández-Martínez
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007 Oviedo, Spain
- Celia Fernández-Brillet
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007 Oviedo, Spain
- Ana Cernea
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007 Oviedo, Spain
- Zulima Fernández-Muñiz
  - Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, C. Federico García Lorca, 18, 33007 Oviedo, Spain
- Andrzej Kloczkowski
  - Battelle Center for Mathematical Medicine, Nationwide Children's Hospital, Columbus, OH, USA
  - Department of Pediatrics, The Ohio State University, Columbus, OH, USA
5
Bastolla U. Computing protein dynamics from protein structure with elastic network models. Wiley Interdisciplinary Reviews: Computational Molecular Science 2014. [DOI: 10.1002/wcms.1186] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Indexed: 11/09/2022]
Affiliation(s)
- Ugo Bastolla
  - Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, Madrid, Spain
6
Custódio FL, Barbosa HJ, Dardenne LE. A multiple minima genetic algorithm for protein structure prediction. Appl Soft Comput 2014. [DOI: 10.1016/j.asoc.2013.10.029] [Citation(s) in RCA: 52] [Impact Index Per Article: 5.2] [Indexed: 12/17/2022]
7
Faraggi E, Kloczkowski A. A global machine learning based scoring function for protein structure prediction. Proteins 2013; 82:752-9. [DOI: 10.1002/prot.24454] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 08/05/2013] [Revised: 10/03/2013] [Accepted: 10/21/2013] [Indexed: 01/07/2023]
Affiliation(s)
- Eshel Faraggi
  - Department of Biochemistry and Molecular Biology, Indiana University School of Medicine, Indianapolis, Indiana 46202
  - Battelle Center for Mathematical Medicine, Nationwide Children's Hospital, Columbus, Ohio 43215
  - Physics Division, Research and Information Systems, LLC, Carmel, Indiana 46032
- Andrzej Kloczkowski
  - Battelle Center for Mathematical Medicine, Nationwide Children's Hospital, Columbus, Ohio 43215
  - Department of Pediatrics, The Ohio State University, Columbus, Ohio 43215
8
Vyas VK, Ukawala RD, Ghate M, Chintha C. Homology modeling a fast tool for drug discovery: current perspectives. Indian J Pharm Sci 2012. [PMID: 23204616 PMCID: PMC3507339 DOI: 10.4103/0250-474x.102537] [Citation(s) in RCA: 139] [Impact Index Per Article: 11.6] [Indexed: 11/22/2022] Open
Abstract
A major goal of structural biology involves the formation of protein-ligand complexes, in which the protein molecules act energetically in the course of binding. Understanding protein-ligand interactions is therefore very important for structure-based drug design. Lack of knowledge of 3D structures has hindered efforts to understand the binding specificities of ligands with proteins. With the increasing availability of modeling software and the growing number of known protein structures, homology modeling is rapidly becoming the method of choice for obtaining 3D coordinates of proteins. Homology modeling is a representation of the similarity of environmental residues at topologically corresponding positions in the reference proteins. In the absence of experimental data, model building on the basis of a known 3D structure of a homologous protein is at present the only reliable method to obtain structural information. Knowledge of the 3D structures of proteins provides invaluable insights into the molecular basis of their functions. Recent advances in homology modeling, particularly in detecting and aligning sequences with template structures, detecting distant homologues, modeling loops and side chains, and detecting errors in a model, have contributed to consistent prediction of protein structure, which was not possible even several years ago. This review focuses on the features and role of homology modeling in predicting protein structure and describes current developments in this field, with successful applications at the different stages of drug design and discovery.
Affiliation(s)
- V K Vyas
  - Department of Pharmaceutical Chemistry, Institute of Pharmacy, Nirma University, Ahmedabad-382 481, India
9
Abstract
Functional characterization of proteins, one of the major issues in molecular biology, remains unsolved owing to several resource and technical limitations of experimental structure determination methods. A suitable methodology for accurate prediction of protein conformations from sequence alone is therefore emerging as the primary modeling goal of researchers today. The global blind protein structure prediction summit, the Critical Assessment of Structure Prediction (CASP), critically assesses modeling methodologies to track the progress of algorithm development. However, success is still impeded by inadequate modeling methodologies and several key technical gaps. There is still a great need to focus on some key issues, major though often considered trivial, in the upcoming CASP in order to pave the way toward a consistently accurate prediction methodology in the near future. Major problems that cause predicted models to diverge from their actual native states are thus highlighted, along with suggestions for more stringent and reliable assessment considerations in the CASP test.
Affiliation(s)
- Ashish Runthala
  - Biological Sciences, Faculty Division III, Birla Institute of Technology & Science, Pilani, Rajasthan, India
10
Mullins JGL. Structural modelling pipelines in next generation sequencing projects. Advances in Protein Chemistry and Structural Biology 2012; 89:117-67. [PMID: 23046884 DOI: 10.1016/b978-0-12-394287-6.00005-7] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Indexed: 02/06/2023]
Abstract
Our capacity to reliably predict protein structure from sequence is steadily improving due to the increased numbers and better targeting of protein structures being experimentally determined by structural genomics projects, along with the development of better modeling methodologies. Template-based (homology) modeling and de novo modeling methods are being combined to fill in remaining gaps in template coverage, and powerful automated structural modeling pipelines are being applied to large data sets of protein sequences. The improved quality of 3D models of proteins has led to their routine use in assessing the functional impact of nonsynonymous single nucleotide polymorphisms (nsSNPs) in specific protein systems, with the development of approaches that may be applied in a predictive fashion to nsSNPs emerging from next-generation sequencing projects. The challenges encountered in deriving functionally meaningful deductions from structural modeling can be quite different for proteins of different protein functional classes. The specific challenges to the assessment of the structural and functional impact of nsSNPs in globular proteins such as binding and regulatory proteins, structural proteins, and enzymes are discussed, as well as membrane transport proteins and ion channels. The mapping of reliable predictions of the structural and functional impact of SNPs, generated from automated modeling pipelines, on to protein-protein interaction networks will facilitate new approaches to understanding complex polygenic disorders and predisposition to disease.
Affiliation(s)
- Jonathan G L Mullins
  - Genome and Structural Bioinformatics, Institute of Life Science, College of Medicine, Swansea University, Singleton Park, Swansea, Wales, UK
11
MacCallum JL, Pérez A, Schnieders MJ, Hua L, Jacobson MP, Dill KA. Assessment of protein structure refinement in CASP9. Proteins 2011; 79 Suppl 10:74-90. [PMID: 22069034 DOI: 10.1002/prot.23131] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.0] [Received: 03/31/2011] [Revised: 06/15/2011] [Accepted: 07/03/2011] [Indexed: 11/06/2022]
Abstract
We assess performance in the structure refinement category in CASP9. Two years after CASP8, the performance of the best groups has not improved. There are few groups that improve any of our assessment scores with statistical significance. Some predictors, however, are able to consistently improve the physicality of the models. Although we cannot identify any clear bottleneck in improving refinement, several points arise: (1) The refinement portion of CASP has too few targets to make many statistically meaningful conclusions. (2) Predictors are usually very conservative, limiting the possibility of large improvements in models. (3) No group is actually able to correctly rank their five submissions, indicating that potentially better models may be discarded. (4) Different sampling strategies work better for different refinement problems; there is no single strategy that works on all targets. In general, conservative strategies do better, while the greatest improvements come from more adventurous sampling, at the cost of consistency. Comparison with experimental data reveals aspects not captured by comparison to a single structure. In particular, we show that improvement in backbone geometry does not always mean better agreement with experimental data. Finally, we demonstrate that even given the current challenges facing refinement, the refined models are useful for solving the crystallographic phase problem through molecular replacement.
Affiliation(s)
- Justin L MacCallum
  - Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, New York, USA
12
Liang S, Zhou Y, Grishin N, Standley DM. Protein side chain modeling with orientation-dependent atomic force fields derived by series expansions. J Comput Chem 2011; 32:1680-6. [PMID: 21374632 PMCID: PMC3072444 DOI: 10.1002/jcc.21747] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Received: 10/28/2010] [Revised: 12/10/2010] [Accepted: 12/11/2010] [Indexed: 11/09/2022]
Abstract
We describe the development of new force fields for protein side chain modeling called optimized side chain atomic energy (OSCAR). The distance-dependent energy functions (OSCAR-d) and side-chain dihedral angle potential energy functions were represented as power and Fourier series, respectively. The resulting 802 adjustable parameters were optimized by discriminating the native side chain conformations from non-native conformations, using a training set of 12,000 side chains for each residue type. In the course of optimization, for every residue, its side chain was replaced by varying rotamers, whereas conformations for all other residues were kept as they appeared in the crystal structure. Then, the OSCAR-d were multiplied by an orientation-dependent function to yield OSCAR-o. A total of 1087 parameters of the orientation-dependent energy functions (OSCAR-o) were optimized by maximizing the energy gap between the native conformation and subrotamers calculated as low energy by OSCAR-d. When OSCAR-o with optimized parameters were used to model side chain conformations simultaneously for 218 recently released protein structures, the prediction accuracies were 88.8% for χ(1), 79.7% for χ(1+2), 1.24 Å overall root mean square deviation (RMSD), and 0.62 Å RMSD for core residues, respectively, compared with the next-best performing side-chain modeling program, which achieved 86.6% for χ(1), 75.7% for χ(1+2), 1.40 Å overall RMSD, and 0.86 Å RMSD for core residues, respectively. The continuous energy functions obtained in this study are suitable for gradient-based optimization techniques for protein structure refinement. A program with built-in OSCAR for protein side chain prediction is available for download at http://sysimm.ifrec.osaka-u.ac.jp/OSCAR/.
Affiliation(s)
- Shide Liang
  - Systems Immunology Lab, Immunology Frontier Research Center, Osaka University, Suita, Osaka 565-0871, Japan
13
Eyal E, Dutta A, Bahar I. Cooperative dynamics of proteins unraveled by network models. Wiley Interdisciplinary Reviews: Computational Molecular Science 2011; 1:426-439. [PMID: 32148561 DOI: 10.1002/wcms.44] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 12/21/2022]
Abstract
Recent years have seen a significant increase in the number of computational studies that adopted network models for investigating biomolecular systems dynamics and interactions. In particular, elastic network models have proven useful in elucidating the dynamics and allosteric signaling mechanisms of proteins and their complexes. Here we present an overview of two most widely used elastic network models, the Gaussian Network Model (GNM) and Anisotropic Network Model (ANM). We illustrate their use in (i) explaining the anisotropic response of proteins observed in external pulling experiments, (ii) identifying residues that possess high allosteric potentials, and demonstrating in this context the propensity of catalytic sites and metal-binding sites for enabling efficient signal transduction, and (iii) assisting in structure refinement, molecular replacement and comparative modeling of ligand-bound forms via efficient sampling of energetically favored conformers.
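The Gaussian Network Model mentioned in this abstract is built on a Kirchhoff (connectivity) matrix of residue contacts. A minimal sketch of that construction follows. The function name `kirchhoff_matrix`, the 7.0 Å cutoff, and the four-point straight chain used as input are illustrative assumptions, not values taken from the review.

```python
def kirchhoff_matrix(coords, cutoff=7.0):
    """Build a GNM Kirchhoff matrix from C-alpha coordinates.

    Off-diagonal entries are -1 for residue pairs within `cutoff` (in Å)
    and 0 otherwise; each diagonal entry is the residue's contact count.
    """
    n = len(coords)
    gamma = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d2 = sum((a - b) ** 2 for a, b in zip(coords[i], coords[j]))
            if d2 <= cutoff ** 2:
                gamma[i][j] = gamma[j][i] = -1
                gamma[i][i] += 1
                gamma[j][j] += 1
    return gamma


# Hypothetical toy chain: four C-alpha atoms spaced 3.8 Å apart on a line.
ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
gamma = kirchhoff_matrix(ca)
```

In GNM, the pseudo-inverse of this matrix yields the predicted residue fluctuations and cross-correlations, and its low-frequency eigenvectors give the global modes; that step needs a linear algebra library and is left out of the sketch.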
Affiliation(s)
- Eran Eyal
  - Department of Computational & Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
  - Cancer Research Institute, Sheba Medical Center, Ramat Gan, Israel
- Anindita Dutta
  - Department of Computational & Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Ivet Bahar
  - Department of Computational & Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
14
Davis FP. Proteome-wide prediction of overlapping small molecule and protein binding sites using structure. Molecular BioSystems 2010; 7:545-57. [PMID: 21103609 DOI: 10.1039/c0mb00200c] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Indexed: 11/21/2022]
Abstract
Small molecules that modulate protein-protein interactions are of great interest for chemical biology and therapeutics. Here I present a structure-based approach to predict 'bi-functional' sites able to bind both small molecule ligands and proteins, in proteins of unknown structure. First, I develop a homology-based annotation method that transfers binding sites of known three-dimensional structure onto protein sequences, predicting residues in ligand and protein binding sites with estimated true positive rates of 98% and 88%, respectively, at 1% false positive rates. Applying this method to the human proteome predicts 8463 proteins with bi-functional residues and correctly recovers the targets of known interaction modulators. Proteins with significantly (p < 0.01) more bi-functional residues than expected were found to be enriched in regulatory and depleted in metabolism functions. Finally, I demonstrate the utility of the method by describing examples of predicted overlap and evidence of their biological and therapeutic relevance. The results suggest that combining the structures of known binding sites with established fold detection algorithms can predict regions of protein-protein interfaces that are amenable to small molecule modulation. Open-source software and the results for several complete proteomes are available at http://pibase.janelia.org/homolobind.
Affiliation(s)
- Fred P Davis
  - Howard Hughes Medical Institute, Janelia Farm Research Campus, 19700 Helix Dr, Ashburn, VA 20147, USA
15
Meyer T, D'Abramo M, Hospital A, Rueda M, Ferrer-Costa C, Pérez A, Carrillo O, Camps J, Fenollosa C, Repchevsky D, Gelpí JL, Orozco M. MoDEL (Molecular Dynamics Extended Library): A Database of Atomistic Molecular Dynamics Trajectories. Structure 2010; 18:1399-409. [DOI: 10.1016/j.str.2010.07.013] [Citation(s) in RCA: 108] [Impact Index Per Article: 7.7] [Received: 01/23/2010] [Revised: 07/19/2010] [Accepted: 07/27/2010] [Indexed: 11/26/2022]
16
Bahar I, Lezon TR, Yang LW, Eyal E. Global dynamics of proteins: bridging between structure and function. Annu Rev Biophys 2010; 39:23-42. [PMID: 20192781 DOI: 10.1146/annurev.biophys.093008.131258] [Citation(s) in RCA: 446] [Impact Index Per Article: 31.9] [Indexed: 01/08/2023]
Abstract
Biomolecular systems possess unique, structure-encoded dynamic properties that underlie their biological functions. Recent studies indicate that these dynamic properties are determined to a large extent by the topology of native contacts. In recent years, elastic network models used in conjunction with normal mode analyses have proven to be useful for elucidating the collective dynamics intrinsically accessible under native state conditions, including in particular the global modes of motions that are robustly defined by the overall architecture. With increasing availability of structural data for well-studied proteins in different forms (liganded, complexed, or free), there is increasing evidence in support of the correspondence between functional changes in structures observed in experiments and the global motions predicted by these coarse-grained analyses. These observed correlations suggest that computational methods may be advantageously employed for assessing functional changes in structure and allosteric mechanisms intrinsically favored by the native fold.
Affiliation(s)
- Ivet Bahar
  - Department of Computational Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15213, USA
17
Blum B, Jordan MI, Baker D. Feature space resampling for protein conformational search. Proteins 2010; 78:1583-93. [PMID: 20131376 DOI: 10.1002/prot.22677] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/12/2022]
Abstract
De novo protein structure prediction requires location of the lowest energy state of the polypeptide chain among a vast set of possible conformations. Powerful approaches include conformational space annealing, in which search progressively focuses on the most promising regions of conformational space, and genetic algorithms, in which features of the best conformations thus far identified are recombined. We describe a new approach that combines the strengths of these two approaches. Protein conformations are projected onto a discrete feature space which includes backbone torsion angles, secondary structure, and beta pairings. For each of these there is one "native" value: the one found in the native structure. We begin with a large number of conformations generated in independent Monte Carlo structure prediction trajectories from Rosetta. Native values for each feature are predicted from the frequencies of feature value occurrences and the energy distribution in conformations containing them. A second round of structure prediction trajectories are then guided by the predicted native feature distributions. We show that native features can be predicted at much higher than background rates, and that using the predicted feature distributions improves structure prediction in a benchmark of 28 proteins. The advantages of our approach are that features from many different input structures can be combined simultaneously without producing atomic clashes or otherwise physically inviable models, and that the features being recombined have a relatively high chance of being correct.
Affiliation(s)
- Ben Blum
  - Department of Electrical Engineering and Computer Science, University of California, Berkeley, 94720, USA
18
Davis FP, Sali A. The overlap of small molecule and protein binding sites within families of protein structures. PLoS Comput Biol 2010; 6:e1000668. [PMID: 20140189 PMCID: PMC2816688 DOI: 10.1371/journal.pcbi.1000668] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Received: 09/14/2009] [Accepted: 12/31/2009] [Indexed: 02/03/2023] Open
Abstract
Protein–protein interactions are challenging targets for modulation by small molecules. Here, we propose an approach that harnesses the increasing structural coverage of protein complexes to identify small molecules that may target protein interactions. Specifically, we identify ligand and protein binding sites that overlap upon alignment of homologous proteins. Of the 2,619 protein structure families observed to bind proteins, 1,028 also bind small molecules (250–1000 Da), and 197 exhibit a statistically significant (p<0.01) overlap between ligand and protein binding positions. These “bi-functional positions”, which bind both ligands and proteins, are particularly enriched in tyrosine and tryptophan residues, similar to “energetic hotspots” described previously, and are significantly less conserved than mono-functional and solvent exposed positions. Homology transfer identifies ligands whose binding sites overlap at least 20% of the protein interface for 35% of domain–domain and 45% of domain–peptide mediated interactions. The analysis recovered known small-molecule modulators of protein interactions as well as predicted new interaction targets based on the sequence similarity of ligand binding sites. We illustrate the predictive utility of the method by suggesting structural mechanisms for the effects of sanglifehrin A on HIV virion production, bepridil on the cellular entry of anthrax edema factor, and fusicoccin on vertebrate developmental pathways. The results, available at http://pibase.janelia.org, represent a comprehensive collection of structurally characterized modulators of protein interactions, and suggest that homologous structures are a useful resource for the rational design of interaction modulators.
Proteins function through their interactions with other biological molecules, including other proteins. Oftentimes, these interactions underlie cellular processes that go awry in disease. Therefore, modulating these interactions with small molecules is an active area of research for new drugs to treat diseases and new chemical tools to dissect cellular interaction networks. However, targeting protein–protein interactions has proven to be more challenging than the typical drug targets found on individual proteins. Here, we present a computational approach that aims to help in this challenge by identifying regions of protein–protein interfaces that may be amenable to targeting by small molecules. Through a comprehensive analysis of all known protein structures, we identify closely related proteins that in one case bind a protein and in another case bind a small molecule. We find that a significant number of protein–protein interactions occur through surface regions that bind small molecules in related proteins. These “bi-functional” positions, which can bind both proteins and ligands, will serve as an additional piece of structural information that can aid experimentalists in developing small molecules that modulate protein interactions.
Affiliation(s)
- Fred P. Davis
- Howard Hughes Medical Institute, Janelia Farm Research Campus, Ashburn, Virginia, United States of America
- * E-mail: (FPD); (AS)
- Andrej Sali
- Department of Bioengineering and Therapeutic Sciences, Pharmaceutical Chemistry, and California Institute for Quantitative Biosciences, University of California, San Francisco, San Francisco, California, United States of America
- * E-mail: (FPD); (AS)
|
19
|
MacCallum JL, Hua L, Schnieders MJ, Pande VS, Jacobson MP, Dill KA. Assessment of the protein-structure refinement category in CASP8. Proteins 2010; 77 Suppl 9:66-80. [PMID: 19714776 DOI: 10.1002/prot.22538] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.6] [Indexed: 12/26/2022]
Abstract
Here, we summarize the assessment of protein structure refinement in CASP8. Twenty-four groups refined a total of 12 target proteins. Averaging over all groups and all proteins, there was no net improvement over the original starting models. However, there are now some individual research groups who consistently do improve protein structures relative to a starting model. We compare various measures of quality assessment, including (i) standard backbone-based methods, (ii) new methods from the Richardson group, and (iii) ensemble-based methods for comparing experimental structures, such as NMR NOE violations and the suitability of the predicted models to serve as templates for molecular replacement. On the whole, there is a general correlation among various measures. However, there are interesting differences. Sometimes a structure that is in better agreement with the experimental data is judged to be slightly worse by GDT-TS. This suggests that for comparing protein structures that are already quite close to the native, it may be preferable to use ensemble-based experimentally derived measures of quality, in addition to single-structure-based methods such as GDT-TS.
Affiliation(s)
- Justin L MacCallum
- Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, California 94158, USA
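The abstract above relies on GDT-TS as its single-structure quality measure. As a rough illustration only, and not the assessors' code, the following sketch computes a GDT-TS-like score from per-residue Cα distances under one fixed superposition, using the customary 1, 2, 4 and 8 Å cutoffs. The function name `gdt_ts` and the toy distances are assumptions made for this example; the full measure also searches over many superpositions, which is skipped here.

```python
import numpy as np

def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """GDT-TS-like score (0-100) from per-residue C-alpha distances in angstroms.

    Simplification: the distances are assumed to come from a single, already
    chosen superposition of model and reference structure.
    """
    d = np.asarray(distances, dtype=float)
    # Fraction of residues within each cutoff, averaged over the four cutoffs.
    fractions = [(d <= c).mean() for c in cutoffs]
    return 100.0 * float(np.mean(fractions))

# Hypothetical five-residue example: one residue inside each successive cutoff.
example_distances = [0.5, 1.5, 3.0, 6.0, 10.0]
example_score = gdt_ts(example_distances)
```

Because the score only counts residues inside fixed distance shells, two models that are both close to the native structure can receive nearly identical values, which may be one reason the abstract recommends adding experimentally derived measures in that regime.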
|
20
|
Abia D, Bastolla U, Chacón P, Fábrega C, Gago F, Morreale A, Tramontano A. In memoriam. Proteins 2010; 78:iii-viii. [DOI: 10.1002/prot.22660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|
21
|
Babor M, Kortemme T. Multi-constraint computational design suggests that native sequences of germline antibody H3 loops are nearly optimal for conformational flexibility. Proteins 2009; 75:846-58. [PMID: 19194863 DOI: 10.1002/prot.22293] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.3] [Indexed: 11/07/2022]
Abstract
The limited size of the germline antibody repertoire has to recognize a far larger number of potential antigens. The ability of a single antibody to bind multiple ligands due to conformational flexibility in the antigen-binding site can significantly enlarge the repertoire. Among the six complementarity determining regions (CDRs) that generally comprise the binding site, the CDR H3 loop is particularly variable. Computational protein design studies showed that predicted low energy sequences compatible with a given backbone structure often have considerable similarity to the corresponding native sequences of naturally occurring proteins, indicating that native protein sequences are close to optimal for their structures. Here, we take a step forward to determine whether conformational flexibility, believed to play a key functional role in germline antibodies, is also central in shaping their native sequence. In particular, we use a multi-constraint computational design strategy, along with the Rosetta scoring function, to propose that the native sequences of CDR H3 loops from germline antibodies are nearly optimal for conformational flexibility. Moreover, we find that antibody maturation may lead to sequences with a higher degree of optimization for a single conformation, while disfavoring sequences that are intrinsically flexible. In addition, this computational strategy allows us to predict mutations in the CDR H3 loop to stabilize the antigen-bound conformation, a computational mimic of affinity maturation, that may increase antigen binding affinity by preorganizing the antigen binding loop. In vivo affinity maturation data are consistent with our predictions. The method described here can be useful to design antibodies with higher selectivity and affinity by reducing conformational diversity.
Affiliation(s)
- Mariana Babor
- California Institute for Quantitative Biosciences, University of California San Francisco, San Francisco, California 94158-2330, USA
|
22
|
Velázquez-Muriel JA, Rueda M, Cuesta I, Pascual-Montano A, Orozco M, Carazo JM. Comparison of molecular dynamics and superfamily spaces of protein domain deformation. BMC Struct Biol 2009; 9:6. [PMID: 19220918 PMCID: PMC2666742 DOI: 10.1186/1472-6807-9-6] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Received: 10/27/2008] [Accepted: 02/17/2009] [Indexed: 11/10/2022]
Abstract
BACKGROUND The strong relationship between protein structure and flexibility, on the one hand, and biological protein function, on the other, is well known. Technically, protein flexibility exploration is an essential task in many applications, such as protein structure prediction and modeling. In this contribution we have compared two different approaches to explore the flexibility space of protein domains: i) molecular dynamics (MD-space), and ii) the study of the structural changes within superfamily (SF-space). RESULTS Our analysis indicates that the MD-space and the SF-space display a significant overlap, but are still different enough to be considered as complementary. The SF-space is wider but less complex than the MD-space, irrespective of the number of members in the superfamily. Also, the SF-space does not sample all possibilities offered by the MD-space, but often introduces very large changes along just a few deformation modes, whose number tends to a plateau as the number of related folds in the superfamily increases. CONCLUSION Theoretically, we obtained two conclusions. First, that function restricts the access to some flexibility patterns to evolution, as we observe that when a superfamily member changes to become another, the path does not completely overlap with the physical deformability. Second, that conformational changes from variation in a superfamily are larger and much simpler than those allowed by physical deformability. Methodologically, the conclusion is that both spaces studied are complementary, and have different size and complexity. We expect this fact to have application in fields such as 3D-EM/X-ray hybrid models or ab initio protein folding.
|
23
|
Brown WM, Martin S, Pollock SN, Coutsias EA, Watson JP. Algorithmic dimensionality reduction for molecular structure analysis. J Chem Phys 2008; 129:064118. [PMID: 18715062 DOI: 10.1063/1.2968610] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.6] [Indexed: 11/14/2022] Open
Abstract
Dimensionality reduction approaches have been used to exploit the redundancy in a Cartesian coordinate representation of molecular motion by producing low-dimensional representations of molecular motion. This has been used to help visualize complex energy landscapes, to extend the time scales of simulation, and to improve the efficiency of optimization. Until recently, linear approaches for dimensionality reduction have been employed. Here, we investigate the efficacy of several automated algorithms for nonlinear dimensionality reduction for representation of trans, trans-1,2,4-trifluorocyclo-octane conformation--a molecule whose structure can be described on a 2-manifold in a Cartesian coordinate phase space. We describe an efficient approach for a deterministic enumeration of ring conformations. We demonstrate a drastic improvement in dimensionality reduction with the use of nonlinear methods. We discuss the use of dimensionality reduction algorithms for estimating intrinsic dimensionality and the relationship to the Whitney embedding theorem. Additionally, we investigate the influence of the choice of high-dimensional encoding on the reduction. We show for the case studied that, in terms of reconstruction error root mean square deviation, Cartesian coordinate representations and encodings based on interatom distances provide better performance than encodings based on a dihedral angle representation.
Affiliation(s)
- W Michael Brown
- Discrete Mathematics and Complex Systems, Sandia National Laboratories, Albuquerque, New Mexico 87185-1316, USA.
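The abstract above compares encodings by reconstruction-error RMSD after dimensionality reduction. The sketch below shows that measurement for the linear baseline only (principal component analysis via SVD), not for the nonlinear methods the paper studies. The function name `pca_reconstruction_rmsd` and the synthetic data are assumptions for illustration; each row is taken to be one conformation flattened to 3N Cartesian coordinates.

```python
import numpy as np

def pca_reconstruction_rmsd(X, k):
    """Per-conformation RMSD after projecting onto the top-k principal components.

    X has shape (n_conformations, 3 * n_atoms); rows are flattened coordinates.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    Xc = X - mean                                   # centre the ensemble
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    basis = Vt[:k]                                  # top-k principal directions
    X_rec = (Xc @ basis.T) @ basis + mean           # project, then reconstruct
    n_atoms = X.shape[1] // 3
    return np.sqrt(((X - X_rec) ** 2).sum(axis=1) / n_atoms)

# Synthetic ensemble (assumption): 50 "conformations" of 10 atoms that truly
# vary along only two collective directions.
rng = np.random.default_rng(0)
coords = rng.normal(size=(50, 2)) @ rng.normal(size=(2, 30))
rmsd_k1 = pca_reconstruction_rmsd(coords, 1)
rmsd_k2 = pca_reconstruction_rmsd(coords, 2)
```

On this toy ensemble two components reconstruct the data essentially exactly, while one does not. For a molecule whose conformations lie on a curved 2-manifold, a linear basis would be expected to need more than two components, which is the gap nonlinear methods aim to close.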
|
24
|
Randall A, Baldi P. SELECTpro: effective protein model selection using a structure-based energy function resistant to BLUNDERs. BMC Struct Biol 2008; 8:52. [PMID: 19055744 PMCID: PMC2667183 DOI: 10.1186/1472-6807-8-52] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Received: 06/26/2008] [Accepted: 12/03/2008] [Indexed: 11/10/2022]
Abstract
Background Protein tertiary structure prediction is a fundamental problem in computational biology and identifying the most native-like model from a set of predicted models is a key sub-problem. Consensus methods work well when the redundant models in the set are the most native-like, but fail when the most native-like model is unique. In contrast, structure-based methods score models independently and can be applied to model sets of any size and redundancy level. Additionally, structure-based methods have a variety of important applications including analogous fold recognition, refinement of sequence-structure alignments, and de novo prediction. The purpose of this work was to develop a structure-based model selection method based on predicted structural features that could be applied successfully to any set of models. Results Here we introduce SELECTpro, a novel structure-based model selection method derived from an energy function comprising physical, statistical, and predicted structural terms. Novel and unique energy terms include predicted secondary structure, predicted solvent accessibility, predicted contact map, β-strand pairing, and side-chain hydrogen bonding. SELECTpro participated in the new model quality assessment (QA) category in CASP7, submitting predictions for all 95 targets and achieved top results. The average difference in GDT-TS between models ranked first by SELECTpro and the most native-like model was 5.07. This GDT-TS difference was less than 1% of the GDT-TS of the most native-like model for 18 targets, and less than 10% for 66 targets. SELECTpro also ranked the single most native-like first for 15 targets, in the top five for 39 targets, and in the top ten for 53 targets, more often than any other method. Because the ranking metric is skewed by model redundancy and ignores poor models with a better ranking than the most native-like model, the BLUNDER metric is introduced to overcome these limitations. 
SELECTpro is also evaluated on a recent benchmark set of 16 small proteins with large decoy sets of 12500 to 20000 models for each protein, where it outperforms the benchmarked method (I-TASSER). Conclusion SELECTpro is an effective model selection method that scores models independently and is appropriate for use on any model set. SELECTpro is available for download as a stand alone application at: . SELECTpro is also available as a public server at the same site.
Affiliation(s)
- Arlo Randall
- School of Information and Computer Sciences, University of California, Irvine, CA 92697, USA.
|
25
|
Sellers BD, Zhu K, Zhao S, Friesner RA, Jacobson MP. Toward better refinement of comparative models: predicting loops in inexact environments. Proteins 2008; 72:959-71. [PMID: 18300241 DOI: 10.1002/prot.21990] [Citation(s) in RCA: 75] [Impact Index Per Article: 4.7] [Indexed: 11/12/2022]
Abstract
Achieving atomic-level accuracy in comparative protein models is limited by our ability to refine the initial, homolog-derived model closer to the native state. Despite considerable effort, progress in developing a generalized refinement method has been limited. In contrast, methods have been described that can accurately reconstruct loop conformations in native protein structures. We hypothesize that loop refinement in homology models is much more difficult than loop reconstruction in crystal structures, in part, because side-chain, backbone, and other structural inaccuracies surrounding the loop create a challenging sampling problem; the loop cannot be refined without simultaneously refining adjacent portions. In this work, we single out one sampling issue in an artificial but useful test set and examine how loop refinement accuracy is affected by errors in surrounding side-chains. In 80 high-resolution crystal structures, we first perturbed 6-12 residue loops away from the crystal conformation, and placed all protein side chains in non-native but low energy conformations. Even these relatively small perturbations in the surroundings made the loop prediction problem much more challenging. Using a previously published loop prediction method, median backbone (N-Cα-C-O) RMSD's for groups of 6, 8, 10, and 12 residue loops are 0.3/0.6/0.4/0.6 Å, respectively, on native structures and increase to 1.1/2.2/1.5/2.3 Å on the perturbed cases. We then augmented our previous loop prediction method to simultaneously optimize the rotamer states of side chains surrounding the loop. Our results show that this augmented loop prediction method can recover the native state in many perturbed structures where the previous method failed; the median RMSD's for the 6, 8, 10, and 12 residue perturbed loops improve to 0.4/0.8/1.1/1.2 Å.
Finally, we highlight three comparative models from blind tests, in which our new method predicted loops closer to the native conformation than first modeled using the homolog template, a task generally understood to be difficult. Although many challenges remain in refining full comparative models to high accuracy, this work offers a methodical step toward that goal.
Affiliation(s)
- Benjamin D Sellers
- Graduate Group in Biophysics, University of California, San Francisco, California 94158-2517, USA
|
26
|
Alternating evolutionary pressure in a genetic algorithm facilitates protein model selection. BMC Struct Biol 2008; 8:34. [PMID: 18673557 PMCID: PMC2527322 DOI: 10.1186/1472-6807-8-34] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 06/04/2008] [Accepted: 08/01/2008] [Indexed: 11/12/2022]
Abstract
Background Automatic protein modelling pipelines are becoming ever more accurate; this has come hand in hand with an increasingly complicated interplay between all components involved. Nevertheless, there are still potential improvements to be made in template selection, refinement and protein model selection. Results In the context of an automatic modelling pipeline, we analysed each step separately, revealing several non-intuitive trends and explored a new strategy for protein conformation sampling using Genetic Algorithms (GA). We apply the concept of alternating evolutionary pressure (AEP), i.e. intermediate rounds within the GA runs where unrestrained, linear growth of the model populations is allowed. Conclusion This approach improves the overall performance of the GA by allowing models to overcome local energy barriers. AEP enabled the selection of the best models in 40% of all targets; compared to 25% for a normal GA.
|
27
|
Chakravarty S, Godbole S, Zhang B, Berger S, Sanchez R. Systematic analysis of the effect of multiple templates on the accuracy of comparative models of protein structure. BMC Struct Biol 2008; 8:31. [PMID: 18631402 PMCID: PMC2483983 DOI: 10.1186/1472-6807-8-31] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 03/06/2007] [Accepted: 07/16/2008] [Indexed: 11/10/2022]
Abstract
Background Although multiple templates are frequently used in comparative modeling, the effect of inclusion of additional template(s) on model accuracy (when compared to that of corresponding single-template based models) is not clear. To address this, we systematically analyze two-template models, the simplest case of multiple-template modeling. For an existing target-template pair (single-template modeling), a two-template based model of the target sequence is constructed by including an additional template without changing the original alignment to measure the effect of the second template on model accuracy. Results Even though in a large number of cases a two-template model showed higher accuracy than the corresponding one-template model, over the entire dataset only a marginal improvement was observed on average, as there were many cases where no change or the reverse change was observed. The increase in accuracy due to the structural complementarity of the templates increases at higher alignment accuracies. The combination of templates showing the highest potential for improvement is that where both templates share similar and low (less than 30%) sequence identity with the target, as well as low sequence identity with each other. The structural similarity between the templates also helps in identifying template combinations having a higher chance of resulting in an improved model. Conclusion Inclusion of additional template(s) does not necessarily improve model quality, but there are distinct combinations of the two templates, which can be selected a priori, that tend to show improvement in model quality over the single template model. The benefit derived from the structural complementarity is dependent on the accuracy of the modeling alignment. 
The study helps to explain the observation that a careful selection of templates together with an accurate target:template alignment are necessary to benefit from using multiple templates in comparative modeling and provides guidelines to maximize the benefit from using multiple templates. This enables formulation of simple template selection rules to rank targets of a protein family in the context of structural genomics.
Affiliation(s)
- Suvobrata Chakravarty
- Department of Structural and Chemical Biology, Mount Sinai School of Medicine, 1425 Madison Avenue, New York, NY 10029, USA.
|
28
|
Larsson P, Wallner B, Lindahl E, Elofsson A. Using multiple templates to improve quality of homology models in automated homology modeling. Protein Sci 2008; 17:990-1002. [PMID: 18441233 DOI: 10.1110/ps.073344908] [Citation(s) in RCA: 109] [Impact Index Per Article: 6.8] [Indexed: 10/22/2022]
Abstract
When researchers build high-quality models of protein structure from sequence homology, it is today common to use several alternative target-template alignments. Several methods can, at least in theory, utilize information from multiple templates, and many examples of improved model quality have been reported. However, to our knowledge, thus far no study has shown that automatic inclusion of multiple alignments is guaranteed to improve models without artifacts. Here, we have carried out a systematic investigation of the potential of multiple templates to improve homology model quality. We have used test sets consisting of targets from both recent CASP experiments and a larger reference set. In addition to Modeller and Nest, a new method (Pfrag) for multiple template-based modeling is used, based on the segment-matching algorithm from Levitt's SegMod program. Our results show that all programs can produce multi-template models better than any of the single-template models, but a large part of the improvement is simply due to extension of the models. Most of the remaining improved cases were produced by Modeller. The most important factor is the existence of high-quality single-sequence input alignments. Because of the existence of models that are worse than any of the top single-template models, the average model quality does not improve significantly. However, by ranking models with a model quality assessment program such as ProQ, the average quality is improved by approximately 5% in the CASP7 test set.
Affiliation(s)
- Per Larsson
- Center for Biomembrane Research, Department of Biochemistry and Biophysics, Stockholm University, SE-106 91 Stockholm, Sweden
|
29
|
Han R, Leo-Macias A, Zerbino D, Bastolla U, Contreras-Moreira B, Ortiz AR. An efficient conformational sampling method for homology modeling. Proteins 2008; 71:175-88. [PMID: 17985353 DOI: 10.1002/prot.21672] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Indexed: 11/07/2022]
Abstract
The structural refinement of protein models is a challenging problem in protein structure prediction (Moult et al., Proteins 2003;53(Suppl 6):334-339). Most attempts to refine comparative models lead to degradation rather than improvement in model quality, so most current comparative modeling procedures omit the refinement step. However, it has been shown that even in the absence of alignment errors and using optimal templates, methods based on a single template have intrinsic limitations, and that refinement is needed to improve model accuracy. It is thought that failure of current methods originates on one hand from the inaccuracy of the effective free energy functions adopted, which do not represent properly the energetic balance in the native state, and on the other hand from the difficulty of sampling the high dimensional and rugged free energy landscape of protein folding, in the search for the global minimum. Here, we address this second issue. We define the evolutionary and vibrational harmonics subspace (EVA), a reduced sampling subspace that consists of a combination of evolutionarily favored directions, defined by the principal components of the structural variation within a homologous family, plus topologically favored directions, derived from the low frequency normal modes of the vibrational dynamics, up to 50 dimensions. This subspace is accurate enough so that the cores of most proteins can be represented within 1 Å accuracy, and reduced enough so that Replica Exchange Monte Carlo (Hukushima and Nemoto, J Phys Soc Jpn 1996;65:1604-1608; Hukushima et al., Int J Mod Phys C: Phys Comput 1996;7:337-344; Mitsutake et al., J Chem Phys 2003;118:6664-6675; Mitsutake et al., J Chem Phys 2003;118:6676-6688) (REMC) can be applied. REMC is one of the best sampling methods currently available, but its applicability is restricted to spaces of small dimensionality.
We show that the combination of the EVA subspace and REMC can essentially solve the optimization problem for backbone atoms in the reduced sampling subspace, even for rather rugged free energy landscapes. Applications and limitations of this methodology are finally discussed.
Affiliation(s)
- Rongsheng Han
- Bioinformatics Unit, Centro de Biología Molecular "Severo Ochoa" (CSIC-UAM), Universidad Autónoma de Madrid, Cantoblanco, Madrid, Spain
|
30
|
Ishitani R, Terada T, Shimizu K. Refinement of comparative models of protein structure by using multicanonical molecular dynamics simulations. Mol Simul 2008. [DOI: 10.1080/08927020801930539] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 10/22/2022]
|
31
|
Zhou H, Pandit SB, Lee SY, Borreguero J, Chen H, Wroblewska L, Skolnick J. Analysis of TASSER-based CASP7 protein structure prediction results. Proteins 2008; 69 Suppl 8:90-7. [PMID: 17705276 DOI: 10.1002/prot.21649] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.6] [Indexed: 11/06/2022]
Abstract
An improved TASSER (Threading/ASSEmbly/Refinement) methodology is applied to predict the tertiary structure for all CASP7 targets. TASSER employs template identification by threading, followed by tertiary structure assembly by rearranging continuous template fragments, where conformational space is searched via Parallel Hyperbolic Monte Carlo sampling with an optimized force-field that includes knowledge-based statistical potentials and restraints derived from threading templates. The final models are selected by clustering structures from the low temperature replicas. Improvements in TASSER over CASP6 involve use of better templates from 3D-jury applied to three threading programs, PROSPECTOR_3, SP(3), and SPARKS, and a fragment comparison method for better model ranking. For targets with no reliable templates, a variant of TASSER (chunk-TASSER) is also applied with potentials and restraints extracted from ab initio folded supersecondary chunks of the target to build full-length models. For all 124 CASP targets/domains, the average root-mean-square-deviation (RMSD) from native and alignment coverage of the best initial threading models from 3D-jury are 6.2 Å and 93%, respectively. Following TASSER reassembly, the average RMSD of the best model in the template aligned region decreases to 4.9 Å and the average TM-score increases from 0.617 for the template to 0.678 for the best full-length model. Based on target difficulty, the average TM-scores of the final model to native are 0.904, 0.671, and 0.307 for high-accuracy template-based modeling, template-based modeling, and free modeling targets/domains, respectively. For the more difficult targets, TASSER with modest human intervention performed better in comparison to its server counterpart, MetaTASSER, which used a limited time simulation.
Affiliation(s)
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, Georgia 30318, USA
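The abstract above reports results as TM-scores. For readers unfamiliar with the measure, the sketch below evaluates the standard TM-score expression, with length-dependent scale d0 = 1.24 (L − 15)^(1/3) − 1.8 Å, from per-residue distances of aligned pairs. It is a simplified illustration and not the TASSER evaluation code: the names `tm_d0` and `tm_score` are assumptions, the distances are taken from one fixed superposition rather than the optimal one, and targets shorter than 19 residues are rejected because d0 would not be positive.

```python
import numpy as np

def tm_d0(target_length):
    """Length-dependent distance scale d0 (angstroms) used by TM-score."""
    if target_length < 19:
        # Simplification for this sketch: avoid non-positive d0 on short chains.
        raise ValueError("this sketch assumes a target of at least 19 residues")
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8

def tm_score(distances, target_length):
    """TM-score for one fixed superposition.

    distances: C-alpha distances (angstroms) of the aligned residue pairs only;
    unaligned target residues contribute zero through the 1/L normalisation.
    """
    d = np.asarray(distances, dtype=float)
    d0 = tm_d0(target_length)
    return float(np.sum(1.0 / (1.0 + (d / d0) ** 2)) / target_length)

# Toy cases for a 100-residue target (assumed values).
perfect = tm_score(np.zeros(100), 100)              # every residue superposed exactly
half = tm_score(np.full(100, tm_d0(100)), 100)      # every residue exactly d0 away
partial = tm_score(np.zeros(60), 100)               # only 60 residues aligned
```

Normalising by the target length rather than the aligned length is what lets the score penalise incomplete coverage, so a template and a full-length model can be compared on the same scale.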
|
32
|
Abstract
Normal mode analysis and essential dynamics analysis are powerful methods used for the analysis of collective motions in biomolecules. Their application has led to an appreciation of the importance of protein dynamics in function and the relationship between structure and dynamical behavior. In this chapter, the methods and their implementation are introduced and recent developments such as elastic networks and advanced sampling techniques are described.
|
33
|
Rueda M, Chacón P, Orozco M. Thorough validation of protein normal mode analysis: a comparative study with essential dynamics. Structure 2007; 15:565-75. [PMID: 17502102 DOI: 10.1016/j.str.2007.03.013] [Citation(s) in RCA: 133] [Impact Index Per Article: 7.8] [Received: 01/29/2007] [Revised: 03/27/2007] [Accepted: 03/29/2007] [Indexed: 11/19/2022]
Abstract
The deformation patterns of a large set of representative proteins determined by essential dynamics extracted from atomistic simulations and coarse-grained normal mode analysis are compared. Our analysis shows that the deformational space obtained with both approaches is quite similar when taking into account a representative number of modes. The results provide not only a comprehensive validation of the use of a low-frequency modal spectrum to describe protein flexibility, but also a complete picture of normal mode limitations.
Affiliation(s)
- Manuel Rueda
- Molecular Modeling and Bioinformatics Unit, Institut de Recerca Biomèdica, Parc Cientific de Barcelona, 08028 Barcelona, Spain
|
34
|
Fu X, Apgar JR, Keating AE. Modeling backbone flexibility to achieve sequence diversity: the design of novel alpha-helical ligands for Bcl-xL. J Mol Biol 2007; 371:1099-117. [PMID: 17597151 PMCID: PMC1994813 DOI: 10.1016/j.jmb.2007.04.069] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.9] [Received: 01/05/2007] [Revised: 04/26/2007] [Accepted: 04/27/2007] [Indexed: 11/27/2022]
Abstract
Computational protein design can be used to select sequences that are compatible with a fixed-backbone template. This strategy has been used in numerous instances to engineer novel proteins. However, the fixed-backbone assumption severely restricts the sequence space that is accessible via design. For challenging problems, such as the design of functional proteins, this may not be acceptable. Here, we present a method for introducing backbone flexibility into protein design calculations and apply it to the design of diverse helical BH3 ligands that bind to the anti-apoptotic protein Bcl-xL, a member of the Bcl-2 protein family. We demonstrate how normal mode analysis can be used to sample different BH3 backbones, and show that this leads to a larger and more diverse set of low-energy solutions than can be achieved using a native high-resolution Bcl-xL complex crystal structure as a template. We tested several of the designed solutions experimentally and found that this approach worked well when normal mode calculations were used to deform a native BH3 helix structure, but less well when they were used to deform an idealized helix. A subsequent round of design and testing identified a likely source of the problem as inadequate sampling of the helix pitch. In all, we tested 17 designed BH3 peptide sequences, including several point mutants. Of these, eight bound well to Bcl-xL and four others showed weak but detectable binding. The successful designs showed a diversity of sequences that would have been difficult or impossible to achieve using only a fixed backbone. Thus, introducing backbone flexibility via normal mode analysis effectively broadened the set of sequences identified by computational design, and provided insight into positions important for binding Bcl-xL.
Affiliation(s)
- Xiaoran Fu
- MIT Department of Biology, 77 Massachusetts Ave, Cambridge, MA 02139, USA
|
35
|
Song G, Jernigan RL. vGNM: a better model for understanding the dynamics of proteins in crystals. J Mol Biol 2007; 369:880-93. [PMID: 17451743 PMCID: PMC1993920 DOI: 10.1016/j.jmb.2007.03.059] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.3] [Received: 12/21/2006] [Revised: 03/20/2007] [Accepted: 03/22/2007] [Indexed: 10/23/2022]
Abstract
The dynamics of proteins are important for understanding their functions. In recent years, the simple coarse-grained Gaussian Network Model (GNM) has been fairly successful in interpreting crystallographic B-factors. However, the model clearly ignores the contribution of the rigid body motions and the effect of crystal packing. The model cannot explain the fact that the same protein may have significantly different B-factors under different crystal packing conditions. In this work, we propose a new GNM, called vGNM, which takes into account both the contribution of the rigid body motions and the effect of crystal packing, by allowing the amplitude of the internal modes to be variables. It hypothesizes that the effect of crystal packing should cause some modes to be amplified and others to become less important. In doing so, vGNM is able to resolve the apparent discrepancy in experimental B-factors among structures of the same protein but with different crystal packing conditions, which GNM cannot explain. With a small number of parameters, vGNM is able to reproduce experimental B-factors for a large set of proteins with significantly better correlations (having a mean value of 0.81 as compared to 0.59 by GNM). The results of applying vGNM also show that the rigid body motions account for nearly 60% of the total fluctuations, in good agreement with previous findings.
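As a rough illustration of the baseline GNM calculation that this abstract compares against (not vGNM itself), the following Python sketch builds a Kirchhoff contact matrix from Cα coordinates and reads relative fluctuations off the diagonal of its pseudo-inverse. The function name `gnm_fluctuations`, the 7.0 Å cutoff, the toy coordinates, and the use of NumPy are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def gnm_fluctuations(coords, cutoff=7.0):
    """Relative per-residue fluctuations from a plain Gaussian Network Model.

    coords: (N, 3) array of C-alpha positions in angstroms.
    cutoff: contact distance in angstroms (assumed value, not from the source).
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise distances between all residues.
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Residues are in contact if within the cutoff (self-contacts excluded).
    contact = (dist <= cutoff) & ~np.eye(len(coords), dtype=bool)
    # Kirchhoff matrix: -1 for each contact, contact count on the diagonal.
    kirchhoff = -contact.astype(float)
    np.fill_diagonal(kirchhoff, contact.sum(axis=1))
    # The matrix is singular (one zero mode), so use the pseudo-inverse.
    inverse = np.linalg.pinv(kirchhoff)
    # The diagonal is proportional to the mean-square fluctuations.
    return np.diag(inverse).copy()

# Toy example: five residues on a straight line, 3.8 angstroms apart.
chain = [[3.8 * i, 0.0, 0.0] for i in range(5)]
fluct = gnm_fluctuations(chain)
```

The returned values are only defined up to a scale factor; a comparison with experimental B-factors would typically use a correlation coefficient, for example via `np.corrcoef`.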
Affiliation(s)
- Guang Song
- Program of Bioinformatics and Computational Biology, Iowa State University, Ames, IA 50011, USA.
|
36
|
|
37
|
Velazquez-Muriel JA, Carazo JMA. Flexible fitting in 3D-EM with incomplete data on superfamily variability. J Struct Biol 2006; 158:165-81. [PMID: 17257856 DOI: 10.1016/j.jsb.2006.10.014] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Received: 05/17/2006] [Revised: 09/20/2006] [Accepted: 10/13/2006] [Indexed: 11/26/2022]
Abstract
We present a substantial improvement of S-flexfit, our recently proposed method for flexible fitting in three-dimensional electron microscopy (3D-EM) at a resolution range of 8-12 Å, together with a comparison of the method's capabilities with Normal Mode Analysis (NMA), application examples and a user's guide. S-flexfit uses the evolutionary information contained in protein domain databases like CATH, by means of the structural alignment of the elements of a protein superfamily. The added development is based on a recent extension of the Singular Value Decomposition (SVD) algorithm specifically designed for situations with missing data: Incremental Singular Value Decomposition (ISVD). ISVD can manage gaps and allows more amino acids to be considered in the structural alignment of a superfamily, extending the range of application and producing better models for the fitting step of our methodology. Our previous SVD-based flexible fitting approach can only take into account positions with no gaps in the alignment, which is appropriate when the superfamily members are relatively similar and there are few gaps. However, with new data coming from structural proteomics work, the latter situation is becoming less likely, making ISVD the technique of choice for further work. We present the results of using ISVD in the process of flexible fitting with both simulated and experimental 3D-EM maps (GroEL and Poliovirus 135S cell entry intermediate).
Affiliation(s)
- Javier A Velazquez-Muriel
- Biocomputing Unit, National Center for Biotechnology, Campus Universidad Autónoma de Madrid, 28049 Madrid, Spain
|
38
|
Velazquez-Muriel JA, Valle M, Santamaría-Pang A, Kakadiaris IA, Carazo JM. Flexible Fitting in 3D-EM Guided by the Structural Variability of Protein Superfamilies. Structure 2006; 14:1115-26. [PMID: 16843893 DOI: 10.1016/j.str.2006.05.013] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Received: 01/23/2006] [Revised: 04/27/2006] [Accepted: 05/01/2006] [Indexed: 10/24/2022]
Abstract
A method for flexible fitting of molecular models into three-dimensional electron microscopy (3D-EM) reconstructions at a resolution range of 8-12 Å is proposed. The approach uses the evolutionarily related structural variability existing among the protein domains of a given superfamily, according to structural databases such as CATH. A structural alignment of domains belonging to the superfamily, followed by a principal components analysis, is performed, and the first three principal components of the decomposition are explored. Using rigid body transformations for the secondary structure elements (SSEs) plus the cyclic coordinate descent algorithm to close the loops, stereochemically correct models are built for the structure to fit. All of the models are fitted into the 3D-EM map, and the best one is selected based on cross-correlation measures. This work applies the method to both simulated and experimental data and shows that the flexible fitting was able to produce better results than rigid body fitting.
Affiliation(s)
- Javier-Angel Velazquez-Muriel
- Biocomputing Unit, National Center for Biotechnology-CSIC, Campus Universidad Autónoma de Madrid, 28049 Madrid, Spain
|
39
|
Abstract
Homology modeling plays a central role in determining protein structure in the structural genomics project. The importance of homology modeling has been steadily increasing because of the large gap that exists between the overwhelming number of available protein sequences and experimentally solved protein structures, and also, more importantly, because of the increasing reliability and accuracy of the method. In fact, a protein sequence with over 30% identity to a known structure can often be predicted with an accuracy equivalent to a low-resolution X-ray structure. The recent advances in homology modeling, especially in detecting distant homologues, aligning sequences with template structures, modeling of loops and side chains, as well as detecting errors in a model, have contributed to reliable prediction of protein structure, which was not possible even several years ago. The ongoing efforts in solving protein structures, which can be time-consuming and often difficult, will continue to spur the development of a host of new computational methods that can fill in the gap and further contribute to understanding the relationship between protein structure and function.
Affiliation(s)
- Zhexin Xiang
- Center for Molecular Modeling, Center for Information Technology, National Institutes of Health, Building 12A Room 2051, 12 South Drive, Bethesda, Maryland 20892-5624, USA.
|
40
|
Abstract
MOTIVATION: A wide variety of methods for the construction of an atomic model for a given amino acid sequence are known, the more accurate being those that use experimentally determined structures as templates. However, far fewer methods are aimed at refining these models. The approach presented here carefully blends models created by several different means, in an attempt to combine the good quality regions from each into a final, more refined, model. RESULTS: We describe here a number of refinement operators (collectively, 'move-set') that enable a relatively large region of conformational space to be searched. This is used within a genetic algorithm that reshuffles and repacks structural components. The utility of the move-set is demonstrated by introducing a cost function, containing both physical and other components guiding the input structures towards the target structure. We show that our move-set has the potential to improve the conformation of models and that this improvement can be beyond even the best template for some comparative modelling targets. AVAILABILITY: The populus software package and the source code are available at http://bmm.cancerresearchuk.org/~offman01/populus.html.
Affiliation(s)
- Marc N Offman
- Biomolecular Modelling Laboratory, Cancer Research UK London Research Institute, Lincoln's Inn Fields Laboratories London, WC2A 3PX, UK
|
41
|
Baker D. Prediction and design of macromolecular structures and interactions. Philos Trans R Soc Lond B Biol Sci 2006; 361:459-63. [PMID: 16524834 PMCID: PMC1609347 DOI: 10.1098/rstb.2005.1803] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.1] [Indexed: 11/12/2022] Open
Abstract
In this article, I summarize recent work from my group directed towards developing an improved model of intra- and intermolecular interactions and applying this improved model to the prediction and design of macromolecular structures and interactions. Prediction and design applications can be of great biological interest in their own right, and also provide very stringent and objective tests which drive the improvement of the model and increases in fundamental understanding. I emphasize the results from the prediction and design tests that suggest progress is being made in high-resolution modelling, and that there is hope for reliably and accurately computing structural biology.
Affiliation(s)
- David Baker
- University of Washington, Seattle, WA 98112, USA.
|
42
|
Davis IW, Arendall WB, Richardson DC, Richardson JS. The backrub motion: how protein backbone shrugs when a sidechain dances. Structure 2006; 14:265-74. [PMID: 16472746 DOI: 10.1016/j.str.2005.10.007] [Citation(s) in RCA: 200] [Impact Index Per Article: 11.1] [Received: 08/06/2005] [Revised: 10/10/2005] [Accepted: 10/12/2005] [Indexed: 11/16/2022]
Abstract
Surprisingly, the frozen structures from ultra-high-resolution protein crystallography reveal a prevalent, but subtle, mode of local backbone motion coupled to much larger, two-state changes of sidechain conformation. This "backrub" motion provides an influential and common type of local plasticity in protein backbone. Concerted reorientation of two adjacent peptides swings the central sidechain perpendicular to the chain direction, changing accessible sidechain conformations while leaving flanking structure undisturbed. Alternate conformations in sub-1 Å crystal structures show backrub motions for two-thirds of the significant Cβ shifts and 3% of the total residues in these proteins (126/3882), accompanied by two-state changes in sidechain rotamer. The Backrub modeling tool is effective in crystallographic rebuilding. For homology modeling or protein redesign, backrubs can provide realistic, small perturbations to rigid backbones. For large sidechain changes in protein dynamics or for single mutations, backrubs allow backbone accommodation while maintaining H bonds and ideal geometry.
Affiliation(s)
- Ian W Davis
- Department of Biochemistry, Duke University, Durham, North Carolina 27710, USA
|
43
|
Gray JJ. High-resolution protein-protein docking. Curr Opin Struct Biol 2006; 16:183-93. [PMID: 16546374 DOI: 10.1016/j.sbi.2006.03.003] [Citation(s) in RCA: 149] [Impact Index Per Article: 8.3] [Received: 01/06/2006] [Revised: 01/24/2006] [Accepted: 03/07/2006] [Indexed: 11/20/2022]
Abstract
The high-resolution prediction of protein-protein docking can now create structures with atomic-level accuracy. This progress arises from both improvements in the rapid sampling of conformations and increased accuracy of binding free energy calculations. Consequently, the quality of models submitted to the blind prediction challenge CAPRI (Critical Assessment of PRedicted Interactions) has steadily increased, including complexes predicted from homology structures of one binding partner and complexes with atomic accuracy at the interface. By exploiting experimental information, docking has created model structures for real applications, even when confronted with challenges such as moving backbones and uncertain monomer structures. Work remains to be done in docking large or flexible proteins, ranking models consistently, and producing models accurate enough to allow computational design of higher affinities or specificities.
Affiliation(s)
- Jeffrey J Gray
- Department of Chemical & Biomolecular Engineering and Program in Molecular & Computational Biophysics, Johns Hopkins University, 3400 North Charles Street, Baltimore, MD 21218, USA.
|
44
|
Ginalski K. Comparative modeling for protein structure prediction. Curr Opin Struct Biol 2006; 16:172-7. [PMID: 16510277 DOI: 10.1016/j.sbi.2006.02.003] [Citation(s) in RCA: 167] [Impact Index Per Article: 9.3] [Received: 11/28/2005] [Revised: 01/17/2006] [Accepted: 02/14/2006] [Indexed: 10/25/2022]
Abstract
With the progression of structural genomics projects, comparative modeling remains an increasingly important method of choice. It helps to bridge the gap between the available sequence and structure information by providing reliable and accurate protein models. Comparative modeling based on more than 30% sequence identity is now approaching its natural template-based limits and further improvements require the development of effective refinement techniques capable of driving models toward native structure. For difficult targets, for which the most significant progress in recent years has been observed, optimal template selection and alignment accuracy are still the major problems.
Affiliation(s)
- Krzysztof Ginalski
- Centre for Mathematical and Computational Modelling, Warsaw University, Pawińskiego 5a, 02-106 Warsaw, Poland.
|
45
|
Lindahl E, Delarue M. Refinement of docked protein-ligand and protein-DNA structures using low frequency normal mode amplitude optimization. Nucleic Acids Res 2005; 33:4496-506. [PMID: 16087736 PMCID: PMC1183489 DOI: 10.1093/nar/gki730] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.8] [Indexed: 11/20/2022] Open
Abstract
Prediction of structural changes resulting from complex formation, both in ligands and receptors, is an important and unsolved problem in structural biology. In this work, we use all-atom normal modes calculated with the Elastic Network Model as a basis set to model structural flexibility during formation of macromolecular complexes and refine the non-bonded intermolecular energy between the two partners (protein–ligand or protein–DNA) along 5–10 of the lowest frequency normal mode directions. The method handles motions unrelated to the docking transparently by first applying the modes that improve non-bonded energy most and optionally restraining amplitudes; in addition, the method can correct small errors in the ligand position when the first six rigid-body modes are switched on. For a test set of six protein receptors that show an open-to-close transition when binding small ligands, our refinement scheme reduces the protein coordinate cRMS by 0.3–3.2 Å. For two test cases of DNA structures interacting with proteins, the program correctly refines the docked B-DNA starting form into the expected bent DNA, reducing the DNA cRMS from 8.4 to 4.8 Å and from 8.7 to 5.4 Å, respectively. A public web server implementation of the refinement method is available at .
Affiliation(s)
- Erik Lindahl
- Unité de Biochimie Structurale, URA 2185 du CNRS, Institut Pasteur, 25 Rue du Dr Roux, F-75015 Paris, France
- Stockholm Bioinformatics Center, Stockholm University, SE-17156 Stockholm, Sweden
- Marc Delarue
- Unité de Biochimie Structurale, URA 2185 du CNRS, Institut Pasteur, 25 Rue du Dr Roux, F-75015 Paris, France
- To whom correspondence should be addressed. Tel: +33 1 45 68 8605; Fax: +33 1 45 68 8604;
|
46
|
Centeno NB, Planas-Iglesias J, Oliva B. Comparative modelling of protein structure and its impact on microbial cell factories. Microb Cell Fact 2005; 4:20. [PMID: 15989691 PMCID: PMC1183243 DOI: 10.1186/1475-2859-4-20] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 05/03/2005] [Accepted: 06/30/2005] [Indexed: 11/22/2022] Open
Abstract
Comparative modeling is becoming an increasingly helpful technique in microbial cell factories as the knowledge of the three-dimensional structure of a protein would be an invaluable aid to solve problems on protein production. For this reason, an introduction to comparative modeling is presented, with special emphasis on the basic concepts, opportunities and challenges of protein structure prediction. This review is intended to serve as a guide for the biologist who has no special expertise and who is not involved in the determination of protein structure. Selected applications of comparative modeling in microbial cell factories are outlined, and the role of microbial cell factories in the structural genomics initiative is discussed.
Affiliation(s)
- Nuria B Centeno
- Structural Bioinformatics Laboratory, Research Group on Biomedical Informatics (GRIB), IMIM/UPF. c/ Dr. Aiguader 80. 08003 Barcelona, Spain
- Joan Planas-Iglesias
- Structural Bioinformatics Laboratory, Research Group on Biomedical Informatics (GRIB), IMIM/UPF. c/ Dr. Aiguader 80. 08003 Barcelona, Spain
- Baldomero Oliva
- Structural Bioinformatics Laboratory, Research Group on Biomedical Informatics (GRIB), IMIM/UPF. c/ Dr. Aiguader 80. 08003 Barcelona, Spain
|
47
|
Leo-Macias A, Lopez-Romero P, Lupyan D, Zerbino D, Ortiz AR. Core deformations in protein families: a physical perspective. Biophys Chem 2005; 115:125-8. [PMID: 15752593 DOI: 10.1016/j.bpc.2004.12.016] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 06/30/2004] [Revised: 11/10/2004] [Accepted: 12/10/2004] [Indexed: 11/18/2022]
Abstract
An analysis is presented on how structural cores change shape within protein families, and whether or not there is a relationship between these structural changes and the vibrational modes that proteins experience due to topological constraints. A set of 13 representative and well-populated protein families is studied. The evolutionary directions of deformation are obtained by applying a new multiple structural alignment technique to superimpose the structures and extract a conserved core, together with Principal Components Analysis (PCA) to extract the main deformation modes. A low-resolution Normal Mode Analysis (NMA) technique is used in parallel to study the properties of the mechanical core plasticity of the same proteins. We find that the evolutionary deformations span a low dimensional space. A statistically significant correspondence exists between these principal deformations and the vibrational modes accessible to a particular topology. We conclude that, to a significant extent, the structures of evolving proteins seem to respond to sequence changes by collective deformations along combinations of low-frequency modes. The findings have implications in structure prediction by homology modeling.
Affiliation(s)
- Alejandra Leo-Macias
- Bioinformatics Unit, Centro de Biologia Molecular Severo Ochoa, CSIC-UAM, Universidad Autonoma de Madrid, Cantoblanco 28049, Madrid, Spain
|
48
|
Leo-Macias A, Lopez-Romero P, Lupyan D, Zerbino D, Ortiz AR. An analysis of core deformations in protein superfamilies. Biophys J 2004; 88:1291-9. [PMID: 15542556 PMCID: PMC1305131 DOI: 10.1529/biophysj.104.052449] [Citation(s) in RCA: 96] [Impact Index Per Article: 4.8] [Indexed: 11/18/2022] Open
Abstract
An analysis is presented on how structural cores modify their shape across homologous proteins, and whether or not a relationship exists between these structural changes and the vibrational normal modes that proteins experience as a result of the topological constraints imposed by the fold. A set of 35 representative, well-populated protein families is studied. The evolutionary directions of deformation are obtained by using multiple structural alignments to superimpose the structures and extract a conserved core, together with principal components analysis to extract the main deformation modes from the three-dimensional superimposition. In parallel, a low-resolution normal mode analysis technique is employed to study the properties of the mechanical core plasticity of these same families. We show that the evolutionary deformations span a low dimensional space of 4-5 dimensions on average. A statistically significant correspondence exists between these principal deformations and the approximately 20 slowest vibrational modes accessible to a particular topology. We conclude that, to a significant extent, the structural response of a protein topology to sequence changes takes place by means of collective deformations along combinations of a small number of low-frequency modes. The findings have implications in structure prediction by homology modeling.
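The PCA step described in this abstract can be sketched in Python as follows, assuming the structures have already been superimposed and trimmed to a conserved core of equal length. The function name `deformation_modes`, the synthetic ensemble, and the use of NumPy are hypothetical choices for illustration; the paper's own alignment and core-extraction procedure is not reproduced here.

```python
import numpy as np

def deformation_modes(aligned_coords):
    """PCA of a superimposed structural ensemble.

    aligned_coords: (M, N, 3) array holding M pre-aligned structures
    with N core atoms each (alignment itself is assumed done elsewhere).
    Returns (modes, ratio): modes has one principal direction per row,
    ratio is the fraction of variance explained by each mode.
    """
    coords = np.asarray(aligned_coords, dtype=float)
    n_structures = coords.shape[0]
    # Flatten each structure into a single 3N-dimensional vector.
    flat = coords.reshape(n_structures, -1)
    # Remove the mean structure so only deformations remain.
    centered = flat - flat.mean(axis=0)
    # SVD of the centered data gives the principal directions directly.
    _, singular_values, modes = np.linalg.svd(centered, full_matrices=False)
    variance = singular_values ** 2 / (n_structures - 1)
    ratio = variance / variance.sum()
    return modes, ratio

# Synthetic ensemble: one base structure deformed along a single direction.
rng = np.random.default_rng(0)
base = rng.normal(size=(10, 3))
direction = rng.normal(size=(10, 3))
ensemble = np.array([base + t * direction for t in np.linspace(-1.0, 1.0, 6)])
modes, ratio = deformation_modes(ensemble)
```

In this toy ensemble nearly all variance falls on the first mode by construction; for a real family, the number of modes needed to reach a chosen variance threshold gives a rough estimate of the dimensionality of the deformation space.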
Affiliation(s)
- Alejandra Leo-Macias
- Bioinformatics Unit, Centro de Biología Molecular Severo Ochoa, CSIC-UAM, Cantoblanco, Madrid, Spain
|