1
Chong SH, Ham S. Evolutionary conservation of amino acids contributing to the protein folding transition state. J Comput Chem 2023; 44:1002-1009. [PMID: 36571461 DOI: 10.1002/jcc.27060] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2022] [Revised: 11/22/2022] [Accepted: 12/06/2022] [Indexed: 12/27/2022]
Abstract
The question of whether amino acids critical to protein folding kinetics are evolutionarily conserved has been investigated intensively in the past, but no consensus has yet been reached. Recently, we have demonstrated that the transition state, dictating folding kinetics, is characterized as the state of maximum dynamic cooperativity, i.e., the state of maximum correlations between amino acid contact formations. Here, we investigate the evolutionary conservation of those amino acids contributing significantly to the dynamic cooperativity. We find a strong indication of a new kind of relationship (necessary but not sufficient causality) between the evolutionary conservation and the dynamic cooperativity: larger contributions to the dynamic cooperativity arise from more conserved residues, but not vice versa. This holds for all the protein systems for which long folding simulation trajectories are available. To our knowledge, this is the first systematic demonstration of any kind of evolutionary conservation of amino acids relevant to folding kinetics.
Affiliation(s)
- Song-Ho Chong
  - Global Center for Natural Resources Sciences, Faculty of Life Sciences, Kumamoto University, Kumamoto, Japan
- Sihyun Ham
  - Department of Chemistry, Sookmyung Women's University, Seoul, South Korea

2
León-González JA, Flatet P, Juárez-Ramírez MS, Farías-Rico JA. Folding and Evolution of a Repeat Protein on the Ribosome. Front Mol Biosci 2022; 9:851038. [PMID: 35707224 PMCID: PMC9189291 DOI: 10.3389/fmolb.2022.851038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2022] [Accepted: 04/27/2022] [Indexed: 12/04/2022]
Abstract
Life on earth is the result of the work of proteins, the cellular nanomachines that fold into elaborate 3D structures to perform their functions. The ribosome synthesizes all the proteins of the biosphere, and many of them begin to fold during translation in a process known as cotranslational folding. In this work we discuss current advances in this field and provide computational and experimental data that highlight the role of the ribosome in the evolution of protein structures. First, we used the sequence of the Ankyrin domain from the Drosophila Notch receptor to launch a deep sequence-based search. With this strategy, we found a conserved 33-residue motif shared by different protein folds. Then, to see how the vectorial addition of the motif would generate a full structure, we measured the folding of the Ankyrin repeat protein on the ribosome. Not only are the on-ribosome folding data in full agreement with classical in vitro biophysical measurements, but they also provide experimental evidence on how folded proteins could have evolved by duplication and fusion of smaller fragments in the RNA world. Overall, we discuss how the ribosomal exit tunnel could be conceptualized as an active site that is under evolutionary pressure to influence protein folding.
Affiliation(s)
- José Alberto León-González
  - Synthetic Biology Program, Center for Genome Sciences, National Autonomous University of Mexico, Cuernavaca, Mexico
- Perline Flatet
  - Department of Biochemistry and Biophysics, Stockholm University, Stockholm, Sweden
- María Soledad Juárez-Ramírez
  - Synthetic Biology Program, Center for Genome Sciences, National Autonomous University of Mexico, Cuernavaca, Mexico
- José Arcadio Farías-Rico
  - Synthetic Biology Program, Center for Genome Sciences, National Autonomous University of Mexico, Cuernavaca, Mexico
  - *Correspondence: José Arcadio Farías-Rico

3
Crippa M, Andreghetti D, Capelli R, Tiana G. Evolution of frustrated and stabilising contacts in reconstructed ancient proteins. Eur Biophys J 2021; 50:699-712. [PMID: 33569610 PMCID: PMC8260555 DOI: 10.1007/s00249-021-01500-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2020] [Revised: 12/14/2020] [Accepted: 01/13/2021] [Indexed: 11/30/2022]
Abstract
Energetic properties of a protein are a major determinant of its evolutionary fitness. Using a reconstruction algorithm, dating the reconstructed proteins and calculating the interaction network between their amino acids through a coevolutionary approach, we studied how the interactions that stabilise 890 proteins, belonging to five families, evolved for billions of years. In particular, we focused our attention on the network of most strongly attractive contacts and on that of poorly optimised, frustrated contacts. Our results support the idea that the cluster of most attractive interactions extends its size along evolutionary time, but from the data, we cannot conclude that protein stability or that the degree of frustration tends always to decrease.
Affiliation(s)
- Martina Crippa
  - Department of Physics and Center for Complexity and Biosystems, Università degli Studi di Milano and INFN, via Celoria 16, 20133, Milan, Italy
  - Department of Applied Science and Technology, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129, Turin, Italy
- Damiano Andreghetti
  - Department of Physics and Center for Complexity and Biosystems, Università degli Studi di Milano and INFN, via Celoria 16, 20133, Milan, Italy
- Riccardo Capelli
  - Department of Applied Science and Technology, Politecnico di Torino, Corso Duca degli Abruzzi 24, 10129, Turin, Italy
- Guido Tiana
  - Department of Physics and Center for Complexity and Biosystems, Università degli Studi di Milano and INFN, via Celoria 16, 20133, Milan, Italy

4
de Oliveira VM, Caetano DLZ, da Silva FB, Mouro PR, de Oliveira AB, de Carvalho SJ, Leite VBP. pH and Charged Mutations Modulate Cold Shock Protein Folding and Stability: A Constant pH Monte Carlo Study. J Chem Theory Comput 2020; 16:765-772. [PMID: 31756296 DOI: 10.1021/acs.jctc.9b00894] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Abstract
The folding and stability of proteins is a fundamental problem in several research fields. In the present paper, we have used different computational approaches to study the effects of changes in pH and of charged mutations in cold shock proteins from Bacillus subtilis (Bs-CspB). First, we have investigated the contribution of each ionizable residue of these proteins to their thermal stability using TKSA-MC, a Web server for rational mutation via optimizing the protein charge interactions. Based on these results, we have proposed a new mutation in an already optimized Bs-CspB variant. We have evaluated the effects of this new mutation on the folding energy landscape using structure-based models in Monte Carlo simulation at constant pH, SBM-CpHMC. Our results using this approach have indicated that the charge rearrangements already in the unfolded state are critical to the thermal stability of Bs-CspB. Furthermore, the conjunction of these simplified methods was able not only to predict stabilizing mutations at different pH values but also to provide essential information about their effects in each stage of protein folding.
Affiliation(s)
- Vinícius M de Oliveira
  - Brazilian Biosciences National Laboratory, National Center for Research in Energy and Materials, LNBio/CNPEM, Campinas, São Paulo, 13083-970, Brazil
- Daniel L Z Caetano
  - Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities and Exact Sciences, São José do Rio Preto, São Paulo, 15054-000, Brazil
- Fernando B da Silva
  - Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities and Exact Sciences, São José do Rio Preto, São Paulo, 15054-000, Brazil
- Paulo R Mouro
  - Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities and Exact Sciences, São José do Rio Preto, São Paulo, 15054-000, Brazil
- Antonio B de Oliveira
  - Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities and Exact Sciences, São José do Rio Preto, São Paulo, 15054-000, Brazil
- Sidney J de Carvalho
  - Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities and Exact Sciences, São José do Rio Preto, São Paulo, 15054-000, Brazil
- Vitor B P Leite
  - Department of Physics, São Paulo State University (UNESP), Institute of Biosciences, Humanities and Exact Sciences, São José do Rio Preto, São Paulo, 15054-000, Brazil
  - Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005, United States

5
Guin D, Gruebele M. Weak Chemical Interactions That Drive Protein Evolution: Crowding, Sticking, and Quinary Structure in Folding and Function. Chem Rev 2019; 119:10691-10717. [PMID: 31356058 DOI: 10.1021/acs.chemrev.8b00753] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Indexed: 01/22/2023]
Abstract
In recent years, better instrumentation and greater computing power have enabled the imaging of elusive biomolecule dynamics in cells, driving many advances in understanding the chemical organization of biological systems. The focus of this Review is on interactions in the cell that affect both biomolecular stability and function and modulate them. The same protein or nucleic acid can behave differently depending on the time in the cell cycle, the location in a specific compartment, or the stresses acting on the cell. We describe in detail the crowding, sticking, and quinary structure in the cell and the current methods to quantify them both in vitro and in vivo. Finally, we discuss protein evolution in the cell in light of current biophysical evidence. We describe the factors that drive protein evolution and shape protein interaction networks. These interactions can significantly affect the free energy, ΔG, of marginally stable and low-population proteins and, due to epistasis, direct the evolutionary pathways in an organism. We finally conclude by providing an outlook on experiments to come and the possibility of collaborative evolutionary biology and biophysical efforts.
Affiliation(s)
- Drishti Guin
  - Department of Chemistry, University of Illinois, Urbana, Illinois 61801, United States
- Martin Gruebele
  - Department of Chemistry, University of Illinois, Urbana, Illinois 61801, United States
  - Department of Physics, University of Illinois, Urbana, Illinois 61801, United States
  - Center for Biophysics and Quantitative Biology, University of Illinois, Urbana, Illinois 61801, United States

6

7
Franklin MW, Nepomnyachyi S, Feehan R, Ben-Tal N, Kolodny R, Slusky JS. Evolutionary pathways of repeat protein topology in bacterial outer membrane proteins. eLife 2018; 7:40308. [PMID: 30489257 PMCID: PMC6340704 DOI: 10.7554/elife.40308] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 07/20/2018] [Accepted: 11/28/2018] [Indexed: 11/13/2022]
Abstract
Outer membrane proteins (OMPs) are the proteins in the surface of Gram-negative bacteria. These proteins have diverse functions but a single topology: the β-barrel. Sequence analysis has suggested that this common fold is a β-hairpin repeat protein, and that amplification of the β-hairpin has resulted in 8-26-stranded barrels. Using an integrated approach that combines sequence and structural analyses, we find events in which non-amplification diversification also increases barrel strand number. Our network-based analysis reveals strand-number-based evolutionary pathways, including one that progresses from a primordial 8-stranded barrel to 16-strands and further, to 18-strands. Among these pathways are mechanisms of strand number accretion without domain duplication, like a loop-to-hairpin transition. These mechanisms illustrate perpetuation of repeat protein topology without genetic duplication, likely induced by the hydrophobic membrane. Finally, we find that the evolutionary trace is particularly prominent in the C-terminal half of OMPs, implicating this region in the nucleation of OMP folding.
Affiliation(s)
- Sergey Nepomnyachyi
  - Department of Biochemistry and Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv, Israel
  - Department of Computer Science, University of Haifa, Haifa, Israel
- Ryan Feehan
  - Center for Computational Biology, University of Kansas, Kansas, United States
- Nir Ben-Tal
  - Department of Biochemistry and Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv, Israel
- Rachel Kolodny
  - Department of Computer Science, University of Haifa, Haifa, Israel
- Joanna Sg Slusky
  - Center for Computational Biology, University of Kansas, Kansas, United States
  - Department of Molecular Biosciences, University of Kansas, Kansas, United States

8
Pancsa R, Raimondi D, Cilia E, Vranken WF. Early Folding Events, Local Interactions, and Conservation of Protein Backbone Rigidity. Biophys J 2017; 110:572-583. [PMID: 26840723 DOI: 10.1016/j.bpj.2015.12.028] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 08/27/2015] [Revised: 12/21/2015] [Accepted: 12/29/2015] [Indexed: 01/20/2023]
Abstract
Protein folding is in its early stages largely determined by the protein sequence and complex local interactions between amino acids, resulting in lower energy conformations that provide the context for further folding into the native state. We compiled a comprehensive data set of early folding residues based on pulsed labeling hydrogen deuterium exchange experiments. These early folding residues have corresponding higher backbone rigidity as predicted by DynaMine from sequence, an effect also present when accounting for the secondary structures in the folded protein. We then show that the amino acids involved in early folding events are not more conserved than others, but rather, early folding fragments and the secondary structure elements they are part of show a clear trend toward conserving a rigid backbone. We therefore propose that backbone rigidity is a fundamental physical feature conserved by proteins that can provide important insights into their folding mechanisms and stability.
Affiliation(s)
- Rita Pancsa
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Daniele Raimondi
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Elisa Cilia
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium
- Wim F Vranken
  - Structural Biology Brussels, Vrije Universiteit Brussel, Brussels, Belgium

9
Sacquin-Mora S. Fold and flexibility: what can proteins' mechanical properties tell us about their folding nucleus? J R Soc Interface 2016; 12:rsif.2015.0876. [PMID: 26577596 DOI: 10.1098/rsif.2015.0876] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Indexed: 01/31/2023]
Abstract
The determination of a protein's folding nucleus, i.e. a set of native contacts playing an important role during its folding process, remains an elusive yet essential problem in biochemistry. In this work, we investigate the mechanical properties of 70 protein structures belonging to 14 protein families presenting various folds using coarse-grain Brownian dynamics simulations. The resulting rigidity profiles combined with multiple sequence alignments show that a limited set of rigid residues, which we call the consensus nucleus, occupy conserved positions along the protein sequence. These residues' side chains form a tight interaction network within the protein's core, thus making our consensus nuclei potential folding nuclei. A review of experimental and theoretical literature shows that most (above 80%) of these residues were indeed identified as folding nucleus members in earlier studies.
Affiliation(s)
- Sophie Sacquin-Mora
  - Laboratoire de Biochimie Théorique, CNRS UPR9080, Institut de Biologie Physico-Chimique, 13 rue Pierre et Marie Curie, 75005 Paris, France

10
Nelson ED, Grishin NV. Evolution of off-lattice model proteins under ligand binding constraints. Phys Rev E 2016; 94:022410. [PMID: 27627338 DOI: 10.1103/physreve.94.022410] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 03/10/2016] [Indexed: 12/12/2022]
Abstract
We investigate protein evolution using an off-lattice polymer model evolved to imitate the behavior of small enzymes. Model proteins evolve through mutations to nucleotide sequences (including insertions and deletions) and are selected to fold and maintain a specific binding site compatible with a model ligand. We show that this requirement is, in itself, sufficient to maintain an ordered folding domain, and we compare it to the requirement of folding an ordered (but otherwise unrestricted) domain. We measure rates of amino acid change as a function of local environment properties such as solvent exposure, packing density, and distance from the active site, as well as overall rates of sequence and structure change, both along and among model lineages in star phylogenies. The model recapitulates essentially all of the behavior found in protein phylogenetic analyses, and predicts that amino acid substitution rates vary linearly with distance from the binding site.
Affiliation(s)
- Erik D Nelson
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, 6001 Forest Park Blvd., Room ND10.124, Dallas, Texas 75235-9050, USA
- Nick V Grishin
  - Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, 6001 Forest Park Blvd., Room ND10.124, Dallas, Texas 75235-9050, USA

11
Jeon J, Arnold R, Singh F, Teyra J, Braun T, Kim PM. PAT: predictor for structured units and its application for the optimization of target molecules for the generation of synthetic antibodies. BMC Bioinformatics 2016; 17:150. [PMID: 27039071 PMCID: PMC4818438 DOI: 10.1186/s12859-016-1001-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/17/2016] [Accepted: 03/23/2016] [Indexed: 11/22/2022]
Abstract
Background: The identification of structured units in a protein sequence is an important first step for most biochemical studies. Importantly for this study, the identification of stable structured regions is a crucial first step to generate novel synthetic antibodies. While many approaches to find domains or predict structured regions exist, important limitations remain, such as the optimization of domain boundaries and the lack of identification of non-domain structured units. Moreover, no integrated tool exists to find and optimize structural domains within protein sequences. Results: Here, we describe a new tool, PAT (http://www.kimlab.org/software/pat), that can efficiently identify both domains (with optimized boundaries) and non-domain putative structured units. PAT automatically analyzes various structural properties, evaluates the folding stability, and reports possible structural domains in a given protein sequence. To evaluate the reliability of PAT, we applied it to identify antibody target molecules based on the notion that soluble and well-defined protein secondary and tertiary structures are appropriate target molecules for synthetic antibodies. Conclusion: PAT is an efficient and sensitive tool to identify structured units. A performance analysis shows that PAT can characterize structurally well-defined regions in a given sequence and outperforms other efforts to define reliable boundaries of domains. Specifically, PAT successfully identifies experimentally confirmed target molecules for antibody generation. PAT also offers the pre-calculated results of 20,210 human proteins to accelerate common queries. PAT can therefore help to investigate large-scale structured domains and improve the success rate for synthetic antibody generation.
Affiliation(s)
- Jouhyun Jeon
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, M5S 3E1, ON, Canada
- Roland Arnold
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, M5S 3E1, ON, Canada
- Fateh Singh
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, M5S 3E1, ON, Canada
- Joan Teyra
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, M5S 3E1, ON, Canada
- Tatjana Braun
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, M5S 3E1, ON, Canada
- Philip M Kim
  - Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, M5S 3E1, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, M5S 3E1, ON, Canada
  - Department of Computer Science, University of Toronto, Toronto, M5S 3E1, ON, Canada

12
Echave J, Spielman SJ, Wilke CO. Causes of evolutionary rate variation among protein sites. Nat Rev Genet 2016; 17:109-21. [PMID: 26781812 DOI: 10.1038/nrg.2015.18] [Citation(s) in RCA: 176] [Impact Index Per Article: 22.0] [Indexed: 12/13/2022]
Abstract
It has long been recognized that certain sites within a protein, such as sites in the protein core or catalytic residues in enzymes, are evolutionarily more conserved than other sites. However, our understanding of rate variation among sites remains surprisingly limited. Recent progress to address this includes the development of a wide array of reliable methods to estimate site-specific substitution rates from sequence alignments. In addition, several molecular traits have been identified that correlate with site-specific mutation rates, and novel mechanistic biophysical models have been proposed to explain the observed correlations. Nonetheless, current models explain, at best, approximately 60% of the observed variance, highlighting the limitations of current methods and models and the need for new research directions.
Affiliation(s)
- Julian Echave
  - Escuela de Ciencia y Tecnología, Universidad Nacional de San Martín, 1650 San Martín, Buenos Aires, Argentina
- Stephanie J Spielman
  - Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, Texas 78712, USA
- Claus O Wilke
  - Department of Integrative Biology, Center for Computational Biology and Bioinformatics, and Institute for Cellular and Molecular Biology, The University of Texas at Austin, Austin, Texas 78712, USA

13
Xia X, Longo LM, Sutherland MA, Blaber M. Evolution of a protein folding nucleus. Protein Sci 2015; 25:1227-40. [PMID: 26610273 DOI: 10.1002/pro.2848] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Received: 10/15/2015] [Accepted: 11/10/2015] [Indexed: 12/22/2022]
Abstract
The folding nucleus (FN) is a cryptic element within protein primary structure that enables an efficient folding pathway and is the postulated heritable element in the evolution of protein architecture; however, almost nothing is known regarding how the FN structurally changes as complex protein architecture evolves from simpler peptide motifs. We report characterization of the FN of a designed purely symmetric β-trefoil protein by ϕ-value analysis. We compare the structure and folding properties of key foldable intermediates along the evolutionary trajectory of the β-trefoil. The results show structural acquisition of the FN during gene fusion events, incorporating novel turn structure created by gene fusion. Furthermore, the FN is adjusted by circular permutation in response to destabilizing functional mutation. FN plasticity by way of circular permutation is made possible by the intrinsic C3 cyclic symmetry of the β-trefoil architecture, identifying a possible selective advantage that helps explain the prevalence of cyclic structural symmetry in the proteome.
Affiliation(s)
- Xue Xia
  - Department of Biomedical Sciences, College of Medicine, Florida State University, Tallahassee, Florida, 32306-4300
- Liam M Longo
  - Department of Biomedical Sciences, College of Medicine, Florida State University, Tallahassee, Florida, 32306-4300
  - Department of Biological Chemistry, Weizmann Institute of Science, Rehovot, Israel
- Mason A Sutherland
  - Department of Biomedical Sciences, College of Medicine, Florida State University, Tallahassee, Florida, 32306-4300
- Michael Blaber
  - Department of Biomedical Sciences, College of Medicine, Florida State University, Tallahassee, Florida, 32306-4300

14
Nepal R, Spencer J, Bhogal G, Nedunuri A, Poelman T, Kamath T, Chung E, Kantardjieff K, Gottlieb A, Lustig B. Logistic regression models to predict solvent accessible residues using sequence- and homology-based qualitative and quantitative descriptors applied to a domain-complete X-ray structure learning set. J Appl Crystallogr 2015; 48:1976-1984. [PMID: 26664348 PMCID: PMC4665666 DOI: 10.1107/s1600576715018531] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 11/25/2014] [Accepted: 10/03/2015] [Indexed: 11/11/2022]
Abstract
A working example of relative solvent accessibility (RSA) prediction for proteins is presented. Novel logistic regression models with various qualitative descriptors that include amino acid type and quantitative descriptors that include 20- and six-term sequence entropy have been built and validated. A domain-complete learning set of over 1300 proteins is used to fit initial models with various sequence homology descriptors as well as query residue qualitative descriptors. Homology descriptors are derived from BLASTp sequence alignments, whereas the RSA values are determined directly from the crystal structure. The logistic regression models are fitted using dichotomous responses indicating buried or accessible solvent, with binary classifications obtained from the RSA values. The fitted models determine binary predictions of residue solvent accessibility with accuracies comparable to other less computationally intensive methods using the standard RSA threshold criteria 20 and 25% as solvent accessible. When an additional non-homology descriptor describing Lobanov-Galzitskaya residue disorder propensity is included, incremental improvements in accuracy are achieved with 25% threshold accuracies of 76.12 and 74.79% for the Manesh-215 and CASP(8+9) test sets, respectively. Moreover, the described software and the accompanying learning and validation sets allow students and researchers to explore the utility of RSA prediction with simple, physically intuitive models in any number of related applications.
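As an illustration of two ideas named in this abstract, a per-position sequence entropy and a binary buried/accessible label derived from an RSA threshold, here is a minimal Python sketch. It is not the authors' software. The function names `column_entropy` and `is_accessible`, the decision to ignore gap characters, and the use of `>=` at the threshold are assumptions made for illustration; only the 20% and 25% thresholds come from the abstract.

```python
import math
from collections import Counter


def column_entropy(column: str) -> float:
    """Shannon entropy (in bits) of one alignment column.

    Assumption: any non-letter character (e.g. '-' for a gap) is ignored.
    """
    residues = [aa for aa in column.upper() if aa.isalpha()]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = 0.0
    for n in counts.values():
        p = n / total
        entropy -= p * math.log2(p)
    return entropy


def is_accessible(rsa_percent: float, threshold: float = 25.0) -> bool:
    """Binary label from a relative solvent accessibility value in percent.

    Assumption: a residue at or above the threshold counts as accessible.
    """
    return rsa_percent >= threshold


if __name__ == "__main__":
    # A fully conserved column has zero entropy; a mixed one has more.
    print(column_entropy("AAAA"), column_entropy("AALL"))
    print(is_accessible(30.0), is_accessible(22.0, threshold=20.0))
```

A low entropy value marks a conserved position, which is the kind of homology-derived quantity the abstract describes as a quantitative descriptor; the binary label is the dichotomous response a logistic regression would be fitted to.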
Affiliation(s)
- Reecha Nepal
  - Department of Chemistry, San Jose State University, San Jose, CA 95192-0101, USA
- Joanna Spencer
  - Department of Mathematics and Statistics, San Jose State University, San Jose, CA 95192-0101, USA
- Guneet Bhogal
  - Department of Biomedical, Chemical and Materials Engineering, San Jose State University, San Jose, CA 95192-0101, USA
- Amulya Nedunuri
  - Department of General Engineering, San Jose State University, San Jose, CA 95192-0101, USA
- Thomas Poelman
  - Department of Chemistry and Biochemistry, Cal Poly San Luis Obispo, San Luis Obispo, CA 93407, USA
- Thejas Kamath
  - Department of Bioengineering, University of California, San Diego, San Diego, CA 92093-0412, USA
- Edwin Chung
  - Department of Biomedical, Chemical and Materials Engineering, San Jose State University, San Jose, CA 95192-0101, USA
- Katherine Kantardjieff
  - College of Science and Mathematics, California State University San Marcos, San Marcos, CA 92096-0001, USA
- Andrea Gottlieb
  - Department of Mathematics and Statistics, San Jose State University, San Jose, CA 95192-0101, USA
- Brooke Lustig
  - Department of Chemistry, San Jose State University, San Jose, CA 95192-0101, USA

15
Tripathi S, Waxham MN, Cheung MS, Liu Y. Lessons in Protein Design from Combined Evolution and Conformational Dynamics. Sci Rep 2015; 5:14259. [PMID: 26388515 PMCID: PMC4585694 DOI: 10.1038/srep14259] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 03/31/2015] [Accepted: 08/21/2015] [Indexed: 11/09/2022]
Abstract
Protein-protein interactions play important roles in the control of every cellular process. How natural selection has optimized protein design to produce molecules capable of binding to many partner proteins is a fascinating problem but not well understood. Here, we performed a combinatorial analysis of protein sequence evolution and conformational dynamics to study how calmodulin (CaM), which plays essential roles in calcium signaling pathways, has adapted to bind to a large number of partner proteins. We discovered that amino acid residues in CaM can be partitioned into unique classes according to their degree of evolutionary conservation and local stability. Holistically, categorization of CaM residues into these classes reveals enriched physico-chemical interactions required for binding to diverse targets, balanced against the need to maintain the folding and structural modularity of CaM to achieve its overall function. The sequence-structure-function relationship of CaM provides a concrete example of the general principle of protein design. We have demonstrated the synergy between the fields of molecular evolution and protein biophysics and created a generalizable framework broadly applicable to the study of protein-protein interactions.
Collapse
Affiliation(s)
- Swarnendu Tripathi
- Department of Physics, University of Houston, Houston, TX.,Center for Theoretical Biological Physics, Rice University, Houston, TX
| | - M Neal Waxham
- Department of Neurobiology and Anatomy, University of Texas, Health Science Center, Houston, TX
| | - Margaret S Cheung
- Department of Physics, University of Houston, Houston, TX.,Center for Theoretical Biological Physics, Rice University, Houston, TX
| | - Yin Liu
- Department of Neurobiology and Anatomy, University of Texas, Health Science Center, Houston, TX
|
16
|
Faísca PF. Knotted proteins: A tangled tale of Structural Biology. Comput Struct Biotechnol J 2015; 13:459-68. [PMID: 26380658 PMCID: PMC4556803 DOI: 10.1016/j.csbj.2015.08.003] [Citation(s) in RCA: 58] [Impact Index Per Article: 6.4] [Received: 05/28/2015] [Revised: 07/31/2015] [Accepted: 08/07/2015] [Indexed: 01/19/2023] Open
Abstract
Knotted proteins have their native structures arranged in the form of an open knot. In the last ten years researchers have been making significant efforts to reveal their folding mechanism and understand which functional advantage(s) knots convey to their carriers. Molecular simulations have been playing a fundamental role in this endeavor, and early computational predictions about the knotting mechanism have just been confirmed in wet lab experiments. Here we review a collection of simulation results that allow outlining the current status of the field of knotted proteins, and discuss directions for future research.
|
17
|
Bywater RP. Prediction of protein structural features from sequence data based on Shannon entropy and Kolmogorov complexity. PLoS One 2015; 10:e0119306. [PMID: 25856073 PMCID: PMC4391790 DOI: 10.1371/journal.pone.0119306] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 12/22/2013] [Accepted: 01/29/2015] [Indexed: 11/21/2022] Open
Abstract
While the genome of a given organism stores the information necessary for the organism to function and flourish, it is the proteins encoded by the genome that, perhaps more than anything else, characterize the phenotype of that organism. It is therefore not surprising that one of the many approaches to understanding and predicting protein folding and properties has come from genomics and, more specifically, from multiple sequence alignments. In this work I explore ways in which data derived from sequence alignments can be used to investigate, in a predictive way, three different aspects of protein structure: secondary structures, inter-residue contacts, and the dynamics of switching between different states of the protein. In particular, the use of Kolmogorov complexity has identified a novel pathway towards achieving these goals.
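As a rough illustration of how alignment-derived information of this kind can be quantified, the sketch below computes the per-column Shannon entropy of a toy multiple sequence alignment. It is not the author's procedure: the function names (`column_entropy`, `alignment_entropies`), the rule of ignoring gap characters, and the four-sequence alignment are all assumptions made for this example.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column.

    Gap characters ('-') are ignored; this is an assumption of the sketch.
    """
    counts = Counter(ch for ch in column if ch != "-")
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def alignment_entropies(seqs):
    """Entropy of every column of an alignment given as equal-length strings."""
    return [column_entropy(col) for col in zip(*seqs)]

# Toy alignment (hypothetical): column 0 is fully conserved,
# column 1 has two residue types, column 2 has four.
toy_alignment = ["ACD", "ACE", "AGF", "AGH"]
entropies = alignment_entropies(toy_alignment)
```

Low entropy marks a conserved column and high entropy a variable one, which is the usual starting point for conservation-based predictions.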
|
18
|
Matsuoka M, Sugita M, Kikuchi T. Implication of the cause of differences in 3D structures of proteins with high sequence identity based on analyses of amino acid sequences and 3D structures. BMC Res Notes 2014; 7:654. [PMID: 25231773 PMCID: PMC4180342 DOI: 10.1186/1756-0500-7-654] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 07/29/2014] [Accepted: 09/05/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Proteins that share a high sequence homology while exhibiting drastically different 3D structures are investigated in this study. Recently, artificial proteins related to the sequences of the GA and IgG binding GB domains of human serum albumin have been designed. These artificial proteins, referred to as GA and GB, share 98% amino acid sequence identity but exhibit different 3D structures, namely, a 3α bundle versus a 4β + α structure. Discriminating between their 3D structures based on their amino acid sequences is a very difficult problem. In the present work, in addition to using bioinformatics techniques, an analysis based on inter-residue average distance statistics is used to address this problem. RESULTS It was hard to distinguish which structure a given sequence would take only with the results of ordinary analyses like BLAST and conservation analyses. However, in addition to these analyses, with the analysis based on the inter-residue average distance statistics and our sequence tendency analysis, we could infer which part would play an important role in its structural formation. CONCLUSIONS The results suggest possible determinants of the different 3D structures for sequences with high sequence identity. The possibility of discriminating between the 3D structures based on the given sequences is also discussed.
Affiliation(s)
| | | | - Takeshi Kikuchi
- Department of Bioinformatics, College of Life Sciences, Ritsumeikan University, 1-1-1 Nojihigashi, Kusatsu, Shiga, Japan.
|
19
|
Soler MA, Nunes A, Faísca PFN. Effects of knot type in the folding of topologically complex lattice proteins. J Chem Phys 2014; 141:025101. [DOI: 10.1063/1.4886401] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Indexed: 01/31/2023] Open
|
20
|
Matsuoka M, Kikuchi T. Sequence analysis on the information of folding initiation segments in ferredoxin-like fold proteins. BMC Struct Biol 2014; 14:15. [PMID: 24884463 PMCID: PMC4055915 DOI: 10.1186/1472-6807-14-15] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Received: 12/23/2013] [Accepted: 05/15/2014] [Indexed: 02/06/2023]
Abstract
BACKGROUND While some studies have shown that the 3D protein structures are more conserved than their amino acid sequences, other experimental studies have shown that even if two proteins share the same topology, they may have different folding pathways. There are many studies investigating this issue with molecular dynamics or Go-like model simulations; however, one should be able to obtain the same information by analyzing the proteins' amino acid sequences, if the sequences contain all the information about the 3D structures. In this study, we use information about protein sequences to predict the location of their folding segments. We focus on proteins with a ferredoxin-like fold, which has a characteristic topology. Some of these proteins have different folding segments. RESULTS Despite the simplicity of our methods, we are able to correctly determine the experimentally identified folding segments by predicting the location of the compact regions considered to play an important role in structural formation. We also apply our sequence analyses to some homologues of each protein and confirm that there are highly conserved folding segments despite the homologues' sequence diversity. These homologues have similar folding segments even though the homology of two proteins' sequences is not so high. CONCLUSION Our analyses have proven useful for investigating the common or different folding features of the proteins studied.
Affiliation(s)
| | - Takeshi Kikuchi
- Department of Bioinformatics, College of Life Sciences, Ritsumeikan University, 1-1-1 Nojihigashi, Kusatsu, Shiga 525-8577, Japan.
|
21
|
Mannige RV. Origination of the Protein Fold Repertoire from Oily Pluripotent Peptides. Proteomes 2014; 2:154-168. [PMID: 28250375 PMCID: PMC5302733 DOI: 10.3390/proteomes2020154] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 12/17/2013] [Revised: 02/27/2014] [Accepted: 03/20/2014] [Indexed: 11/16/2022] Open
Abstract
While the repertoire of protein folds that exists today underlies most of life’s capabilities, our mechanistic picture of protein fold origination is incomplete. This paper discusses a hypothetical mechanism for the emergence of the protein fold repertoire from highly dynamic and collapsed peptides, exemplified by peptides with high oil content or hydrophobicity. These peptides are called pluripotent to emphasize their capacity to evolve into numerous folds transiently available to them. As evidence, the paper will discuss previous simulation work on the superior fold evolvability of oily peptides, trace (“fossil”) evidence within proteomes seen today, and a general relationship between protein dynamism and evolvability. Aside from implications on the origination of protein folds, the hypothesis implies that the vanishing utility of a random peptide in protein origination may be relatively exaggerated, as some random peptides with a certain composition (e.g., oily) may fare better than others. In later sections, the hypothesis is discussed in the context of existing discussions regarding the spontaneous origination of biomolecules.
Affiliation(s)
- Ranjan V Mannige
- Molecular Foundry, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720,USA.
|
22
|
Zhu L, Kurt N, Choi J, Lapidus LJ, Cavagnero S. Sub-millisecond chain collapse of the Escherichia coli globin ApoHmpH. J Phys Chem B 2013; 117:7868-77. [PMID: 23750553 DOI: 10.1021/jp400174e] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Abstract
Myoglobins are ubiquitous proteins that play a seminal role in oxygen storage, transport, and NO metabolism. The folding mechanism of apomyoglobins from different species has been studied to a fair extent over the last two decades. However, integrated investigations of the entire process, including both the early (sub-ms) and late (ms-s) folding stages, have been missing. Here, we study the folding kinetics of the single-Trp Escherichia coli globin apoHmpH via a combination of continuous-flow microfluidic and stopped-flow approaches. A rich series of molecular events emerges, spanning a very wide temporal range covering more than 7 orders of magnitude, from sub-microseconds to tens of seconds. Variations in fluorescence intensity and spectral shifts reveal that the protein region around Trp120 undergoes a fast collapse within the 8 μs mixing time and gradually reaches a native-like conformation with a half-life of 144 μs from refolding initiation. There are no further fluorescence changes beyond ca. 800 μs, and folding proceeds much more slowly, up to 20 s, with acquisition of the missing helicity (ca. 30%), long after consolidation of core compaction. The picture that emerges is a gradual acquisition of native structure on a free-energy landscape with few large barriers. Interestingly, the single tryptophan, which lies within the main folding core of globins, senses some local structural consolidation events after establishment of native-like core polarity (i.e., likely after core dehydration). In all, this work highlights how the main core of the globin fold is capable of becoming fully native efficiently, on the sub-millisecond time scale.
Affiliation(s)
- Li Zhu
- Department of Physics and Astronomy, Michigan State University, East Lansing, Michigan 48824, USA
|
23
|
Bécu JM, Pelé J, Rodien P, Abdi H, Chabbert M. Structural evolution of G-protein-coupled receptors: a sequence space approach. Methods Enzymol 2013; 520:49-66. [PMID: 23332695 DOI: 10.1016/b978-0-12-391861-1.00003-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/11/2023]
Abstract
Class A G-protein-coupled receptors (GPCRs) provide a fascinating example of evolutionary success. In this review, we discuss how metric multidimensional scaling (MDS), a multivariate analysis method, complements traditional tree-based phylogenetic methods and helps decipher the mechanisms that drove the evolution of class A GPCRs. MDS provides low-dimensional representations of a distance matrix. Applied to a multiple sequence alignment, MDS represents the sequences in a Euclidean space as points whose interdistances are as close as possible to the distances in the alignment (the so-called sequence space). We detail how to perform the MDS analysis of a multiple sequence alignment and how to analyze and interpret the resulting sequence space. We also show that the projection of supplementary data (a property of the MDS method) can be used to straightforwardly monitor the evolutionary drift of specific subfamilies. The sequence space of class A GPCRs reveals the key role of mutations at the level of the TM2 and TM5 proline residues in the evolution of class A GPCRs.
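The idea of placing aligned sequences in a Euclidean "sequence space" can be sketched with classical (metric) MDS. The example below is a minimal pure-Python illustration, not the procedure used in the review: the `p_distance` measure (fraction of differing aligned positions), the power-iteration eigen-solver, the function names, and the toy sequences are all assumptions chosen to keep the sketch self-contained.

```python
import math

def p_distance(a, b):
    """Fraction of aligned positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def classical_mds(dist, dims=2, iters=500):
    """Classical MDS of a symmetric distance matrix, using power iteration."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(r) / n for r in sq]
    grand = sum(row_mean) / n
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dims for _ in range(n)]
    for k in range(dims):
        v = [1.0 + i + k * i * i for i in range(n)]  # arbitrary start vector
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # matrix is numerically zero; nothing to extract
                break
            v = [x / norm for x in w]
        # The Rayleigh quotient estimates the eigenvalue belonging to v.
        lam = sum(v[i] * b[i][j] * v[j] for i in range(n) for j in range(n))
        if lam <= 1e-12:
            break  # remaining axes carry no positive variance
        for i in range(n):
            coords[i][k] = math.sqrt(lam) * v[i]
        # Deflate so the next pass converges to the next eigenvector.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords

# Toy aligned sequences (hypothetical), embedded in two dimensions.
seqs = ["ACDEFG", "ACDEFH", "ACKKFH"]
dist = [[p_distance(a, b) for b in seqs] for a in seqs]
points = classical_mds(dist, dims=2)
```

For real alignments a numerical library's symmetric eigen-solver would normally replace the power iteration; it is used here only to avoid external dependencies.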
|
24
|
Aledo JC, Valverde H, Ruíz-Camacho M. Thermodynamic stability explains the differential evolutionary dynamics of cytochrome b and COX I in mammals. J Mol Evol 2012; 74:69-80. [PMID: 22362464 DOI: 10.1007/s00239-012-9489-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 09/11/2011] [Accepted: 02/02/2012] [Indexed: 12/29/2022]
Abstract
By using a combination of evolutionary and structural data from 231 species, we have addressed the relationship between evolution and structural features of cytochrome b and COX I, two mtDNA-encoded proteins. The interior of cytochrome b, in contrast to that of COX I, exhibits a remarkable tolerance to changes. The higher evolvability of cytochrome b contrasts with the lower rate of synonymous substitutions of its gene when compared to that of COX I, suggesting that the latter is subjected to stronger purifying selection. We present evidence that the stability effect of mutations (ΔΔG) may underlie this differential behaviour.
Affiliation(s)
- Juan Carlos Aledo
- Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Universidad de Málaga, 29071, Málaga, Spain.
|
25
|
Determinants, discriminants, conserved residues--a heuristic approach to detection of functional divergence in protein families. PLoS One 2011; 6:e24382. [PMID: 21931701 PMCID: PMC3171465 DOI: 10.1371/journal.pone.0024382] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 04/15/2011] [Accepted: 08/08/2011] [Indexed: 11/19/2022] Open
Abstract
In this work, belonging to the field of comparative analysis of protein sequences, we focus on detection of functional specialization on the residue level. As the input, we take a set of sequences divided into groups of orthologues, each group known to be responsible for a different function. This provides two independent pieces of information: within-group conservation and overlap in amino acid type across groups. We build our discussion around a set of scoring functions that keep the two separate and make the signal easy to trace back to its source. We propose a heuristic description of functional divergence that includes residue type exchangeability, both in the conservation and in the overlap measure, and does not make any assumptions on the rate of evolution in the groups other than the one under consideration. Residue types acceptable at a certain position within an orthologous group are described as a distribution which evolves in time, starting from a single ancestral type, and is subject to constraints that can be inferred only indirectly. To estimate the strength of the constraints, we compare the observed degrees of conservation and overlap with those expected in the hypothetical case of a freely evolving distribution. Our description matches the experiment well, but we also conclude that any attempt to capture the evolutionary behavior of specificity determining residues in terms of a scalar function will be tentative, because no single model can cover the variety of evolutionary behavior such residues exhibit. In particular, models expecting the same type of evolutionary behavior across functionally divergent groups tend to miss a portion of information otherwise retrievable by the conservation and overlap measures they use.
|
26
|
Coluzza I. A coarse-grained approach to protein design: learning from design to understand folding. PLoS One 2011; 6:e20853. [PMID: 21747930 PMCID: PMC3128589 DOI: 10.1371/journal.pone.0020853] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Received: 03/17/2011] [Accepted: 05/10/2011] [Indexed: 11/20/2022] Open
Abstract
Computational studies have contributed greatly to our current understanding of the complex behavior of protein molecules; nevertheless, a complete characterization of their free energy landscape still represents a major challenge. Here, we introduce a new coarse-grained approach that allows for an extensive sampling of the conformational space of a large number of sequences. We explicitly discuss its application in protein design, and by studying four representative proteins, we show that the method generates sequences with a relatively smooth free energy surface directed towards the target structures.
Affiliation(s)
- Ivan Coluzza
- Department of Physics, University of Vienna, Vienna, Austria.
|
27
|
Luccioli S, Imparato A, Lepri S, Piazza F, Torcini A. Discrete breathers in a realistic coarse-grained model of proteins. Phys Biol 2011; 8:046008. [PMID: 21670494 DOI: 10.1088/1478-3975/8/4/046008] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 11/11/2022]
Abstract
We report the results of molecular dynamics simulations of an off-lattice protein model featuring a physical force-field and amino-acid sequence. We show that localized modes of nonlinear origin, discrete breathers (DBs), emerge naturally as continuations of a subset of high-frequency normal modes residing at specific sites dictated by the native fold. DBs are time-periodic, space-localized vibrational modes that exist generically in nonlinear discrete systems and are known for their resilience and ability to concentrate energy for long times. In the case of the small β-barrel structure that we consider, DB-mediated localization occurs on the turns connecting the strands. At high energies, DBs stabilize the structure by concentrating energy on a few sites, while their collapse marks the onset of large-amplitude fluctuations of the protein. Furthermore, we show how breathers develop as energy-accumulating centres following perturbations even at distant locations, thus mediating efficient and irreversible energy transfers. Remarkably, due to the presence of angular potentials, the breather induces a local static distortion of the native fold. Altogether, the combination of these two nonlinear effects may provide a ready means for remotely controlling local conformational changes in proteins.
Affiliation(s)
- Stefano Luccioli
- CNR-Consiglio Nazionale delle Ricerche, Istituto dei Sistemi Complessi, via Madonna del Piano 10, I-50019 Sesto Fiorentino, Italy.
|
28
|
Kumauchi M, Kaledhonkar S, Philip AF, Wycoff J, Hara M, Li Y, Xie A, Hoff WD. A conserved helical capping hydrogen bond in PAS domains controls signaling kinetics in the superfamily prototype photoactive yellow protein. J Am Chem Soc 2011; 132:15820-30. [PMID: 20954744 DOI: 10.1021/ja107716r] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022]
Abstract
PAS domains form a divergent protein superfamily with more than 20 000 members that perform a wide array of sensing and regulatory functions in all three domains of life. Only nine residues are well-conserved in PAS domains, with an Asn residue at the start of α-helix 3 showing the strongest conservation. The molecular functions of these nine conserved residues are unknown. We use static and time-resolved visible and FTIR spectroscopy to investigate receptor activation in the photosensor photoactive yellow protein (PYP), a PAS domain prototype. The N43A and N43S mutants allow an investigation of the role of side-chain hydrogen bonding at this conserved position. The mutants exhibit a blue-shifted visible absorbance maximum and up-shifted chromophore pK(a). Disruption of the hydrogen bonds in N43A PYP causes both a reduction in protein stability and a 3400-fold increase in the lifetime of the signaling state of this photoreceptor. A significant part of this increase in lifetime can be attributed to the helical capping interaction of Asn43. This extends the known importance of helical capping for protein structure to regulating functional protein kinetics. A model for PYP activation has been proposed in which side-chain hydrogen bonding of Asn43 is critical for relaying light-induced conformational changes. However, FTIR spectroscopy shows that both Asn43 mutants retain full allosteric transmission of structural changes. Analysis of 30 available high-resolution structures of PAS domains reveals that the side-chain hydrogen bonding of residue 43 but not residue identity is highly conserved and suggests that its helical cap affects signaling kinetics in other PAS domains.
Affiliation(s)
- Masato Kumauchi
- Department of Microbiology and Molecular Genetics, Oklahoma State University, Stillwater, Oklahoma 74078, United States
|
29
|
Gromiha MM. Influence of long-range contacts and surrounding residues on the transition state structures of proteins. Anal Biochem 2011; 408:32-6. [DOI: 10.1016/j.ab.2010.08.029] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 04/16/2010] [Revised: 08/16/2010] [Accepted: 08/22/2010] [Indexed: 10/19/2022]
|
30
|
Robustness and evolvability in the functional anatomy of a PER-ARNT-SIM (PAS) domain. Proc Natl Acad Sci U S A 2010; 107:17986-91. [PMID: 20889915 DOI: 10.1073/pnas.1004823107] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022] Open
Abstract
The robustness of proteins against point mutations implies that only a small subset of residues determines functional properties. We test this prediction using photoactive yellow protein (PYP), a 125-residue prototype of the PER-ARNT-SIM (PAS) domain superfamily of signaling proteins. PAS domains are defined by a small number of conserved residues of unknown function. We report high-throughput biophysical measurements on a complete Ala scan set of purified PYP mutants. The dataset of 1,193 values on active site properties, functional kinetics, stability, and production level reveals that 124 mutants retain the characteristic photocycle of PYP, but that the majority of substitutions significantly alter functional properties. Only 35% of substitutions that strongly affect function are located at the active site. Unexpectedly, most PAS-conserved residues are required for maintaining protein production. PAS domain activation often involves conformational changes in α-helices linked to the PAS core. However, the mechanism of transmission and kinetic regulation of allosteric structural changes from the PAS domain to these helices is not clear. The Ala scan data reveal interactions governing allosteric switching in PYP. The photocycle kinetics is significantly altered by substitutions at 58 positions and spans a 3,000-fold range. Nine residues that dock the N-terminal α-helices of PYP to its PAS core regulate signaling kinetics. Ile39 and Asn43 are identified as part of a mechanism for regulating allosteric switching that is conserved among PAS domains. These results show that PYP combines robustness with a high degree of evolvability and imply production level as an important factor in protein evolution.
|
31
|
Stojanović SĐ, Zarić BL, Zarić SD. Protein subunit interfaces: a statistical analysis of hot spots in Sm proteins. J Mol Model 2010; 16:1743-51. [DOI: 10.1007/s00894-010-0787-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 12/18/2009] [Accepted: 06/16/2010] [Indexed: 11/30/2022]
|
32
|
What lessons can be learned from studying the folding of homologous proteins? Methods 2010; 52:38-50. [PMID: 20570731 PMCID: PMC2965948 DOI: 10.1016/j.ymeth.2010.06.003] [Citation(s) in RCA: 48] [Impact Index Per Article: 3.4] [Received: 04/06/2010] [Revised: 05/25/2010] [Accepted: 06/01/2010] [Indexed: 01/30/2023] Open
Abstract
The studies of the folding of structurally related proteins have proved to be a very important tool for investigating protein folding. Here we review some of the insights that have been gained from such studies. Our highlighted studies show just how such an investigation should be designed and emphasise the importance of the synergy between experiment and theory. We also stress the importance of choosing the right system carefully, exploiting the excellent structural and sequence databases at our disposal.
|
33
|
Hills RD, Kathuria SV, Wallace LA, Day IJ, Brooks CL, Matthews CR. Topological frustration in beta alpha-repeat proteins: sequence diversity modulates the conserved folding mechanisms of alpha/beta/alpha sandwich proteins. J Mol Biol 2010; 398:332-50. [PMID: 20226790 DOI: 10.1016/j.jmb.2010.03.001] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Received: 09/09/2009] [Revised: 02/27/2010] [Accepted: 03/03/2010] [Indexed: 10/19/2022]
Abstract
The thermodynamic hypothesis of Anfinsen postulates that structures and stabilities of globular proteins are determined by their amino acid sequences. Chain topology, however, is known to influence the folding reaction, in that motifs with a preponderance of local interactions typically fold more rapidly than those with a larger fraction of nonlocal interactions. Together, the topology and sequence can modulate the energy landscape and influence the rate at which the protein folds to the native conformation. To explore the relationship of sequence and topology in the folding of beta alpha-repeat proteins, which are dominated by local interactions, we performed a combined experimental and simulation analysis on two members of the flavodoxin-like, alpha/beta/alpha sandwich fold. Spo0F and the N-terminal receiver domain of NtrC (NT-NtrC) have similar topologies but low sequence identity, enabling a test of the effects of sequence on folding. Experimental results demonstrated that both response-regulator proteins fold via parallel channels through highly structured submillisecond intermediates before accessing their cis prolyl peptide bond-containing native conformations. Global analysis of the experimental results preferentially places these intermediates off the productive folding pathway. Sequence-sensitive Gō-model simulations conclude that frustration in the folding in Spo0F, corresponding to the appearance of the off-pathway intermediate, reflects competition for intra-subdomain van der Waals contacts between its N- and C-terminal subdomains. The extent of transient, premature structure appears to correlate with the number of isoleucine, leucine, and valine (ILV) side chains that form a large sequence-local cluster involving the central beta-sheet and helices alpha2, alpha 3, and alpha 4. The failure to detect the off-pathway species in the simulations of NT-NtrC may reflect the reduced number of ILV side chains in its corresponding hydrophobic cluster. 
The location of the hydrophobic clusters in the structure may also be related to the differing functional properties of these response regulators. Comparison with the results of previous experimental and simulation analyses on the homologous CheY argues that prematurely folded unproductive intermediates are a common property of the beta alpha-repeat motif.
Affiliation(s)
- Ronald D Hills
- Department of Molecular Biology and Kellogg School of Science and Technology, The Scripps Research Institute, 10550 North Torrey Pines Road TPC6, La Jolla, CA 92037, USA
|
34
|
Levy R, Edelman M, Sobolev V. Prediction of 3D metal binding sites from translated gene sequences based on remote-homology templates. Proteins 2010; 76:365-74. [PMID: 19173310 DOI: 10.1002/prot.22352] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Indexed: 11/05/2022]
Abstract
Database-scale analysis was performed to determine whether structural models, based on remote homologues, are effective in predicting 3D transition metal binding sites in proteins directly from translated gene sequences. The extent by which side chain modeling alone reduces sensitivity and selectivity is shown to be <10%. Surprisingly, selectivity was not dependent on the level of sequence homology between template and target, or on the presence of a metal ion in the structural template. Applying a modification of the CHED algorithm (Babor et al., Proteins 2008;70:208-217) and machine learning filters, a selectivity of approximately 90% was achieved for protein sequences using unrelated structural templates over a sequence identity range of 18-100%. Below approximately 18% identity, the number of analyzable target-template pairs and predictability of metal binding sites falls off sharply. A full third of structural templates were found to have target partners only in the remote homology range of 18-30%. In this range, nonmetal-binding templates are calculated to be the majority and serve to predict with 50% sensitivity at the geometric level. Overall, sensitivity at the geometric level for targets having templates in the 18-30% sequence identity range is 73%, with an average of one false positive site per true site. Protein sequences described as "unknown" in the UniProt database and composed largely of unidentified genome project sequences were studied and metal binding sites predicted. A web server for prediction of metal binding sites from protein sequence is provided.
Affiliation(s)
- Ronen Levy
- Department of Plant Sciences, Weizmann Institute of Science, Rehovot, Israel
|
35
|
Exploring the sequence determinants of amyloid structure using position-specific scoring matrices. Nat Methods 2010; 7:237-42. [PMID: 20154676 DOI: 10.1038/nmeth.1432] [Citation(s) in RCA: 499] [Impact Index Per Article: 35.6] [Received: 09/18/2009] [Accepted: 12/16/2009] [Indexed: 01/31/2023]
Abstract
Protein aggregation results in beta-sheet-like assemblies that adopt either a variety of amorphous morphologies or ordered amyloid-like structures. These differences in structure also reflect biological differences; amyloid and amorphous beta-sheet aggregates have different chaperone affinities, accumulate in different cellular locations and are degraded by different mechanisms. Further, amyloid function depends entirely on a high intrinsic degree of order. Here we experimentally explored the sequence space of amyloid hexapeptides and used the derived data to build Waltz, a web-based tool that uses a position-specific scoring matrix to determine amyloid-forming sequences. Waltz allows users to identify and better distinguish between amyloid sequences and amorphous beta-sheet aggregates and allowed us to identify amyloid-forming regions in functional amyloids.
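As a rough illustration of how a position-specific scoring matrix can rank hexapeptide windows, the sketch below builds a log-odds matrix from a small set of aligned peptides and slides it along a sequence. It is not the Waltz matrix or its thresholds: the function names, the uniform background, the pseudocount, and any training peptides passed in are assumptions made for this example only.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(peptides, pseudocount=1.0):
    """Build a log2-odds matrix from equal-length peptides (uniform background assumed)."""
    length = len(peptides[0])
    background = 1.0 / len(AMINO_ACIDS)
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        # One column per position: residue -> log2(observed frequency / background).
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pssm.append(column)
    return pssm


def score_window(pssm, window):
    """Sum the per-position scores of one window."""
    return sum(column[aa] for column, aa in zip(pssm, window))


def scan(pssm, sequence):
    """Score every window of the matrix length; returns (start, window, score) tuples."""
    k = len(pssm)
    return [
        (i, sequence[i:i + k], score_window(pssm, sequence[i:i + k]))
        for i in range(len(sequence) - k + 1)
    ]
```

Calling `scan(build_pssm(training_peptides), sequence)` and sorting by score gives candidate regions; where to place a cutoff is a separate calibration question that this sketch does not address.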
|
36
|
Abstract
MOTIVATION: To test whether protein folding constraints and secondary structure sequence preferences significantly reduce the space of amino acid words in proteins, we compared the frequencies of four- and five-amino acid word clumps (independent words) in proteins to the frequencies predicted by four random sequence models. RESULTS: While the human proteome has many overrepresented word clumps, these words come from large protein families with biased compositions (e.g. Zn-fingers). In contrast, in a non-redundant sample of Pfam-AB, only 1% of four-amino acid word clumps (4.7% of 5mer words) are 2-fold overrepresented compared with our simplest random model [MC(0)], and 0.1% (4mers) to 0.5% (5mers) are 2-fold overrepresented compared with a window-shuffled random model. Using a false discovery rate q-value analysis, the number of exceptional four- or five-letter words in real proteins is similar to the number found when comparing words from one random model to another. Consensus overrepresented words are not enriched in conserved regions of proteins, but four-letter words are enriched 1.18- to 1.56-fold in alpha-helical secondary structures (but not beta-strands). Five-residue consensus exceptional words are enriched for alpha-helix 1.43- to 1.61-fold. Protein word preferences in regular secondary structure do not appear to significantly restrict the use of sequence words in unrelated proteins, although the consensus exceptional words have a secondary structure bias for alpha-helix. Globally, words in protein sequences appear to be under very few constraints; for the most part, they appear to be random. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
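The comparison against the simplest random model can be sketched as an observed/expected ratio per word. The example below takes MC(0) to mean a composition-only (zero-order) model in which a word's expected frequency is the product of its residue frequencies; that reading, the function name, and the use of plain overlapping counts instead of the paper's word clumps are all assumptions of this sketch.

```python
from collections import Counter
from math import prod


def kmer_enrichment(sequences, k=4):
    """Return {word: observed / expected} under a composition-only model."""
    # Single-residue frequencies pooled over all sequences.
    residue_counts = Counter("".join(sequences))
    total_residues = sum(residue_counts.values())
    freq = {aa: n / total_residues for aa, n in residue_counts.items()}

    # Overlapping k-mer counts (the paper counts clumps; this sketch does not).
    observed = Counter(
        seq[i:i + k] for seq in sequences for i in range(len(seq) - k + 1)
    )
    n_windows = sum(observed.values())

    # Expected count of a word = number of windows * product of residue frequencies.
    return {
        word: count / (n_windows * prod(freq[aa] for aa in word))
        for word, count in observed.items()
    }
```

A word with a ratio of 2 or more would count as 2-fold overrepresented in this simplified setting; assessing significance (for example with q-values, as in the abstract) is outside the scope of the sketch.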
Affiliation(s)
- Daniel T Lavelle
- Department of Biochemistry and Molecular Genetics, University of Virginia, Jordan Hall Box 800733, Charlottesville, VA 22908, USA
|
37
|
Stagg L, Samiotakis A, Homouz D, Cheung MS, Wittung-Stafshede P. Residue-specific analysis of frustration in the folding landscape of repeat beta/alpha protein apoflavodoxin. J Mol Biol 2009; 396:75-89. [PMID: 19913555 DOI: 10.1016/j.jmb.2009.11.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 06/24/2009] [Revised: 11/04/2009] [Accepted: 11/05/2009] [Indexed: 11/17/2022]
Abstract
Flavodoxin adopts the common repeat beta/alpha topology and folds in a complex kinetic reaction with intermediates. To better understand this reaction, we analyzed a set of Desulfovibrio desulfuricans apoflavodoxin variants with point mutations in most secondary structure elements by in vitro and in silico methods. By equilibrium unfolding experiments, we first revealed how different secondary structure elements contribute to overall protein resistance to heat and urea. Next, using stopped-flow mixing coupled with far-UV circular dichroism, we probed how individual residues affect the amount of structure formed in the experimentally detected burst-phase intermediate. Together with in silico folding route analysis of the same point-mutated variants and computation of growth in nucleation size during early folding, computer simulations suggested the presence of two competing folding nuclei at opposite sides of the central beta-strand 3 (i.e., at beta-strands 1 and 4), which cause early topological frustration (i.e., misfolding) in the folding landscape. Particularly, the extent of heterogeneity in folding nuclei growth correlates with the in vitro burst-phase circular dichroism amplitude. In addition, phi-value analysis (in vitro and in silico) of the overall folding barrier to apoflavodoxin's native state revealed that native-like interactions in most of the beta-strands must form in the transition state. Our study reveals that an imbalanced competition between the two sides of apoflavodoxin's central beta-sheet directs initial misfolding, while proper alignment on both sides of beta-strand 3 is necessary for productive folding.
Affiliation(s)
- Loren Stagg
- Department of Biochemistry and Cell Biology, Rice University, Houston, TX 77251, USA
|
38
|
Abstract
Progress in understanding protein folding makes it possible to simulate, with atomic detail, the evolution of amino-acid sequences folding to a given native conformation. A particularly attractive example is the HIV-1 protease, the main target of therapies to fight AIDS, which under drug pressure is able to develop resistance within a few months of the start of therapy. By comparing the results of simulations of the evolution of the protease with the corresponding proteomic data, one can approximately determine the value of the associated evolution pressure under which the enzyme has become and, as a consequence, map out the energy landscape in sequence space of the HIV-1 protease. It is found that there are several families of sequences folding to the native conformations of the enzyme. Each of these families is characterized by a different set of highly conserved ("hot") amino acids which play a critical role in the folding and stability of the protease. There are two main possibilities for the virus to move from one family to a different one: (a) in a single generation, through the concerted mutations of the hot amino acids, a highly unlikely event, (b) through a folding path (if it exists), again a very improbable event. In fact, the number of generations needed by the virus to change stepwise its sequence from one family to another is astronomically large. These results point to the "hot" segments of the protease as promising targets for a nonconventional inhibition strategy, likely not to create resistance.
Affiliation(s)
- G Tiana
- Department of Physics, University of Milano and INFN, via Celoria 16, 20133 Milano, Italy.
|
39
|
Comparing the functional roles of nonconserved sequence positions in homologous transcription repressors: implications for sequence/function analyses. J Mol Biol 2009; 395:785-802. [PMID: 19818797 DOI: 10.1016/j.jmb.2009.10.001] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Received: 08/03/2009] [Revised: 10/01/2009] [Accepted: 10/02/2009] [Indexed: 11/21/2022]
Abstract
The explosion of protein sequences deduced from genetic code has led to both a problem and a potential resource: Efficient data use requires interpreting the functional impact of sequence change without experimentally characterizing each protein variant. Several groups have hypothesized that interpretation could be aided by analyzing the sequences of naturally occurring homologues. To that end, myriad sequence/function analyses have been developed to predict which conserved, semi-conserved, and nonconserved positions are functionally important. These positions must be discriminated from the nonconserved positions that are functionally silent. However, the assumptions that underlie sequence analyses are based on experimental results that are sparse and usually designed to address different questions. Here, we use three homologues from a test family common to bioinformatics-the LacI/GalR transcription repressors-to test a common assumption: If a position is functionally important for one family member, it has similar importance in all homologues. We generated experimental sequence/function information for each nonconserved position in the 18 amino acids that link the DNA-binding and regulatory domains of three LacI/GalR homologues. We find that the functional importance of each position is preserved among the three linkers, albeit to different degrees. We also find that every linker position contributes to function, which has twofold implications. (1) Since the linker positions range from highly conserved to semi-conserved to nonconserved and contribute to affinity, selectivity, and allosteric response, we assert that sequence/function analyses must identify positions in the LacI/GalR linkers to be qualified as "successful". Many analyses overlook this region since most of the residues do not directly contact ligand. (2) No position in the LacI/GalR linker is functionally silent. 
This finding is inconsistent with another underlying principle of many analyses: Using sequence sets to discriminate important from non-contributing positions obligates silent positions, which denotes that most homologues tolerate a variety of amino acid substitutions at the position without functional change. Instead, additional combinatorial mutants in the LacI/GalR linkers show that particular substitutions can be silent in a context-dependent manner. Thus, specific permutations of sequence change (rather than change at silent positions) would facilitate neutral drift during evolution. Finally, the combinatorial mutants also reveal functional synergy between semi- and nonconserved positions. Such functional relationships would be missed by analyses that rely primarily upon co-evolution.
|
40
|
Jayaraj V, Suhanya R, Vijayasarathy M, Anandagopu P, Rajasekaran E. Role of large hydrophobic residues in proteins. Bioinformation 2009; 3:409-12. [PMID: 19759817 PMCID: PMC2732037 DOI: 10.6026/97320630003409] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 01/05/2009] [Revised: 03/07/2009] [Accepted: 04/16/2009] [Indexed: 11/23/2022] Open
Abstract
Large Hydrophobic Residues (LHR) such as phenylalanine, isoleucine, leucine, methionine and valine play an important role in protein structure and activity. We describe the role of LHR in the complete set of protein sequences in 15 different species. That is, the distribution of LHR in different proteins of different species is reported. It is observed that the proteins prefer to have 27% of large hydrophobic residues in total and all along the sequence. It is also observed that proteins accumulate more LHR in their active sites. A window analysis on these protein sequences shows that the 27% of LHR is more frequent at a window length of 45 amino acids. The influenza virus and P. falciparum show a random distribution of LHR in their proteins compared to other model organisms.
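The window analysis described above can be approximated with a sliding count of F, I, L, M and V residues. This is a minimal sketch, not the authors' procedure: the function names are hypothetical, and the default window of 45 is simply taken from the abstract.

```python
# Large hydrophobic residues as listed in the abstract: Phe, Ile, Leu, Met, Val.
LHR = set("FILMV")


def lhr_fraction(sequence):
    """Fraction of large hydrophobic residues in the whole sequence."""
    return sum(aa in LHR for aa in sequence) / len(sequence)


def window_fractions(sequence, window=45):
    """LHR fraction in every window of the given length (empty list if too short)."""
    if len(sequence) < window:
        return []
    count = sum(aa in LHR for aa in sequence[:window])
    fractions = [count / window]
    for i in range(window, len(sequence)):
        # Slide by one residue: add the entering residue, drop the leaving one.
        count += (sequence[i] in LHR) - (sequence[i - window] in LHR)
        fractions.append(count / window)
    return fractions
```

Comparing the distribution of `window_fractions` values against the 27% figure, for several window lengths, would be one way to reproduce the kind of analysis the abstract reports.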
Affiliation(s)
- Veerasamy Jayaraj
- Department of Computer Application, Periyar Maniammai University, Thanjavur - 613403, Tamil Nadu, India
|
41
|
Zhou T, Weems M, Wilke CO. Translationally optimal codons associate with structurally sensitive sites in proteins. Mol Biol Evol 2009; 26:1571-80. [PMID: 19349643 DOI: 10.1093/molbev/msp070] [Citation(s) in RCA: 152] [Impact Index Per Article: 10.1] [Indexed: 12/20/2022] Open
Abstract
The mistranslation-induced protein misfolding hypothesis predicts that selection should prefer high-fidelity codons at sites at which translation errors are structurally disruptive and lead to protein misfolding and aggregation. To test this hypothesis, we analyzed the relationship between codon usage bias and protein structure in the genomes of four model organisms, Escherichia coli, yeast, fly, and mouse. Using both the Mantel-Haenszel procedure, which applies to categorical data, and a newly developed association test for continuous variables, we find that translationally optimal codons associate with buried residues and also with residues at sites where mutations lead to large changes in free energy (DeltaDeltaG). In each species, only a subset of all amino acids show this signal, but most amino acids show the signal in at least one species. By repeating the analysis on a reduced data set that excludes interdomain linkers, we show that our results are not caused by an association of rare codons with solvent-accessible linker regions. Finally, we find that our results depend weakly on expression level; the association between optimal codons and buried sites exists at all expression levels, but increases in strength as expression level increases.
Affiliation(s)
- Tong Zhou
- Center for Computational Biology and Bioinformatics, The University of Texas at Austin, TX, USA
|
42
|
Caldarini M, Vasile F, Provasi D, Longhi R, Tiana G, Broglia RA. Identification and characterization of folding inhibitors of hen egg lysozyme: an example of a new paradigm of drug design. Proteins 2009; 74:390-9. [PMID: 18623063 DOI: 10.1002/prot.22161] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/07/2022]
Abstract
Studies of protein folding indicate the presence of native contacts in the denatured state, giving rise to folding elements which contribute to the accomplishment of the native state. The possibility of finding molecules which can interact with specific folding elements of a target protein preventing it from reaching its native state, and hence from becoming biologically active, is particularly attractive. The notion that folding elements not only provide molecular recognition directing the folding process, but also have conserved sequence, implies that targeting such elements will make protein folding inhibitors less susceptible to mutations which, in many cases, abrogate drug effects. The folding-inhibition strategy can lead to a truly novel and rational approach to drug design, aside from providing new insight into folding. This is illustrated in the case of hen egg lysozyme.
Affiliation(s)
- M Caldarini
- Department of Physics, University of Milano and INFN, via Celoria 16, 20133 Milano, Italy
|
43
|
Pugalenthi G, Tang K, Suganthan PN, Chakrabarti S. Identification of structurally conserved residues of proteins in absence of structural homologs using neural network ensemble. Bioinformatics 2008; 25:204-10. [PMID: 19038986 PMCID: PMC2638999 DOI: 10.1093/bioinformatics/btn618] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/16/2022]
Abstract
Motivation: So far, various bioinformatics and machine learning techniques have been applied for identification of sequence and functionally conserved residues in proteins. Although a few computational methods are available for the prediction of structurally conserved residues from protein structure, almost all methods require homologous structural information and structure-based alignments, which still prove to be a bottleneck in protein structure comparison studies. In this work, we developed a neural network approach for identification of structurally important residues from a single protein structure without using homologous structural information and structural alignment. Results: A neural network ensemble (NNE) method that utilizes a negative correlation learning (NCL) approach was developed for identification of structurally conserved residues (SCRs) in proteins using features that represent amino acid conservation and composition, physico-chemical properties and structural properties. The NCL-NNE method was applied to 6042 SCRs that have been extracted from 496 protein domains. This method obtained high prediction sensitivity (92.8%) and quality (Matthews correlation coefficient is 0.852) in identification of SCRs. Further benchmarking using 60 protein domains containing 1657 SCRs that were not part of the training and testing datasets shows that the NCL-NNE can correctly predict SCRs with ~90% sensitivity. These results suggest the usefulness of NCL-NNE for facilitating the identification of SCRs utilizing information derived from a single protein structure. Therefore, this method could be extremely effective in large-scale benchmarking studies where reliable structural homologs and alignments are limited. Availability: The executable for the NCL-NNE algorithm is available at http://www3.ntu.edu.sg/home/EPNSugan/index_files/SCR.htm Contact: epnsugan@ntu.edu.sg; chakraba@ncbi.nlm.nih.gov. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ganesan Pugalenthi
- School of Electrical and Electronic Engineering, Nanyang Technological University, Singapore
|
44
|
Niv MY, Skrabanek L, Roberts RJ, Scheraga HA, Weinstein H. Identification of GATC- and CCGG-recognizing Type II REases and their putative specificity-determining positions using Scan2S--a novel motif scan algorithm with optional secondary structure constraints. Proteins 2008; 71:631-40. [PMID: 17972284 DOI: 10.1002/prot.21777] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/30/2023]
Abstract
Restriction endonucleases (REases) are DNA-cleaving enzymes that have become indispensable tools in molecular biology. Type II REases are highly divergent in sequence despite their common structural core, function and, in some cases, common specificities towards DNA sequences. This makes it difficult to identify and classify them functionally based on sequence, and has hampered the efforts of specificity-engineering. Here, we define novel REase sequence motifs, which extend beyond the PD-(D/E)XK hallmark, and incorporate secondary structure information. The automated search using these motifs is carried out with a newly developed fast regular expression matching algorithm that accommodates long patterns with optional secondary structure constraints. Using this new tool, named Scan2S, motifs derived from REases with specificity towards GATC- and CCGG-containing DNA sequences successfully identify REases of the same specificity. Notably, some of these sequences are not identified by standard sequence detection tools. The new motifs highlight potential specificity-determining positions that do not fully overlap for the GATC- and the CCGG-recognizing REases and are candidates for specificity re-engineering.
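The general idea of a regular-expression motif scan with optional secondary structure constraints can be sketched as follows. This is not the Scan2S algorithm or its motif syntax: the function name, the per-residue secondary structure string (one letter per residue, e.g. H/E/C), and the offset-based constraint format are assumptions made for illustration.

```python
import re


def scan_motif(sequence, sec_struct, pattern, ss_constraints=None):
    """Return (start, matched_text) for motif hits that satisfy the constraints.

    ss_constraints maps an offset within the match to a required
    secondary structure letter at that residue (hypothetical format).
    """
    if len(sequence) != len(sec_struct):
        raise ValueError("sequence and sec_struct must have the same length")
    constraints = ss_constraints or {}
    hits = []
    # Wrapping the pattern in a lookahead reports overlapping matches too.
    for match in re.finditer(f"(?=({pattern}))", sequence):
        start, text = match.start(), match.group(1)
        if all(
            offset < len(text) and sec_struct[start + offset] == state
            for offset, state in constraints.items()
        ):
            hits.append((start, text))
    return hits
```

For example, a pattern such as `PD.{2}[DE].K` is one fixed-spacing rendering of the PD-(D/E)XK hallmark, chosen here only to exercise the function; the motifs defined in the paper are longer and are not reproduced in this listing.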
Affiliation(s)
- Masha Y Niv
- Department of Physiology and Biophysics, Weill Medical College of Cornell University, 1300 York Ave., New York, New York 10021, USA.
|
45
|
Ortutay C, Vihinen M. Efficiency of the immunome protein interaction network increases during evolution. Immunome Res 2008; 4:4. [PMID: 18430195 PMCID: PMC2373292 DOI: 10.1186/1745-7580-4-4] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 01/23/2008] [Accepted: 04/22/2008] [Indexed: 12/01/2022] Open
Abstract
Background: Details of the mechanisms and selection pressures that shape the emergence and development of complex biological systems, such as the human immune system, are poorly understood. A recent definition of a reference set of proteins essential for the human immunome, combined with information about protein interaction networks for these proteins, facilitates evolutionary study of this biological machinery. Results: Here, we present a detailed study of the development of the immunome protein interaction network during eight evolutionary steps from Bilateria ancestors to human. New nodes show preferential attachment to high degree proteins. The efficiency of the immunome protein interaction network increases during the evolutionary steps, whereas the vulnerability of the network decreases. Conclusion: Our results shed light on selective forces acting on the emergence of biological networks. It is likely that the high efficiency and low vulnerability are intrinsic properties of many biological networks, which arise from the effects of evolutionary processes yet to be uncovered.
Affiliation(s)
- Csaba Ortutay
- Institute of Medical Technology, FI-33014 University of Tampere, Finland.
|
46
|
Siltberg-Liberles J, Martinez A. Searching distant homologs of the regulatory ACT domain in phenylalanine hydroxylase. Amino Acids 2008; 36:235-49. [PMID: 18368466 DOI: 10.1007/s00726-008-0057-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Received: 02/19/2008] [Accepted: 03/11/2008] [Indexed: 11/29/2022]
Abstract
High sequence divergence, evolutionary mobility, and superfold topology characterize the ACT domain. Frequently found in multidomain proteins, these domains induce allosteric effects by binding a regulatory ligand usually to an ACT domain dimer interface. In mammalian phenylalanine hydroxylase (PAH), no contacts are formed between ACT domains, and the domain promotes an allosteric effect despite the apparent lack of ligand binding. The increased functional scenario of this abundant domain encouraged us to search for distant homologs, aiming to enhance the understanding of the ACT domain in general and the ACT domain of PAH in particular. The PDB was searched using the FATCAT server with the ACT domain of PAH as a query. The hits that were confirmed by the SSAP algorithm were divided into known ACT domains (KADs) and potential ACT domains (PADs). The FATCAT/SSAP procedure recognized most of the established KADs, as well as 18 so far unrecognized non-redundant PADs with extremely low sequence identities and high divergence in functionality and oligomerization. However, analysis of the structural similarity provides remarkable clustering of the proteins according to similarities in ligand binding. Despite enormous sequence divergence and high functional variability, there is a common regulatory theme among these domains. The results reveal the close relationships of the ACT domain of PAH with amino acid binding and metallobinding ACT domains and with acylphosphatase.
|
47
|
Hermoso A, Espadaler J, Enrique Querol E, Aviles FX, Sternberg MJ, Oliva B, Fernandez-Fuentes N. Including Functional Annotations and Extending the Collection of Structural Classifications of Protein Loops (ArchDB). Bioinform Biol Insights 2008. [DOI: 10.1177/117793220700100004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/15/2022] Open
Abstract
Loops represent an important part of protein structures. The study of loops is critical for two main reasons: First, loops are often involved in protein function, stability and folding. Second, despite improvements in experimental and computational structure prediction methods, modeling the conformation of loops remains problematic. Here, we present a structural classification of loops, ArchDB, a mine of information with application in both mentioned fields: loop structure prediction and function prediction. ArchDB ( http://sbi.imim.es/archdb ) is a database of classified protein loop motifs. The current database provides four different classification sets tailored for different purposes. ArchDB-40, a loop classification derived from SCOP40, is well suited for modeling common loop motifs. Since features relevant to loop structure or function can be more easily determined on well-populated clusters, we have developed ArchDB-95, a loop classification derived from SCOP95. This new classification set shows a ~40% increase in the number of subclasses, and a large 7-fold increase in the number of putative structure/function-related subclasses. We also present ArchDB-EC, a classification of loop motifs from enzymes, and ArchDB-KI, a manually annotated classification of loop motifs from kinases. Information about ligand contacts and PDB sites has been included in all classification sets. Improvements in our classification scheme are described, as well as several new database features, such as the ability to query by conserved annotations, sequence similarity, or uploading 3D coordinates of a protein. The lengths of classified loops range between 0 and 36 residues. ArchDB offers an exhaustive sampling of loop structures. Functional information about loops and links with related biological databases are also provided. All this information and the possibility to browse/query the database through a web-server outline a useful tool with application in the comparative study of loops, the analysis of loops involved in protein function and to obtain templates for loop modeling.
Affiliation(s)
- Antoni Hermoso
- Laboratori de Bioinformàtica, Institut de Biomedicina I Biotecnologia, Universitat Autònoma de Barcelona, Bellaterra 08193, Catalonia. Spain
- Jordi Espadaler
- Laboratori de Bioinformàtica, Institut de Biomedicina I Biotecnologia, Universitat Autònoma de Barcelona, Bellaterra 08193, Catalonia. Spain
- Laboratori de Bioinformàtica Estructural (GRIB), Universitat Pompeu Fabra/IMIM, Parc de Recerca Biomèdica de Barcelona, Barcelona 08003, Catalonia, Spain
- E Enrique Querol
- Laboratori de Bioinformàtica, Institut de Biomedicina I Biotecnologia, Universitat Autònoma de Barcelona, Bellaterra 08193, Catalonia. Spain
- Francesc X. Aviles
- Laboratori de Bioinformàtica, Institut de Biomedicina I Biotecnologia, Universitat Autònoma de Barcelona, Bellaterra 08193, Catalonia. Spain
- Michael J.E. Sternberg
- Structural Bioinformatics Group, Department of Biological Sciences, Imperial College, London SW7 2AZ, U.K
- Baldomero Oliva
- Laboratori de Bioinformàtica Estructural (GRIB), Universitat Pompeu Fabra/IMIM, Parc de Recerca Biomèdica de Barcelona, Barcelona 08003, Catalonia, Spain
- Narcis Fernandez-Fuentes
- Leeds Institute of Molecular Medicine, Section of Experimental Therapeutics, St. James University Hospital, Leeds LS7 9TF. U.K
|
48
|
Babor M, Gerzon S, Raveh B, Sobolev V, Edelman M. Prediction of transition metal-binding sites from apo protein structures. Proteins 2008; 70:208-17. [PMID: 17657805 DOI: 10.1002/prot.21587] [Citation(s) in RCA: 88] [Impact Index Per Article: 5.5] [Indexed: 11/07/2022]
Abstract
Metal ions are crucial for protein function. They participate in enzyme catalysis, play regulatory roles, and help maintain protein structure. Current tools for predicting metal-protein interactions are based on proteins crystallized with their metal ions present (holo forms). However, a majority of resolved structures are free of metal ions (apo forms). Moreover, metal binding is a dynamic process, often involving conformational rearrangement of the binding pocket. Thus, effective predictions need to be based on the structure of the apo state. Here, we report an approach that identifies transition metal-binding sites in apo forms with a resulting selectivity >95%. Applying the approach to apo forms in the Protein Data Bank and structural genomics initiative identifies a large number of previously unknown, putative metal-binding sites, and their amino acid residues, in some cases providing a first clue to the function of the protein.
Affiliation(s)
- Mariana Babor
- Department of Plant Sciences, Weizmann Institute of Science, Rehovot, Israel
|
49
|
Kmiecik S, Kolinski A. Folding pathway of the b1 domain of protein G explored by multiscale modeling. Biophys J 2007; 94:726-36. [PMID: 17890394 PMCID: PMC2186257 DOI: 10.1529/biophysj.107.116095] [Citation(s) in RCA: 84] [Impact Index Per Article: 4.9] [Indexed: 11/18/2022] Open
Abstract
The understanding of the folding mechanisms of single-domain proteins is an essential step in the understanding of protein folding in general. Recently, we developed a mesoscopic CA-CB side-chain protein model, which was successfully applied in protein structure prediction, studies of protein thermodynamics, and modeling of protein complexes. In this research, this model is employed in a detailed characterization of the folding process of a simple globular protein, the B1 domain of IgG-binding protein G (GB1). There is a vast body of experimental facts and theoretical findings for this protein. Performing unbiased, ab initio simulations, we demonstrated that the GB1 folding proceeds via the formation of an extended folding nucleus, followed by slow structure fine-tuning. Remarkably, a subset of native interactions drives the folding from the very beginning. The emerging comprehensive picture of GB1 folding perfectly matches and extends the previous experimental and theoretical studies.
Affiliation(s)
- Andrzej Kolinski
- Address reprint requests to Andrzej Kolinski, Faculty of Chemistry, University of Warsaw, L. Pasteura 1, 02-093 Warsaw, Poland. Tel.: 48-022-8220211 ext. 320; Fax: 48-022 820221.
|
50
|
|