1
Cagiada M, Ovchinnikov S, Lindorff‐Larsen K. Predicting absolute protein folding stability using generative models. Protein Sci 2025; 34:e5233. [PMID: 39673466 PMCID: PMC11645669 DOI: 10.1002/pro.5233] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2024] [Revised: 10/30/2024] [Accepted: 11/11/2024] [Indexed: 12/16/2024]
Abstract
While there has been substantial progress in our ability to predict changes in protein stability due to amino acid substitutions, progress has been slower in methods to predict the absolute stability of a protein. Here, we show how a generative model for protein sequence can be leveraged to predict absolute protein stability. We benchmark our predictions across a broad set of proteins and find a mean error of 1.5 kcal/mol and a correlation coefficient of 0.7 for the absolute stability across a range of natural, small- to medium-sized proteins up to ca. 150 amino acid residues. We analyze current limitations and future directions including how such a model may be useful for predicting conformational free energies. Our approach is simple to use and freely available via an online implementation at https://github.com/KULL-Centre/_2024_cagiada_stability.
Affiliation(s)
- Matteo Cagiada
  - Linderstrøm‐Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Sergey Ovchinnikov
  - Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, USA
- Kresten Lindorff‐Larsen
  - Linderstrøm‐Lang Centre for Protein Science, Department of Biology, University of Copenhagen, Copenhagen, Denmark
2
Ogun OJ, Thaller G, Becker D. Molecular Structural Analysis of Porcine CMAH-Native Ligand Complex and High Throughput Virtual Screening to Identify Novel Inhibitors. Pathogens 2023; 12:pathogens12050684. [PMID: 37242354 DOI: 10.3390/pathogens12050684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Revised: 05/01/2023] [Accepted: 05/02/2023] [Indexed: 05/28/2023] Open
Abstract
Porcine meat is the most consumed red meat worldwide. Pigs are also vital tools in biological and medical research. However, xenoreactivity between porcine N-glycolylneuraminic acid (Neu5Gc) and human anti-Neu5Gc antibodies poses a significant challenge. On the one hand, dietary Neu5Gc intake has been connected to particular human disorders. On the other hand, some pathogens connected to pig diseases have a preference for Neu5Gc. The enzyme cytidine monophospho-N-acetylneuraminic acid hydroxylase (CMAH) catalyses the conversion of N-acetylneuraminic acid (Neu5Ac) to Neu5Gc. In this study, we predicted the tertiary structure of CMAH, performed molecular docking, and analysed the protein-native ligand complex. We performed a virtual screening of a drug library of 5M compounds and selected the two top inhibitors, with Vina scores of -9.9 kcal/mol for inhibitor 1 and -9.4 kcal/mol for inhibitor 2. We further analysed their pharmacokinetic and pharmacophoric properties. We conducted stability analyses of the complexes with molecular dynamics simulations of 200 ns and binding free energy calculations. The overall analyses revealed the inhibitors' stable binding, which was further validated by the MMGBSA studies. In conclusion, this result may pave the way for future studies to determine how to inhibit CMAH activities. Further in vitro studies can provide in-depth insight into these compounds' therapeutic potential.
Affiliation(s)
- Oluwamayowa Joshua Ogun
  - Institute of Animal Breeding and Husbandry, University of Kiel, Olshausenstraße 40, 24098 Kiel, Germany
- Georg Thaller
  - Institute of Animal Breeding and Husbandry, University of Kiel, Olshausenstraße 40, 24098 Kiel, Germany
- Doreen Becker
  - Institute of Genome Biology, Research Institute for Farm Animal Biology (FBN), Wilhelm-Stahl-Allee 2, 18196 Dummerstorf, Germany
3
McBride JM, Eckmann JP, Tlusty T. General Theory of Specific Binding: Insights from a Genetic-Mechano-Chemical Protein Model. Mol Biol Evol 2022; 39:msac217. [PMID: 36208205 PMCID: PMC9641994 DOI: 10.1093/molbev/msac217] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022] Open
Abstract
Proteins need to selectively interact with specific targets among a multitude of similar molecules in the cell. However, despite a firm physical understanding of binding interactions, we lack a general theory of how proteins evolve high specificity. Here, we present such a model that combines chemistry, mechanics, and genetics and explains how their interplay governs the evolution of specific protein-ligand interactions. The model shows that there are many routes to achieving molecular discrimination (by varying degrees of flexibility and shape/chemistry complementarity), but the key ingredient is precision. Harder discrimination tasks require more collective and precise coaction of structure, forces, and movements. Proteins can achieve this through correlated mutations extending far from a binding site, which fine-tune the localized interaction with the ligand. Thus, the solution of more complicated tasks is enabled by increasing the protein size, and proteins become more evolvable and robust when they are larger than the bare minimum required for discrimination. The model makes testable, specific predictions about the role of flexibility and shape mismatch in discrimination, and how evolution can independently tune affinity and specificity. Thus, the proposed theory of specific binding addresses the natural question of "why are proteins so big?". A possible answer is that molecular discrimination is often a hard task best performed by adding more layers to the protein.
Affiliation(s)
- John M McBride
  - Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea
- Jean-Pierre Eckmann
  - Département de Physique Théorique and Section de Mathématiques, University of Geneva, Geneva, Switzerland
- Tsvi Tlusty
  - Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, South Korea
  - Departments of Physics and Chemistry, Ulsan National Institute of Science and Technology, Ulsan 44919, South Korea
4
Harman JL, Reardon PN, Costello SM, Warren GD, Phillips SR, Connor PJ, Marqusee S, Harms MJ. Evolution avoids a pathological stabilizing interaction in the immune protein S100A9. Proc Natl Acad Sci U S A 2022; 119:e2208029119. [PMID: 36194634 PMCID: PMC9565474 DOI: 10.1073/pnas.2208029119] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/09/2022] [Accepted: 09/07/2022] [Indexed: 01/03/2023] Open
Abstract
Stability constrains evolution. While much is known about constraints on destabilizing mutations, less is known about the constraints on stabilizing mutations. We recently identified a mutation in the innate immune protein S100A9 that provides insight into such constraints. When introduced into human S100A9, M63F simultaneously increases the stability of the protein and disrupts its natural ability to activate Toll-like receptor 4. Using chemical denaturation, we found that M63F stabilizes a calcium-bound conformation of hS100A9. We then used NMR to solve the structure of the mutant protein, revealing that the mutation distorts the hydrophobic binding surface of hS100A9, explaining its deleterious effect on function. Hydrogen-deuterium exchange (HDX) experiments revealed stabilization of the region around M63F in the structure, notably Phe37. In the structure of the M63F mutant, the Phe37 and Phe63 sidechains are in contact, plausibly forming an edge-face π-stack. Mutating Phe37 to Leu abolished the stabilizing effect of M63F as probed by both chemical denaturation and HDX. It also restored the biological activity of S100A9 disrupted by M63F. These findings reveal that Phe63 creates a molecular staple with Phe37 that stabilizes a nonfunctional conformation of the protein, thus disrupting function. Using a bioinformatic analysis, we found that S100A9 proteins from different organisms rarely have Phe at both positions 37 and 63, suggesting that avoiding a pathological stabilizing interaction indeed constrains S100A9 evolution. This work highlights an important evolutionary constraint on stabilizing mutations, namely, that they must avoid inappropriately stabilizing nonfunctional protein conformations.
Affiliation(s)
- Joseph L. Harman
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
  - Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
- Patrick N. Reardon
  - College of Science, NMR Facility, Oregon State University, Corvallis, OR 97331
- Shawn M. Costello
  - Biophysics Graduate Program, University of California, Berkeley, Berkeley, CA 94720
- Gus D. Warren
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
  - Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
- Sophia R. Phillips
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
  - Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
- Patrick J. Connor
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
  - Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
- Susan Marqusee
  - Department of Molecular and Cell Biology, University of California, Berkeley, Berkeley, CA 94720
  - Department of Chemistry, University of California, Berkeley, Berkeley, CA 94720
  - California Institute for Quantitative Biosciences, University of California, Berkeley, Berkeley, CA 94720
- Michael J. Harms
  - Department of Chemistry and Biochemistry, University of Oregon, Eugene, OR 97403
  - Institute of Molecular Biology, University of Oregon, Eugene, OR 97403
5
Interactions of the Receptor Binding Domain of SARS-CoV-2 Variants with hACE2: Insights from Molecular Docking Analysis and Molecular Dynamic Simulation. BIOLOGY 2021; 10:biology10090880. [PMID: 34571756 PMCID: PMC8470537 DOI: 10.3390/biology10090880] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 07/29/2021] [Revised: 08/28/2021] [Accepted: 09/02/2021] [Indexed: 12/23/2022]
Abstract
Since the beginning of the coronavirus disease 2019 (COVID-19) pandemic in late 2019, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been evolving through the acquisition of genomic mutations, leading to the emergence of multiple variants of concern (VOCs) and variants of interest (VOIs). Currently, four VOCs (Alpha, Beta, Delta, and Gamma) and seven VOIs (Epsilon, Zeta, Eta, Theta, Iota, Kappa, and Lambda) of SARS-CoV-2 have been identified in worldwide circulation. Here, we investigated the interactions of the receptor-binding domain (RBD) of five SARS-CoV-2 variants with the human angiotensin-converting enzyme 2 (hACE2) receptor in host cells, to determine the extent of molecular divergence and the impact of mutation, using protein-protein docking and dynamics simulation approaches. Along with the wild-type (WT) SARS-CoV-2, this study included the Brazilian (BR/lineage P.1/Gamma), Indian (IN/lineage B.1.617/Delta), South African (SA/lineage B.1.351/Beta), United Kingdom (UK/lineage B.1.1.7/Alpha), and United States (US/lineage B.1.429/Epsilon) variants. The protein-protein docking and dynamics simulation studies revealed that the point mutations in these variants considerably affected the structural behavior of the spike (S) protein compared to the WT, which also affected the binding of RBD with hACE2 at the respective sites. Additional experimental studies are required to determine whether these effects have an influence on drug-S protein binding and its potential therapeutic effect.
6
Ban X, Xie X, Li C, Gu Z, Hong Y, Cheng L, Kaustubh B, Li Z. The desirable salt bridges in amylases: Distribution, configuration and location. Food Chem 2021; 354:129475. [PMID: 33744660 DOI: 10.1016/j.foodchem.2021.129475] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 10/12/2020] [Revised: 01/30/2021] [Accepted: 02/22/2021] [Indexed: 12/07/2022]
Abstract
The α-amylases are the most widely used industrial enzymes, and are particularly useful as liquefying enzymes in industrial processes based upon starch. Since starch liquefaction is carried out at elevated temperatures, typically above 60 °C, there is substantial demand for thermostable α-amylases. Most naturally occurring α-amylases exhibit moderate thermostability, so substantial effort has been invested in attempts to increase their thermostability. One structural feature that has the potential to increase protein thermostability is the introduction of salt bridges. However, not every salt bridge contributes to protein thermostability. The salt bridges in amylases have their own characteristics in terms of distribution, configuration and location. Summarizing these features helps to guide the introduction of new salt bridges. This review focuses on salt bridges of α-amylases, both naturally present and introduced using mutagenesis. Its aim is to provide a bird's eye view of the distribution, configuration and location of desirable salt bridges.
Affiliation(s)
- Xiaofeng Ban
  - School of Food Science and Technology, Jiangnan University, Wuxi 214122, People's Republic of China
- Xiaofang Xie
  - School of Food Science and Technology, Jiangnan University, Wuxi 214122, People's Republic of China
- Caiming Li
  - School of Food Science and Technology, Jiangnan University, Wuxi 214122, People's Republic of China
- Zhengbiao Gu
  - State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi 214122, People's Republic of China
  - School of Food Science and Technology, Jiangnan University, Wuxi 214122, People's Republic of China
  - Synergetic Innovation Center of Food Safety and Nutrition, Jiangnan University, Wuxi, Jiangsu 214122, People's Republic of China
- Yan Hong
  - School of Food Science and Technology, Jiangnan University, Wuxi 214122, People's Republic of China
- Li Cheng
  - School of Food Science and Technology, Jiangnan University, Wuxi 214122, People's Republic of China
- Bhalerao Kaustubh
  - Department of Agricultural and Biological Engineering, University of Illinois at Urbana-Champaign, USA
- Zhaofeng Li
  - State Key Laboratory of Food Science and Technology, Jiangnan University, Wuxi 214122, People's Republic of China
  - National Engineering Laboratory for Cereal Fermentation Technology, Wuxi 214122, People's Republic of China
  - School of Food Science and Technology, Jiangnan University, Wuxi 214122, People's Republic of China
7
Guckeisen T, Hosseinpour S, Peukert W. Effect of pH and urea on the proteins secondary structure at the water/air interface and in solution. J Colloid Interface Sci 2021; 590:38-49. [PMID: 33524719 DOI: 10.1016/j.jcis.2021.01.015] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 12/22/2020] [Revised: 01/05/2021] [Accepted: 01/06/2021] [Indexed: 01/09/2023]
Abstract
HYPOTHESIS: The secondary structure of proteins affects their functionality and performance in physiological environments or industrial applications. Change of the solution pH or the presence of protein denaturants are the main chemical means that can alter the secondary structure of proteins or lead to protein denaturation. Since proteins in the bulk solution and those residing at the solution/air interface experience different local environments, their response to chemical denaturation can be different. EXPERIMENTS: We utilize circular dichroism and chiral/achiral sum frequency generation spectroscopy to study the secondary structure of selected proteins as a function of the solution pH or in the presence of 8 M urea in the bulk solution and at the solution/air interface, respectively. FINDINGS: The liquid/air interface can enhance or decrease protein conformation stability. The change in the secondary structure of the surface adsorbed proteins in alkaline solutions occurs at pH values lower than those denaturing the studied proteins in the bulk solution. In contrast, while 8 M urea completely denatures the studied proteins in the bulk solution, the liquid/air interface prevents the urea-induced denaturation of the surface adsorbed proteins by limiting the access of urea to the hydrophobic side chains of proteins protruding to air.
Affiliation(s)
- Tobias Guckeisen
  - Institute of Particle Technology (LFG), Friedrich-Alexander-Universität-Erlangen-Nürnberg (FAU), Cauerstraße 4, 91058 Erlangen, Germany
- Saman Hosseinpour
  - Institute of Particle Technology (LFG), Friedrich-Alexander-Universität-Erlangen-Nürnberg (FAU), Cauerstraße 4, 91058 Erlangen, Germany
- Wolfgang Peukert
  - Institute of Particle Technology (LFG), Friedrich-Alexander-Universität-Erlangen-Nürnberg (FAU), Cauerstraße 4, 91058 Erlangen, Germany
8
Schwersensky M, Rooman M, Pucci F. Large-scale in silico mutagenesis experiments reveal optimization of genetic code and codon usage for protein mutational robustness. BMC Biol 2020; 18:146. [PMID: 33081759 PMCID: PMC7576759 DOI: 10.1186/s12915-020-00870-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/14/2020] [Accepted: 09/16/2020] [Indexed: 12/31/2022] Open
Abstract
BACKGROUND: How, and the extent to which, evolution acts on DNA and protein sequences to ensure mutational robustness and evolvability is a long-standing open question in the field of molecular evolution. We addressed this issue through the first structurome-scale computational investigation, in which we estimated the change in folding free energy upon all possible single-site mutations introduced in more than 20,000 protein structures, as well as through available experimental stability and fitness data. RESULTS: At the amino acid level, we found the protein surface to be more robust against random mutations than the core, this difference being stronger for small proteins. The destabilizing and neutral mutations are more numerous in the core and on the surface, respectively, whereas the stabilizing mutations are about 4% in both regions. At the genetic code level, we observed the smallest destabilization for mutations that are due to substitutions of base III in the codon, followed by base I, bases I+III, base II, and other multiple base substitutions. This ranking highly anticorrelates with the codon-anticodon mispairing frequency in the translation process. This suggests that the standard genetic code is optimized to limit the impact of random mutations, but even more so to limit translation errors. At the codon level, both the codon usage and the usage bias appear to optimize mutational robustness and translation accuracy, especially for surface residues. CONCLUSION: Our results highlight the non-universality of mutational robustness and its multiscale dependence on protein features, the structure of the genetic code, and the codon usage. Our analyses and approach are strongly supported by available experimental mutagenesis data.
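As a rough illustration of why substitutions at base III tend to be the least disruptive, the sketch below counts, for each codon position, how many single-base substitutions of sense codons in the standard genetic code leave the encoded amino acid unchanged. This is only a synonymous-substitution count, not the folding free energy analysis described in the abstract; the function name `synonymous_counts` is an illustrative assumption.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering
# (first base varies slowest, third base fastest); '*' marks stop codons.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_counts():
    """Count synonymous and total single-base substitutions per codon position.

    Only sense codons are mutated; a substitution that creates a stop codon
    is counted as non-synonymous.
    """
    syn = [0, 0, 0]
    total = [0, 0, 0]
    for codon, aa in CODE.items():
        if aa == "*":
            continue  # skip stop codons as starting points
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                total[pos] += 1
                if CODE[mutant] == aa:
                    syn[pos] += 1
    return syn, total


if __name__ == "__main__":
    syn, total = synonymous_counts()
    for pos, label in enumerate(("I", "II", "III")):
        frac = syn[pos] / total[pos]
        print(f"base {label}: {syn[pos]}/{total[pos]} synonymous ({frac:.1%})")
```

In this simple count, synonymous changes are most frequent at base III, rare at base I and absent at base II, which agrees with the position ranking reported above (base III, then base I, then base II) for single-base substitutions.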
Affiliation(s)
- Martin Schwersensky
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, CP 165/61, Roosevelt Ave. 50, Brussels, 1050, Belgium
- Marianne Rooman
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, CP 165/61, Roosevelt Ave. 50, Brussels, 1050, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, Boulevard du Triomphe, Brussels, 1050, Belgium
- Fabrizio Pucci
  - Computational Biology and Bioinformatics, Université Libre de Bruxelles, CP 165/61, Roosevelt Ave. 50, Brussels, 1050, Belgium
  - Interuniversity Institute of Bioinformatics in Brussels, Boulevard du Triomphe, Brussels, 1050, Belgium
9
Aydınkal RM, Serçinoğlu O, Ozbek P. ProSNEx: a web-based application for exploration and analysis of protein structures using network formalism. Nucleic Acids Res 2019; 47:W471-W476. [PMID: 31114881 PMCID: PMC6602423 DOI: 10.1093/nar/gkz390] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 02/25/2019] [Revised: 04/17/2019] [Accepted: 05/09/2019] [Indexed: 01/14/2023] Open
Abstract
ProSNEx (Protein Structure Network Explorer) is a web service for construction and analysis of Protein Structure Networks (PSNs) alongside amino acid flexibility, sequence conservation and annotation features. ProSNEx constructs a PSN by adding nodes to represent residues and edges between these nodes using user-specified interaction distance cutoffs for either carbon-alpha, carbon-beta or atom-pair contact networks. Different types of weighted networks can also be constructed by using either (i) the residue-residue interaction energies in the format returned by gRINN, resulting in a Protein Energy Network (PEN); (ii) the dynamical cross correlations from a coarse-grained Normal Mode Analysis (NMA) of the protein structure; or (iii) interaction strength. Upon construction of the network, common network metrics (such as node centralities) as well as shortest paths between nodes and k-cliques are calculated. Moreover, additional features of each residue in the form of conservation scores and mutation/natural variant information are included in the analysis. In this way, the tool offers an enhanced and direct comparison of network-based residue metrics with other types of biological information. ProSNEx is free and open to all users without login requirement at http://prosnex-tool.com.
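The network construction described above can be sketched in a few lines: residues become nodes, and an edge joins two residues whose carbon-alpha atoms lie within a distance cutoff. The following is a minimal standard-library sketch under stated assumptions, not the ProSNEx implementation; the coordinates, the 7.0 Å cutoff and the function names (`build_psn`, `degree_centrality`, `shortest_path`) are all hypothetical and chosen for illustration.

```python
from collections import deque
from math import dist


def build_psn(ca_coords, cutoff=7.0):
    """Build an unweighted contact network from carbon-alpha coordinates.

    Residue i and residue j are connected when their C-alpha atoms are at
    most `cutoff` angstroms apart (the cutoff value is an assumption).
    """
    n = len(ca_coords)
    adjacency = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if dist(ca_coords[i], ca_coords[j]) <= cutoff:
                adjacency[i].add(j)
                adjacency[j].add(i)
    return adjacency


def degree_centrality(adjacency):
    """Fraction of the other residues that each residue is in contact with."""
    n = len(adjacency)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adjacency.items()}


def shortest_path(adjacency, source, target):
    """Breadth-first search; returns one shortest path as a list, or None."""
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for nbr in sorted(adjacency[node]):
            if nbr not in parents:
                parents[nbr] = node
                queue.append(nbr)
    return None


# Hypothetical toy coordinates (angstroms): five residues along a line,
# 3.8 apart, plus a sixth residue packed against the start of the chain.
coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (11.4, 0.0, 0.0),
    (15.2, 0.0, 0.0),
    (3.8, 5.0, 0.0),
]

if __name__ == "__main__":
    psn = build_psn(coords)
    print(degree_centrality(psn))
    print(shortest_path(psn, 0, 4))
```

A real analysis would read coordinates from a structure file and would typically use a graph library for centralities and k-cliques; the sketch only shows how a distance cutoff turns a structure into a network.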
Affiliation(s)
- Rasim Murat Aydınkal
  - Department of Bioengineering, Faculty of Engineering, Marmara University, Kadikoy, Istanbul 34722, Turkey
  - Ali Nihat Gokyigit Foundation, Etiler, Istanbul 34340, Turkey
- Onur Serçinoğlu
  - Department of Bioengineering, Faculty of Engineering, Marmara University, Kadikoy, Istanbul 34722, Turkey
  - Department of Bioengineering, Faculty of Engineering, Recep Tayyip Erdoğan University, Rize 53100, Turkey
- Pemra Ozbek
  - Department of Bioengineering, Faculty of Engineering, Marmara University, Kadikoy, Istanbul 34722, Turkey
10
Feyertag F, Alvarez-Ponce D. Disulfide Bonds Enable Accelerated Protein Evolution. Mol Biol Evol 2018; 34:1833-1837. [PMID: 28431018 DOI: 10.1093/molbev/msx135] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Indexed: 02/07/2023] Open
Abstract
The different proteins of any proteome evolve at enormously different rates. What factors contribute to this variability, and to what extent, is still a largely open question. We hypothesized that disulfide bonds, by increasing protein stability, should make proteins' structures relatively independent of their amino acid sequences, thus acting as buffers of deleterious mutations and enabling accelerated sequence evolution. In agreement with this hypothesis, we observed that membrane proteins with disulfide bonds evolved 88% faster than those without disulfide bonds, and that extracellular proteins with disulfide bonds evolved 49% faster than those without disulfide bonds. In addition, genes encoding proteins with disulfide bonds exhibit an increased likelihood of showing signatures of positive selection. Multivariate analyses indicate that the trend is independent of a number of potentially confounding factors. The effect, however, is not observed among the longest proteins, which can become stabilized by mechanisms other than disulfide bonds.
Affiliation(s)
- Felix Feyertag
  - Department of Biology, University of Nevada-Reno, Reno, NV
11
Rivera-de-Torre E, Palacios-Ortega J, García-Linares S, Gavilanes JG, Martínez-Del-Pozo Á. One single salt bridge explains the different cytolytic activities shown by actinoporins sticholysin I and II from the venom of Stichodactyla helianthus. Arch Biochem Biophys 2017; 636:79-89. [PMID: 29138096 DOI: 10.1016/j.abb.2017.11.005] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 09/11/2017] [Revised: 11/06/2017] [Accepted: 11/10/2017] [Indexed: 10/18/2022]
Abstract
Sticholysins I and II (StnI and StnII), α-pore forming toxins from the sea anemone Stichodactyla helianthus, are water-soluble toxic proteins which, upon interaction with lipid membranes of specific composition, bind to the bilayer, extend and insert their N-terminal α-helix, and become oligomeric integral membrane structures. The result is a pore that leads to cell death by osmotic shock. StnI and StnII show 93% sequence identity, but also different membrane pore-forming activities. The hydrophobicity profile along the first 18 residues revealed differences which were canceled by substituting StnI amino acids 2 and 9. Accordingly, the StnID9A mutant, and the corresponding StnIE2AD9A variant, showed enhanced hemolytic activity. They also revealed a key role for an exposed salt bridge between Asp9 and Lys68. This interaction is not possible in StnII but appears conserved in the other two well-characterized actinoporins, equinatoxin II and fragaceatoxin C. The StnII mutant A8D showed that this single replacement was enough to transform StnII into a version with impaired pore-forming activity. Overall, the results show the key importance of this salt bridge linking the N-terminal stretch to the β-sandwich core. This conclusion is of general application for understanding the role of salt bridges in protein design, folding and stability.
Affiliation(s)
- Esperanza Rivera-de-Torre
  - Departamento de Bioquímica y Biología Molecular I, Facultades de Química y Biología, Universidad Complutense, 28040 Madrid, Spain
- Juan Palacios-Ortega
  - Departamento de Bioquímica y Biología Molecular I, Facultades de Química y Biología, Universidad Complutense, 28040 Madrid, Spain
- Sara García-Linares
  - Departamento de Bioquímica y Biología Molecular I, Facultades de Química y Biología, Universidad Complutense, 28040 Madrid, Spain
- José G Gavilanes
  - Departamento de Bioquímica y Biología Molecular I, Facultades de Química y Biología, Universidad Complutense, 28040 Madrid, Spain
- Álvaro Martínez-Del-Pozo
  - Departamento de Bioquímica y Biología Molecular I, Facultades de Química y Biología, Universidad Complutense, 28040 Madrid, Spain
12
Brown KL, Cummings CF, Vanacore RM, Hudson BG. Building collagen IV smart scaffolds on the outside of cells. Protein Sci 2017; 26:2151-2161. [PMID: 28845540 DOI: 10.1002/pro.3283] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Received: 08/14/2017] [Revised: 08/22/2017] [Accepted: 08/23/2017] [Indexed: 12/22/2022]
Abstract
Collagen IV scaffolds assemble through an intricate pathway that begins intracellularly and is completed extracellularly. Multiple intracellular enzymes act in concert to assemble collagen IV protomers, the building blocks of collagen IV scaffolds. After being secreted from cells, protomers are activated to initiate oligomerization, forming insoluble networks that are structurally reinforced with covalent crosslinks. Within these networks, embedded binding sites along the length of the protomer lead to the "decoration" of collagen IV triple helix with numerous functional molecules. We refer to these networks as "smart" scaffolds, which as a component of the basement membrane enable the development and function of multicellular tissues in all animal phyla. In this review, we present key molecular mechanisms that drive the assembly of collagen IV smart scaffolds.
Affiliation(s)
- Kyle L Brown
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, 37232
  - Center for Structural Biology, Vanderbilt University Medical Center, Nashville, Tennessee, 37232
  - Center for Matrix Biology, Vanderbilt University Medical Center, Nashville, Tennessee, 37232
- Roberto M Vanacore
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, 37232
  - Center for Matrix Biology, Vanderbilt University Medical Center, Nashville, Tennessee, 37232
- Billy G Hudson
  - Department of Medicine, Vanderbilt University Medical Center, Nashville, Tennessee, 37232
  - Center for Matrix Biology, Vanderbilt University Medical Center, Nashville, Tennessee, 37232
  - Department of Biochemistry, Vanderbilt University Medical Center, Nashville, Tennessee, 37232
  - Cell and Developmental Biology, Vanderbilt University Medical Center, Nashville, Tennessee, 37232
  - Department of Pathology, Microbiology and Immunology, Vanderbilt University Medical Center, Nashville, Tennessee, 37232
13
Cummings CF, Pedchenko V, Brown KL, Colon S, Rafi M, Jones-Paris C, Pokydeshava E, Liu M, Pastor-Pareja JC, Stothers C, Ero-Tolliver IA, McCall AS, Vanacore R, Bhave G, Santoro S, Blackwell TS, Zent R, Pozzi A, Hudson BG. Extracellular chloride signals collagen IV network assembly during basement membrane formation. J Cell Biol 2017; 213:479-94. [PMID: 27216258 PMCID: PMC4878091 DOI: 10.1083/jcb.201510065] [Citation(s) in RCA: 46] [Impact Index Per Article: 6.6] [Received: 10/15/2015] [Accepted: 04/29/2016] [Indexed: 01/07/2023] Open
Abstract
Basement membranes are defining features of the cellular microenvironment; however, little is known regarding their assembly outside cells. We report that extracellular Cl⁻ ions signal the assembly of collagen IV networks outside cells by triggering a conformational switch within collagen IV noncollagenous 1 (NC1) domains. Depletion of Cl⁻ in cell culture perturbed collagen IV networks, disrupted matrix architecture, and repositioned basement membrane proteins. Phylogenetic evidence indicates this conformational switch is a fundamental mechanism of collagen IV network assembly throughout Metazoa. Using recombinant triple helical protomers, we prove that NC1 domains direct both protomer and network assembly and show in Drosophila that NC1 architecture is critical for incorporation into basement membranes. These discoveries provide an atomic-level understanding of the dynamic interactions between extracellular Cl⁻ and collagen IV assembly outside cells, a critical step in the assembly and organization of basement membranes that enable tissue architecture and function. Moreover, this provides a mechanistic framework for understanding the molecular pathobiology of NC1 domains.
Affiliation(s)
- Christopher F Cummings
- Department of Biochemistry, Vanderbilt University Medical Center, Nashville, TN 37232; Department of Medicine, Division of Nephrology and Hypertension, Vanderbilt University Medical Center, Nashville, TN 37232; Center for Matrix Biology, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Vadim Pedchenko
- Department of Medicine, Division of Nephrology and Hypertension, Vanderbilt University Medical Center, Nashville, TN 37232; Center for Matrix Biology, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Kyle L Brown
- Department of Medicine, Division of Nephrology and Hypertension, Vanderbilt University Medical Center, Nashville, TN 37232; Center for Matrix Biology, Vanderbilt University Medical Center, Nashville, TN 37232; Center for Structural Biology, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Selene Colon
- Department of Medicine, Division of Nephrology and Hypertension, Vanderbilt University Medical Center, Nashville, TN 37232; Center for Matrix Biology, Vanderbilt University Medical Center, Nashville, TN 37232; Aspirnaut Program, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Mohamed Rafi
- Department of Medicine, Division of Nephrology and Hypertension, Vanderbilt University Medical Center, Nashville, TN 37232; Center for Matrix Biology, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Celestial Jones-Paris
- Aspirnaut Program, Vanderbilt University Medical Center, Nashville, TN 37232; Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Elena Pokydeshava
- Department of Medicine, Division of Nephrology and Hypertension, Vanderbilt University Medical Center, Nashville, TN 37232; Center for Matrix Biology, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Min Liu
- School of Life Sciences, Tsinghua University, Beijing 100084, China
| | | | - Cody Stothers
- Department of Biology, Vanderbilt University Medical Center, Nashville, TN 37232; Aspirnaut Program, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Isi A Ero-Tolliver
- Department of Medicine, Division of Nephrology and Hypertension, Vanderbilt University Medical Center, Nashville, TN 37232; Center for Matrix Biology, Vanderbilt University Medical Center, Nashville, TN 37232; Aspirnaut Program, Vanderbilt University Medical Center, Nashville, TN 37232
| | - A Scott McCall
- Department of Pharmacology, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Roberto Vanacore
- Department of Medicine, Division of Nephrology and Hypertension, Vanderbilt University Medical Center, Nashville, TN 37232; Center for Matrix Biology, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Gautam Bhave
- Department of Medicine, Division of Nephrology and Hypertension, Vanderbilt University Medical Center, Nashville, TN 37232; Center for Matrix Biology, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Samuel Santoro
- Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Timothy S Blackwell
- Department of Medicine, Division of Allergy, Pulmonary, and Critical Care Medicine, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Roy Zent
- Department of Medicine, Division of Nephrology and Hypertension, Vanderbilt University Medical Center, Nashville, TN 37232; Center for Matrix Biology, Vanderbilt University Medical Center, Nashville, TN 37232; Department of Cell and Developmental Biology, Vanderbilt University Medical Center, Nashville, TN 37232; Department of Cancer Biology, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Ambra Pozzi
- Department of Medicine, Division of Nephrology and Hypertension, Vanderbilt University Medical Center, Nashville, TN 37232; Center for Matrix Biology, Vanderbilt University Medical Center, Nashville, TN 37232; Department of Cancer Biology, Vanderbilt University Medical Center, Nashville, TN 37232; Department of Molecular Physiology and Biophysics, Vanderbilt University Medical Center, Nashville, TN 37232
| | - Billy G Hudson
- Department of Biochemistry, Vanderbilt University Medical Center, Nashville, TN 37232; Department of Medicine, Division of Nephrology and Hypertension, Vanderbilt University Medical Center, Nashville, TN 37232; Center for Matrix Biology, Vanderbilt University Medical Center, Nashville, TN 37232; Aspirnaut Program, Vanderbilt University Medical Center, Nashville, TN 37232; Department of Pathology, Microbiology, and Immunology, Vanderbilt University Medical Center, Nashville, TN 37232; Department of Cell and Developmental Biology, Vanderbilt University Medical Center, Nashville, TN 37232; Vanderbilt Ingram Cancer Center, Vanderbilt University Medical Center, Nashville, TN 37232; Vanderbilt Institute of Chemical Biology, Vanderbilt University Medical Center, Nashville, TN 37232
|
14
|
Bastolla U, Dehouck Y, Echave J. What evolution tells us about protein physics, and protein physics tells us about evolution. Curr Opin Struct Biol 2017; 42:59-66. [DOI: 10.1016/j.sbi.2016.10.020] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 08/23/2016] [Revised: 10/19/2016] [Accepted: 10/24/2016] [Indexed: 12/21/2022]
|
15
|
Yu S, Du T, Liu Z, Wu Q, Feng G, Dong M, Zhou X, Jiang L, Dai Q. Im10A, a short conopeptide isolated from Conus imperialis and possesses two highly concentrated disulfide bridges and analgesic activity. Peptides 2016; 81:15-20. [PMID: 27131596 DOI: 10.1016/j.peptides.2016.04.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 02/01/2016] [Revised: 04/02/2016] [Accepted: 04/26/2016] [Indexed: 12/27/2022]
Abstract
In the present study, we isolated, synthesized and NMR structurally characterized a novel conopeptide Im10A consisting of 11 amino acids (NTICCEGCMCY-NH2) from Conus imperialis. Unlike other conopeptides with four cysteine residues, Im10A had only two residues in loop 1 and one residue in loop 2 (CC-loop1-C-loop2-C), which formed a stable disulfide connectivity "I-IV, II-III" (framework X) with a type I β-turn. Interestingly, Im10A exhibited 50.7% analgesic activity on rat partial sciatic nerve ligation (PNL) at 2 h after Im10A administration. However, 10 μM Im10A exhibited no apparent effect on neuronal nicotinic acetylcholine receptor, and it did not target DRG voltage-dependent sodium, potassium and calcium ion channels and opioid receptor. To our knowledge, Im10A had the most concentrated disulfide bridges among conopeptides with four cysteine residues. This finding provided a new motif for the future development of biomimetic compounds.
Affiliation(s)
- Shuo Yu
- Beijing Institute of Biotechnology, Beijing 10071, PR China
| | - Tianpeng Du
- Key Laboratory of Magnetic Resonance in Biological Systems, National Center for Magnetic Resonance in Wuhan, State Key laboratory of Magnetic Resonance and Atomic and Molecular Physics, Wuhan Institute of Physics and Mathematics, Chinese Academy of Science, Wuhan 430071, PR China
| | - Zhuguo Liu
- Beijing Institute of Biotechnology, Beijing 10071, PR China
| | - Qiaoling Wu
- Beijing Institute of Biotechnology, Beijing 10071, PR China
| | - Guixue Feng
- Beijing Institute of Biotechnology, Beijing 10071, PR China
| | - Mingxin Dong
- Beijing Institute of Biotechnology, Beijing 10071, PR China
| | - Xiaowei Zhou
- Beijing Institute of Biotechnology, Beijing 10071, PR China
| | - Ling Jiang
- Key Laboratory of Magnetic Resonance in Biological Systems, National Center for Magnetic Resonance in Wuhan, State Key laboratory of Magnetic Resonance and Atomic and Molecular Physics, Wuhan Institute of Physics and Mathematics, Chinese Academy of Science, Wuhan 430071, PR China.
| | - Qiuyun Dai
- Beijing Institute of Biotechnology, Beijing 10071, PR China.
|
16
|
Folding and assembly of the large molecular machine Hsp90 studied in single-molecule experiments. Proc Natl Acad Sci U S A 2016; 113:1232-7. [PMID: 26787848 DOI: 10.1073/pnas.1518827113] [Citation(s) in RCA: 48] [Impact Index Per Article: 6.0] [Indexed: 12/19/2022]
Abstract
Folding of small proteins often occurs in a two-state manner and is well understood both experimentally and theoretically. However, many proteins are much larger and often populate misfolded states, complicating their folding process significantly. Here we study the complete folding and assembly process of the 1,418 amino acid, dimeric chaperone Hsp90 using single-molecule optical tweezers. Although the isolated C-terminal domain shows two-state folding, we find that the isolated N-terminal as well as the middle domain populate ensembles of fast-forming, misfolded states. These intradomain misfolds slow down folding by an order of magnitude. Modeling folding as a competition between productive and misfolding pathways allows us to fully describe the folding kinetics. Beyond intradomain misfolding, folding of the full-length protein is further slowed by the formation of interdomain misfolds, suggesting that with growing chain lengths, such misfolds will dominate folding kinetics. Interestingly, we find that small stretching forces applied to the chain can accelerate folding by preventing the formation of cross-domain misfolding intermediates by leading the protein along productive pathways to the native state. The same effect is achieved by cotranslational folding at the ribosome in vivo.
|
17
|
Detecting selection on protein stability through statistical mechanical models of folding and evolution. Biomolecules 2014; 4:291-314. [PMID: 24970217 PMCID: PMC4030984 DOI: 10.3390/biom4010291] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 12/25/2013] [Revised: 02/13/2014] [Accepted: 02/14/2014] [Indexed: 12/31/2022]
Abstract
The properties of biomolecules depend both on physics and on the evolutionary process that formed them. These two points of view produce a powerful synergism. Physics sets the stage and the constraints that molecular evolution has to obey, and evolutionary theory helps in rationalizing the physical properties of biomolecules, including protein folding thermodynamics. To complete the parallelism, protein thermodynamics is founded on the statistical mechanics in the space of protein structures, and molecular evolution can be viewed as statistical mechanics in the space of protein sequences. In this review, we will integrate both points of view, applying them to detecting selection on the stability of the folded state of proteins. We will start discussing positive design, which strengthens the stability of the folded against the unfolded state of proteins. Positive design justifies why statistical potentials for protein folding can be obtained from the frequencies of structural motifs. Stability against unfolding is easier to achieve for longer proteins. On the contrary, negative design, which consists in destabilizing frequently formed misfolded conformations, is more difficult to achieve for longer proteins. The folding rate can be enhanced by strengthening short-range native interactions, but this requirement contrasts with negative design, and evolution has to trade-off between them. Finally, selection can accelerate functional movements by favoring low frequency normal modes of the dynamics of the native state that strongly correlate with the functional conformation change.
|
18
|
Bošnjak I, Bojović V, Šegvić-Bubić T, Bielen A. Occurrence of protein disulfide bonds in different domains of life: a comparison of proteins from the Protein Data Bank. Protein Eng Des Sel 2014; 27:65-72. [PMID: 24407015 DOI: 10.1093/protein/gzt063] [Citation(s) in RCA: 55] [Impact Index Per Article: 5.5] [Indexed: 11/14/2022]
Abstract
Disulfide bonds (SS bonds) are important post-translational modifications of proteins. They stabilize a three-dimensional (3D) structure (structural SS bonds) and also have the catalytic or regulatory functions (redox-active SS bonds). Although SS bonds are present in all groups of organisms, no comparative analyses of their frequency in proteins from different domains of life have been made to date. Using the Protein Data Bank, the number and subcellular locations of SS bonds in Archaea, Bacteria and Eukarya have been compared. Approximately three times higher frequency of proteins with SS bonds in eukaryotic secretory organelles (e.g. endoplasmic reticulum) than in bacterial periplasmic/secretory pathways was calculated. Protein length also affects the SS bond frequency: the average number of SS bonds is positively correlated with the length for longer proteins (>200 amino acids), while for the shorter and less stable proteins (<200 amino acids) this correlation is negative. Medium-sized proteins (250-350 amino acids) indicated a high number of SS bonds only in Archaea which could be explained by the need for additional protein stabilization in hyperthermophiles. The results emphasize higher capacity for the SS bond formation and isomerization in Eukarya when compared with Archaea and Bacteria.
Affiliation(s)
- I Bošnjak
- Laboratory for Biology and Microbial Genetics, Department of Biochemical Engineering, Faculty of Food Technology and Biotechnology, Pierottijeva 6, 10000 Zagreb, Croatia
|
19
|
Frequent gene fissions associated with human pathogenic bacteria. Genomics 2014; 103:65-75. [DOI: 10.1016/j.ygeno.2014.02.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 08/08/2013] [Revised: 01/21/2014] [Accepted: 02/01/2014] [Indexed: 01/05/2023]
|
20
|
Bordner AJ, Mittelmann HD. A new formulation of protein evolutionary models that account for structural constraints. Mol Biol Evol 2013; 31:736-49. [PMID: 24307688 DOI: 10.1093/molbev/mst240] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 01/23/2023]
Abstract
Despite the importance of a thermodynamically stable structure with a conserved fold for protein function, almost all evolutionary models neglect site-site correlations that arise from physical interactions between neighboring amino acid sites. This is mainly due to the difficulty in formulating a computationally tractable model since rate matrices can no longer be used. Here, we introduce a general framework, based on factor graphs, for constructing probabilistic models of protein evolution with site interdependence. Conveniently, efficient approximate inference algorithms, such as Belief Propagation, can be used to calculate likelihoods for these models. We fit an amino acid substitution model of this type that accounts for both solvent accessibility and site-site correlations. Comparisons of the new model with rate matrix models and alternative structure-dependent models demonstrate that it better fits the sequence data. We also examine evolution within a family of homohexameric enzymes and find that site-site correlations between most contacting subunits contribute to a higher likelihood. In addition, we show that the new substitution model has a similar mathematical form to the one introduced in Rodrigue et al. (Rodrigue N, Lartillot N, Bryant D, Philippe H. 2005. Site interdependence attributed to tertiary structure in amino acid sequence evolution. Gene 347:207-217), although with different parameter interpretations and values. We also perform a statistical analysis of the effects of amino acids at neighboring sites on substitution probabilities and find a significant perturbation of most probabilities, further supporting the significant role of site-site interactions in protein evolution and motivating the development of new evolutionary models similar to the one described here. Finally, we discuss possible extensions and applications of the new substitution model.
|
21
|
Arenas M, Dos Santos HG, Posada D, Bastolla U. Protein evolution along phylogenetic histories under structurally constrained substitution models. Bioinformatics 2013; 29:3020-8. [PMID: 24037213 DOI: 10.1093/bioinformatics/btt530] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Models of molecular evolution aim at describing the evolutionary processes at the molecular level. However, current models rarely incorporate information from protein structure. Conversely, structure-based models of protein evolution have not been commonly applied to simulate sequence evolution in a phylogenetic framework, and they often ignore relevant evolutionary processes such as recombination. A simulation evolutionary framework that integrates substitution models that account for protein structure stability should be able to generate more realistic in silico evolved proteins for a variety of purposes. RESULTS: We developed a method to simulate protein evolution that combines models of protein folding stability, such that the fitness depends on the stability of the native state both with respect to unfolding and misfolding, with phylogenetic histories that can be either specified by the user or simulated with the coalescent under complex evolutionary scenarios, including recombination, demographics and migration. We have implemented this framework in a computer program called ProteinEvolver. Remarkably, comparing these models with empirical amino acid replacement models, we found that the former produce amino acid distributions closer to distributions observed in real protein families, and proteins that are predicted to be more stable. Therefore, we conclude that evolutionary models that consider protein stability and realistic evolutionary histories constitute a better approximation of the real evolutionary process.
Affiliation(s)
- Miguel Arenas
- Centre for Molecular Biology 'Severo Ochoa', Consejo Superior de Investigaciones Científicas (CSIC), Madrid, Spain and Department of Biochemistry, Genetics and Immunology, University of Vigo, Vigo, Spain
|
22
|
Gilbert JDJ, Acquisti C, Martinson HM, Elser JJ, Kumar S, Fagan WF. GRASP [Genomic Resource Access for Stoichioproteomics]: comparative explorations of the atomic content of 12 Drosophila proteomes. BMC Genomics 2013; 14:599. [PMID: 24007337 PMCID: PMC3844568 DOI: 10.1186/1471-2164-14-599] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 06/14/2012] [Accepted: 06/05/2013] [Indexed: 12/05/2022]
Abstract
BACKGROUND: "Stoichioproteomics" relates the elemental composition of proteins and proteomes to variation in the physiological and ecological environment. To help harness and explore the wealth of hypotheses made possible under this framework, we introduce GRASP (http://www.graspdb.net), a public bioinformatic knowledgebase containing information on the frequencies of 20 amino acids and atomic composition of their side chains. GRASP integrates comparative protein composition data with annotation data from multiple public databases. Currently, GRASP includes information on proteins of 12 sequenced Drosophila (fruit fly) proteomes, which will be expanded to include increasingly diverse organisms over time. In this paper we illustrate the potential of GRASP for testing stoichioproteomic hypotheses by conducting an exploratory investigation into the composition of 12 Drosophila proteomes, testing the prediction that protein atomic content is associated with species ecology and with protein expression levels. RESULTS: Elements varied predictably along multivariate axes. Species were broadly similar, with the D. willistoni proteome a clear outlier. As expected, individual protein atomic content within proteomes was influenced by protein function and amino acid biochemistry. Evolution in elemental composition across the phylogeny followed less predictable patterns, but was associated with broad ecological variation in diet. Using expression data available for D. melanogaster, we found evidence consistent with selection for efficient usage of elements within the proteome: as expected, nitrogen content was reduced in highly expressed proteins in most tissues, most strongly in the gut, where nutrients are assimilated, and least strongly in the germline. CONCLUSIONS: The patterns identified here using GRASP provide a foundation on which to base future research into the evolution of atomic composition in Drosophila and other taxa.
Affiliation(s)
- James D J Gilbert
- A08 Heydon-Lawrence Bdg, University of Sydney, Sydney NSW 2006, Australia
- University of Maryland, College Park, MD 20742, USA
| | - Claudia Acquisti
- WWU Munster, Institute for Evolution and Biodiversity, Hufferstr. 1, Munster 48149, Germany
- Center for Evolutionary Medicine and Informatics, Biodesign Institute, Arizona State University, Tempe, AZ 85287-5301, USA
- School of Life Sciences, Arizona State University, Tempe, AZ 85287-4501, USA
| | | | - James J Elser
- School of Life Sciences, Arizona State University, Tempe, AZ 85287-4501, USA
| | - Sudhir Kumar
- Center for Evolutionary Medicine and Informatics, Biodesign Institute, Arizona State University, Tempe, AZ 85287-5301, USA
- School of Life Sciences, Arizona State University, Tempe, AZ 85287-4501, USA
|
23
|
Shirota M, Kinoshita K. Analyses of the general rule on residue pair frequencies in local amino acid sequences of soluble, ordered proteins. Protein Sci 2013; 22:725-33. [PMID: 23526551 DOI: 10.1002/pro.2255] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/12/2012] [Revised: 01/26/2013] [Accepted: 03/14/2013] [Indexed: 11/10/2022]
Abstract
The amino acid sequences of soluble, ordered proteins with stable structures have evolved due to biological and physical requirements, thus distinguishing them from random sequences. Previous analyses have focused on extracting the features that frequently appear in protein substructures, such as α-helix and β-sheet, but the universal features of protein sequences have not been addressed. To clarify the differences between native protein sequences and random sequences, we analyzed 7368 soluble, ordered protein sequences, by inspecting the observed and expected occurrences of 400 amino acid pairs in local proximity, up to 10 residues along the sequence in comparison with their expected occurrence in random sequence. We found the trend that the hydrophobic residue pairs and the polar residue pairs are significantly decreased, whereas the pairs between a hydrophobic residue and a polar residue are increased. This trend was universally observed regardless of the secondary structure content but was not observed in protein sequences that include intrinsically disordered regions, indicating that it can be a general rule of protein foldability. The possible benefits of this rule are discussed from the viewpoints of protein aggregation and disorder, which are both caused by low-complexity regions of hydrophobic or polar residues.
Affiliation(s)
- Matsuyuki Shirota
- Department of Applied Information Sciences, Graduate School of Information Sciences, Tohoku University, Sendai, Miyagi, Japan.
|
24
|
Minning J, Porto M, Bastolla U. Detecting selection for negative design in proteins through an improved model of the misfolded state. Proteins 2013; 81:1102-12. [PMID: 23280507 DOI: 10.1002/prot.24244] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.9] [Received: 11/07/2012] [Accepted: 12/17/2012] [Indexed: 11/05/2022]
Abstract
Proteins that need to be structured in their native state must be stable both against the unfolded ensemble and against incorrectly folded (misfolded) conformations with low free energy. Positive design targets the first type of stability by strengthening native interactions. The second type of stability is achieved by destabilizing interactions that occur frequently in the misfolded ensemble, a strategy called negative design. Here, we investigate negative design adopting a statistical mechanical model of the misfolded ensemble, which improves the usual Gaussian approximation by taking into account the third moment of the energy distribution and contact correlations. Applying this model, we detect and quantify selection for negative design in most natural proteins, and we analytically design protein sequences that are stable both against unfolding and against misfolding.
Affiliation(s)
- Jonas Minning
- Institut für Festkörperphysik, Technische Universität Darmstadt, Darmstadt, Germany
|
25
|
Bastolla U, Bruscolini P, Velasco JL. Sequence determinants of protein folding rates: Positive correlation between contact energy and contact range indicates selection for fast folding. Proteins 2012; 80:2287-304. [DOI: 10.1002/prot.24118] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 12/16/2011] [Revised: 05/14/2012] [Accepted: 05/17/2012] [Indexed: 11/12/2022]
|
26
|
Liberles DA, Teichmann SA, Bahar I, Bastolla U, Bloom J, Bornberg-Bauer E, Colwell LJ, de Koning APJ, Dokholyan NV, Echave J, Elofsson A, Gerloff DL, Goldstein RA, Grahnen JA, Holder MT, Lakner C, Lartillot N, Lovell SC, Naylor G, Perica T, Pollock DD, Pupko T, Regan L, Roger A, Rubinstein N, Shakhnovich E, Sjölander K, Sunyaev S, Teufel AI, Thorne JL, Thornton JW, Weinreich DM, Whelan S. The interface of protein structure, protein biophysics, and molecular evolution. Protein Sci 2012; 21:769-85. [PMID: 22528593 PMCID: PMC3403413 DOI: 10.1002/pro.2071] [Citation(s) in RCA: 152] [Impact Index Per Article: 12.7] [Received: 03/02/2012] [Revised: 03/22/2012] [Accepted: 03/23/2012] [Indexed: 12/20/2022]
Abstract
The interface of protein structural biology, protein biophysics, molecular evolution, and molecular population genetics forms the foundations for a mechanistic understanding of many aspects of protein biochemistry. Current efforts in interdisciplinary protein modeling are in their infancy and the state of the art of such models is described. Beyond the relationship between amino acid substitution and static protein structure, protein function, and corresponding organismal fitness, other considerations are also discussed. More complex mutational processes such as insertion and deletion and domain rearrangements and even circular permutations should be evaluated. The role of intrinsically disordered proteins is still controversial, but may be increasingly important to consider. Protein geometry and protein dynamics as a deviation from static considerations of protein structure are also important. Protein expression level is known to be a major determinant of evolutionary rate and several considerations including selection at the mRNA level and the role of interaction specificity are discussed. Lastly, the relationship between modeling and needed high-throughput experimental data as well as experimental examination of protein evolution using ancestral sequence resurrection and in vitro biochemistry are presented, towards an aim of ultimately generating better models for biological inference and prediction.
Affiliation(s)
- David A Liberles
- Department of Molecular Biology, University of Wyoming, Laramie, Wyoming 82071
| | - Sarah A Teichmann
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, United Kingdom
| | - Ivet Bahar
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, Pennsylvania 15213
| | - Ugo Bastolla
- Bioinformatics Unit, Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autonoma de Madrid, 28049 Cantoblanco Madrid, Spain
| | - Jesse Bloom
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington 98109
| | - Erich Bornberg-Bauer
- Evolutionary Bioinformatics Group, Institute for Evolution and Biodiversity, University of Muenster, Germany
| | - Lucy J Colwell
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, United Kingdom
| | - A P Jason de Koning
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Colorado, Aurora, Colorado
| | - Nikolay V Dokholyan
- Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, North Carolina 27599
| | - Julian Echave
- Escuela de Ciencia y Tecnología, Universidad Nacional de San Martín, Martín de Irigoyen 3100, 1650 San Martín, Buenos Aires, Argentina
| | - Arne Elofsson
- Department of Biochemistry and Biophysics, Center for Biomembrane Research, Stockholm Bioinformatics Center, Science for Life Laboratory, Swedish E-science Research Center, Stockholm University, 106 91 Stockholm, Sweden
| | - Dietlind L Gerloff
- Biomolecular Engineering Department, University of California, Santa Cruz, California 95064
| | - Richard A Goldstein
- Division of Mathematical Biology, National Institute for Medical Research (MRC), Mill Hill, London NW7 1AA, United Kingdom
| | - Johan A Grahnen
- Department of Molecular Biology, University of Wyoming, Laramie, Wyoming 82071
| | - Mark T Holder
- Department of Ecology and Evolutionary Biology, University of Kansas, Lawrence, Kansas 66045
| | - Clemens Lakner
- Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina 27695
| | - Nicholas Lartillot
- Département de Biochimie, Faculté de Médecine, Université de Montréal, Montréal, QC H3T1J4, Canada
| | - Simon C Lovell
- Faculty of Life Sciences, University of Manchester, Manchester M13 9PT, United Kingdom
| | - Gavin Naylor
- Department of Biology, College of Charleston, Charleston, South Carolina 29424
| | - Tina Perica
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 0QH, United Kingdom
| | - David D Pollock
- Department of Biochemistry and Molecular Genetics, School of Medicine, University of Colorado, Aurora, Colorado
| | - Tal Pupko
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Lynne Regan
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven 06511
| | - Andrew Roger
- Department of Biochemistry and Molecular Biology, Dalhousie University, Halifax, NS, Canada
| | - Nimrod Rubinstein
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
| | - Eugene Shakhnovich
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts 02138
| | - Kimmen Sjölander
- Department of Bioengineering, University of California, Berkeley, Berkeley, California 94720
| | - Shamil Sunyaev
- Division of Genetics, Brigham and Women's Hospital, Harvard Medical School, 77 Avenue Louis Pasteur, Boston, Massachusetts 02115
| | - Ashley I Teufel
- Department of Molecular Biology, University of Wyoming, Laramie, Wyoming 82071
| | - Jeffrey L Thorne
- Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina 27695
| | - Joseph W Thornton
- Howard Hughes Medical Institute and Institute for Ecology and Evolution, University of Oregon, Eugene, Oregon 97403
- Department of Human Genetics, University of Chicago, Chicago, Illinois 60637
- Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637
| | - Daniel M Weinreich
- Department of Ecology and Evolutionary Biology, and Center for Computational Molecular Biology, Brown University, Providence, Rhode Island 02912
| | - Simon Whelan
- Faculty of Life Sciences, University of Manchester, Manchester M13 9PT, United Kingdom
| |
|
27
|
Danchin A, Binder PM, Noria S. Antifragility and Tinkering in Biology (and in Business) Flexibility Provides an Efficient Epigenetic Way to Manage Risk. Genes (Basel) 2011; 2:998-1016. [PMID: 24710302 PMCID: PMC3927596 DOI: 10.3390/genes2040998] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.5] [Received: 09/28/2011] [Revised: 10/25/2011] [Accepted: 11/16/2011] [Indexed: 12/25/2022] Open
Abstract
The notion of antifragility, an attribute of systems that makes them thrive under variable conditions, has recently been proposed by Nassim Taleb in a business context. This idea requires the ability of such systems to 'tinker', i.e., to creatively respond to changes in their environment. A fairly obvious example of this is natural selection-driven evolution. In this ubiquitous process, an original entity, challenged by an ever-changing environment, creates variants that evolve into novel entities. Analyzing functions that are essential during stationary-state life yields examples of entities that may be antifragile. One such example is proteins with flexible regions that can undergo functional alteration of their side residues or backbone and thus implement the tinkering that leads to antifragility. This in-built property of the cell chassis must be taken into account when considering construction of cell factories driven by engineering principles.
Affiliation(s)
- Antoine Danchin
- AMAbiotics SAS, CEA/Genoscope, 2 rue Gaston Crémieux, 91057 Evry Cedex, France.
| | - Philippe M Binder
- Natural Sciences Division, University of Hawaii, Hilo, HI 96720-4091, USA.
| | - Stanislas Noria
- Fondation Fourmentin-Guilbert, 2 avenue du Pavé Neuf, 93160 Noisy-le-Grand, France.
| |
|
28
|
Turunen HT, Sipilä P, Pujianto DA, Damdimopoulos AE, Björkgren I, Huhtaniemi I, Poutanen M. Members of the murine Pate family are predominantly expressed in the epididymis in a segment-specific fashion and regulated by androgens and other testicular factors. Reprod Biol Endocrinol 2011; 9:128. [PMID: 21942998 PMCID: PMC3192744 DOI: 10.1186/1477-7827-9-128] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 06/21/2011] [Accepted: 09/26/2011] [Indexed: 02/02/2023] Open
Abstract
BACKGROUND: Spermatozoa leaving the testis are not able to fertilize the egg in vivo. They must undergo further maturation in the epididymis. Proteins secreted to the epididymal lumen by the epithelial cells interact with the spermatozoa and enable these maturational changes, and are responsible for proper storage conditions before ejaculation. The present study was carried out in order to characterize the expression of a novel Pate (prostate and testis expression) gene family, coding for secreted cysteine-rich proteins, in the epididymis.
METHODS: Murine genome databases were searched and sequence comparisons were performed to identify members of the Pate gene family, and their expression profiles in several mouse tissues were characterized by RT-PCR. Alternate transcripts were identified by RT-PCR, sequencing and Northern hybridization. Also, to study the regulation of expression of Pate family genes by the testis, quantitative (q) RT-PCR analyses were performed to compare gene expression levels in the epididymides of intact mice, gonadectomized mice, and gonadectomized mice under testosterone replacement treatment.
RESULTS: A revised family tree of Pate genes is presented, including a previously uncharacterized Pate gene named Pate-X, and the data revealed that Acrv1 and Sslp1 should also be considered as members of the Pate family. Alternate splicing was observed for Pate-X, Pate-C and Pate-M. All the Pate genes studied are predominantly expressed in the epididymis, whereas expression in the testis and prostate is notably lower. Loss of androgens and/or testicular luminal factors was observed to affect the epididymal expression of several Pate genes.
CONCLUSIONS: We have characterized a gene cluster consisting of at least 14 expressed Pate gene members, including Acrv1, Sslp1 and a previously uncharacterized gene which we named Pate-X. The genes code for putatively secreted, cysteine-rich proteins with a TFP/Ly-6/uPAR domain. Members of the Pate gene cluster characterized are predominantly expressed in the murine epididymis, not in the testis or prostate, and are regulated by testicular factors. Similar proteins are present in venoms of several reptiles, and they are thought to mediate their effects by regulating certain ion channels, and are thus expected to have a clinical relevance in sperm maturation and epididymal infections.
Affiliation(s)
- Heikki T Turunen
- Department of Physiology, Institute of Biomedicine, University of Turku, Kiinamyllynkatu 10, FIN-20520, Turku, Finland
- Turku Graduate School of Biomedical Sciences, Kiinamyllynkatu 13, FIN-20520, Turku, Finland
| | - Petra Sipilä
- Department of Physiology, Institute of Biomedicine, University of Turku, Kiinamyllynkatu 10, FIN-20520, Turku, Finland
- Turku Center for Disease Modeling, Kiinamyllynkatu 10, FIN-20520, Turku, Finland
| | - Dwi Ari Pujianto
- Department of Physiology, Institute of Biomedicine, University of Turku, Kiinamyllynkatu 10, FIN-20520, Turku, Finland
- Department of Biology, Faculty of Medicine, University of Indonesia, Jakarta Pusat, Indonesia
| | - Anastasios E Damdimopoulos
- Department of Physiology, Institute of Biomedicine, University of Turku, Kiinamyllynkatu 10, FIN-20520, Turku, Finland
| | - Ida Björkgren
- Department of Physiology, Institute of Biomedicine, University of Turku, Kiinamyllynkatu 10, FIN-20520, Turku, Finland
- Turku Graduate School of Biomedical Sciences, Kiinamyllynkatu 13, FIN-20520, Turku, Finland
| | - Ilpo Huhtaniemi
- Department of Physiology, Institute of Biomedicine, University of Turku, Kiinamyllynkatu 10, FIN-20520, Turku, Finland
- Institute of Reproductive and Developmental Biology, Imperial College London, Hammersmith Campus, London W12 0NN, UK
| | - Matti Poutanen
- Department of Physiology, Institute of Biomedicine, University of Turku, Kiinamyllynkatu 10, FIN-20520, Turku, Finland
- Turku Center for Disease Modeling, Kiinamyllynkatu 10, FIN-20520, Turku, Finland
| |
|
29
|
Sawle L, Ghosh K. How do thermophilic proteins and proteomes withstand high temperature? Biophys J 2011; 101:217-27. [PMID: 21723832 PMCID: PMC3127178 DOI: 10.1016/j.bpj.2011.05.059] [Citation(s) in RCA: 98] [Impact Index Per Article: 7.5] [Received: 02/07/2011] [Revised: 05/26/2011] [Accepted: 05/27/2011] [Indexed: 11/30/2022] Open
Abstract
We attempt to understand the origin of enhanced stability in thermophilic proteins by analyzing thermodynamic data for 116 proteins, the largest data set achieved to date. We compute changes in entropy and enthalpy at the convergence temperature where different driving forces are maximally decoupled, in contrast to the majority of previous studies that were performed at the melting temperature. We find, on average, that the gain in enthalpy upon folding is lower in thermophiles than in mesophiles, whereas the loss in entropy upon folding is higher in mesophiles than in thermophiles. This implies that entropic stabilization may be responsible for the high melting temperature, and hints at residual structure or compactness of the denatured state in thermophiles. We find a similar trend by analyzing a homologous set of proteins classified based only on the optimum growth temperature of the organisms from which they were extracted. We find that the folding free energy at the temperature of maximal stability is significantly more favorable in thermophiles than in mesophiles, whereas the maximal stability temperature itself is similar between these two classes. Furthermore, we extend the thermodynamic analysis to model the entire proteome. The results explain the high optimal growth temperature in thermophilic organisms and are in excellent quantitative agreement with full thermal growth rate data obtained in a dozen thermophilic and mesophilic organisms.
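As an illustration of the "temperature of maximal stability" mentioned in this abstract, the sketch below uses the textbook Gibbs-Helmholtz expression with a temperature-independent heat capacity change. It is not taken from the paper: the unfolding sign convention, the constant-heat-capacity assumption, the function names, and the example parameter values are all assumptions chosen for illustration.

```python
import math


def unfolding_free_energy(T, Tm, dHm, dCp):
    """Gibbs-Helmholtz free energy of unfolding at temperature T (kelvin).

    Assumes a constant heat capacity change dCp; dHm is the unfolding
    enthalpy at the melting temperature Tm. Positive values mean the
    folded state is favored.
    """
    return dHm * (1.0 - T / Tm) + dCp * ((T - Tm) - T * math.log(T / Tm))


def max_stability_temperature(Tm, dHm, dCp):
    """Temperature at which the unfolding entropy is zero and stability peaks."""
    return Tm * math.exp(-dHm / (Tm * dCp))


# Hypothetical parameters (kcal/mol, kcal/mol/K, K), not data from the study.
Tm, dHm, dCp = 340.0, 100.0, 2.0
Ts = max_stability_temperature(Tm, dHm, dCp)
print(round(Ts, 1), round(unfolding_free_energy(Ts, Tm, dHm, dCp), 2))
```

Under these assumptions the stability curve is zero at Tm and reaches its maximum at Ts, which lies below Tm whenever dHm and dCp are both positive.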
Affiliation(s)
| | - Kingshuk Ghosh
- Department of Physics and Astronomy, University of Denver, Denver, Colorado
| |
|
30
|
Donald JE, Kulp DW, DeGrado WF. Salt bridges: geometrically specific, designable interactions. Proteins 2011; 79:898-915. [PMID: 21287621 DOI: 10.1002/prot.22927] [Citation(s) in RCA: 258] [Impact Index Per Article: 19.8] [Received: 07/12/2010] [Revised: 10/04/2010] [Accepted: 10/22/2010] [Indexed: 12/19/2022]
Abstract
Salt bridges occur frequently in proteins, providing conformational specificity and contributing to molecular recognition and catalysis. We present a comprehensive analysis of these interactions in protein structures by surveying a large database of protein structures. Salt bridges between Asp or Glu and His, Arg, or Lys display extremely well-defined geometric preferences. Several previously observed preferences are confirmed, and others that were previously unrecognized are discovered. Salt bridges are explored for their preferences for different separations in sequence and in space, geometric preferences within proteins and at protein-protein interfaces, co-operativity in networked salt bridges, inclusion within metal-binding sites, preference for acidic electrons, apparent conformational side chain entropy reduction on formation, and degree of burial. Salt bridges occur far more frequently between residues at close than distant sequence separations, but, at close distances, there remain strong preferences for salt bridges at specific separations. Specific types of complex salt bridges, involving three or more members, are also discovered. As we observe a strong relationship between the propensity to form a salt bridge and the placement of salt-bridging residues in protein sequences, we discuss the role that salt bridges might play in kinetically influencing protein folding and thermodynamically stabilizing the native conformation. We also develop a quantitative method to select appropriate crystal structure resolution and B-factor cutoffs. Detailed knowledge of these geometric and sequence dependences should aid de novo design and prediction algorithms.
Affiliation(s)
- Jason E Donald
- Department of Biochemistry and Biophysics, University of Pennsylvania School of Medicine, Philadelphia, Pennsylvania, USA
| | | | | |
|
31
|
Wong JWH, Ho SYW, Hogg PJ. Disulfide bond acquisition through eukaryotic protein evolution. Mol Biol Evol 2010; 28:327-34. [PMID: 20675408 DOI: 10.1093/molbev/msq194] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.3] [Indexed: 11/12/2022] Open
Abstract
Disulfide bonds play critical roles in protein stability and function. They are generally considered to be strongly conserved among species. Although there is compelling evidence in the literature for this conservation on a case-by-case basis, comparative genomic analyses of disulfide conservation have in the past been limited. By analyzing the conservation of all structurally validated disulfide bonds from the Protein Data Bank across 29 completely sequenced eukaryotic genomes, we observe elevated conservation of disulfide-bonded cysteines (half-cystines) compared with unpaired cysteines and other amino acids. Remarkably, half-cystines are even more conserved than tryptophan, the most conserved amino acid. Overall, once disulfide bonds are acquired in proteins, they are rarely lost. Moreover, the acquisition of disulfide bonds shows a strong positive correlation (R² = 0.74) with organismal complexity. Although the correlation weakens (R² = 0.59) when yeast is excluded from the analysis, this trend is still apparent when compared with the slightly negative correlation of unpaired cysteine acquisition with organismal complexity. The accrual of disulfide bonds is likely to reflect the demand for greater sophistication in protein function in complex species. Our findings provide further support for the increasing usage of cysteines in modern proteomes and suggest that there has been positive selection for disulfide bonds through eukaryotic evolution. Finally, we show that the acquisition of the functionally relevant disulfide bond in domain 2 of the CD4 protein occurred independently in primates and rodents.
Affiliation(s)
- Jason W H Wong
- Prince of Wales Clinical School and Lowy Cancer Research Centre, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia
| | | | | |
|
32
|
Sammet SG, Bastolla U, Porto M. Comparison of translation loads for standard and alternative genetic codes. BMC Evol Biol 2010; 10:178. [PMID: 20546599 PMCID: PMC2909233 DOI: 10.1186/1471-2148-10-178] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 08/07/2009] [Accepted: 06/14/2010] [Indexed: 11/25/2022] Open
Abstract
Background: The (almost) universality of the genetic code is one of the most intriguing properties of cellular life. Nevertheless, several variants of the standard genetic code have been observed, which differ in one or several of 64 codon assignments and occur mainly in mitochondrial genomes and in nuclear genomes of some bacterial and eukaryotic parasites. These variants are usually considered to be the result of non-adaptive evolution. It has been shown that the standard genetic code is preferential to randomly assembled codes for its ability to reduce the effects of errors in protein translation.
Results: Using a genotype-to-phenotype mapping based on a quantitative model of protein folding, we compare the standard genetic code to seven of its naturally occurring variants with respect to the fitness loss associated to mistranslation and mutation. These fitness losses are computed through computer simulations of protein evolution with mutations that are either neutral or lethal, and different mutation biases, which influence the balance between unfolding and misfolding stability. We show that the alternative codes may produce significantly different mutation and translation loads, particularly for genomes evolving with a rather large mutation bias. Most of the alternative genetic codes are found to be disadvantageous to the standard code, in agreement with the view that the change of genetic code is a mutationally driven event. Nevertheless, one of the studied alternative genetic codes is predicted to be preferable to the standard code for a broad range of mutation biases.
Conclusions: Our results show that, with one exception, the standard genetic code is generally better able to reduce the translation load than the naturally occurring variants studied here. Besides this exception, some of the other alternative genetic codes are predicted to be better adapted for extreme mutation biases. Hence, the fixation of alternative genetic codes might be a neutral or nearly-neutral event in the majority of the cases, but adaptation cannot be excluded for some of the studied cases.
Affiliation(s)
- Stefanie Gabriele Sammet
- Institut für Festkörperphysik, Technische Universität Darmstadt, Hochschulstr, 8, 64289 Darmstadt, Germany
| | | | | |
|
33
|
Bastolla U, Ortíz AR, Porto M, Teichert F. Effective connectivity profile: a structural representation that evidences the relationship between protein structures and sequences. Proteins 2008; 73:872-88. [PMID: 18536008 DOI: 10.1002/prot.22113] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Indexed: 11/10/2022]
Abstract
The complexity of protein structures calls for simplified representations of their topology. The simplest possible mathematical description of a protein structure is a one-dimensional profile representing, for instance, buriedness or secondary structure. This kind of representation has been introduced for studying the sequence to structure relationship, with applications to fold recognition. Here we define the effective connectivity profile (EC), a network theoretical profile that self-consistently represents the network structure of the protein contact matrix. The EC profile makes mathematically explicit the relationship between protein structure and protein sequence, because it allows predicting the average hydrophobicity profile (HP) and the distributions of amino acids at each site for families of homologous proteins sharing the same structure. In this sense, the EC provides an analytic solution to the statistical inverse folding problem, which consists in finding the statistical properties of the set of sequences compatible with a given structure. We tested these predictions with simulations of the structurally constrained neutral (SCN) model of protein evolution with structure conservation, for single- and multi-domain proteins, and for a wide range of mutation processes, the latter producing sequences with very different hydrophobicity profiles, finding that the EC-based predictions are accurate even when only one sequence of the family is known. The EC profile is very significantly correlated with the HP for sequence-structure pairs in the PDB as well. The EC profile generalizes the properties of previously introduced structural profiles to modular proteins such as multidomain chains, and its correlation with the sequence profile is substantially improved with respect to the previously defined profiles, particularly for long proteins. 
Furthermore, the EC profile has a dynamic interpretation, since the EC components are strongly inversely related with the temperature factors measured in X-ray experiments, meaning that positions with large EC component are more strongly constrained in their equilibrium dynamics. Last, the EC profile allows one to define a natural measure of modularity that correlates with the number of domains composing the protein, suggesting its application for domain decomposition. Finally, we show that structurally similar proteins have similar EC profiles, so that the similarity between aligned EC profiles can be used as a structure similarity measure, a property that we have recently applied for protein structure alignment. The code for computing the EC profile is available upon request by writing to ubastolla@cbm.uam.es, and the structural profiles discussed in this article can be downloaded from the SLOTH webserver http://www.fkp.tu-darmstadt.de/SLOTH/.
Affiliation(s)
- Ugo Bastolla
- Centro de Biología Molecular Severo Ochoa, (CSIC-UAM), Cantoblanco, 28049 Madrid, Spain.
| | | | | | | |
|
34
|
Ferrada E, Wagner A. Protein robustness promotes evolutionary innovations on large evolutionary time-scales. Proc Biol Sci 2008; 275:1595-602. [PMID: 18430649 DOI: 10.1098/rspb.2007.1617] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.5] [Indexed: 11/12/2022] Open
Abstract
Recent laboratory experiments suggest that a molecule's ability to evolve neutrally is important for its ability to generate evolutionary innovations. In contrast to laboratory experiments, life unfolds on time-scales of billions of years. Here, we ask whether a molecule's ability to evolve neutrally (a measure of its robustness) facilitates evolutionary innovation also on these large time-scales. To this end, we use protein designability, the number of sequences that can adopt a given protein structure, as an estimate of the structure's ability to evolve neutrally. Based on two complementary measures of functional diversity (catalytic diversity and molecular functional diversity in gene ontology), we show that more robust proteins have a greater capacity to produce functional innovations. Significant associations among structural designability, folding rate and intrinsic disorder also exist, underlining the complex relationship of the structural factors that affect protein evolution.
Affiliation(s)
- Evandro Ferrada
- Department of Biochemistry, University of Zurich, Building Y27, Winterthurerstrasse 190, 8057 Zurich, Switzerland.
| | | |
|
35
|
Shirota M, Ishida T, Kinoshita K. Effects of surface-to-volume ratio of proteins on hydrophilic residues: decrease in occurrence and increase in buried fraction. Protein Sci 2008; 17:1596-602. [PMID: 18556475 DOI: 10.1110/ps.035592.108] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 10/22/2022]
Abstract
The size of a protein is an important factor for understanding the sequence-structure relationship, and it affects both the amino acid composition and the residue burial of proteins. However, it is usually measured as the number of amino acids, although these effects would result from the reduction of surface regions relative to the volume of core regions in larger proteins. In addition, although these two effects are dependent on each other, they have been studied separately. In this study, we investigated them by considering the surface-to-volume ratio (SVR), and observed the correlation between them. We found that the reduction of several hydrophilic residues is more strongly correlated with SVR than with protein size (the number of amino acids) and that SVR directly affects the amino acid composition. The difference as a descriptor between SVR and size is also supported by the observation that the secondary structural elements correlate completely differently with SVR and with size. Furthermore, for the four most hydrophilic residues, glutamine, arginine, glutamic acid, and lysine, balances between the decrease in composition and the increase in core burial were observed. We found that the burial of glutamine and arginine became accelerated at SVR = 0.3 Å⁻¹ (approximately 132 residues) as the protein size increased, but that lysine has an upper limit of 0.9% for its occurrence in the core. The uniqueness of lysine was also elucidated by comparison with the burial environments of the four hydrophilic residues.
Affiliation(s)
- Matsuyuki Shirota
- Institute of Medical Science, The University of Tokyo, Minato-ku, Tokyo 108-8639, Japan
| | | | | |
|
36
|
Abstract
The amino acid composition of intrinsically disordered proteins and protein segments characteristically differs from that of ordered proteins. This observation forms the basis of several disorder prediction methods. These, however, usually perform worse for smaller proteins (or segments) than for larger ones. We show that the regions of amino acid composition space corresponding to ordered and disordered proteins overlap with each other, and the extent of the overlap (the "twilight zone") is larger for short than for long chains. To explain this finding, we used two-dimensional lattice model proteins containing hydrophobic, polar, and charged monomers and revealed the relation among chain length, amino acid composition, and disorder. Because the number of chain configurations exponentially grows with chain length, a larger fraction of longer chains can reach a low-energy, ordered state than do shorter chains. The amount of information carried by the amino acid composition about whether a protein or segment is (dis)ordered grows with increasing chain length. Smaller proteins rely more on specific interactions for stability, which limits the possible accuracy of disorder prediction methods. For proteins in the "twilight zone", size can determine order, as illustrated by the example of two-state homodimers.
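The statement that the number of chain configurations grows exponentially with chain length can be checked directly on a two-dimensional square lattice. The sketch below simply enumerates self-avoiding walks by depth-first search; it is an illustrative assumption about how lattice conformations can be counted, not the authors' model, and it ignores monomer types and energies entirely.

```python
def count_self_avoiding_walks(n_steps):
    """Count self-avoiding walks with n_steps bonds on the 2D square lattice.

    Walks start at the origin; rotations and reflections are counted
    separately. A chain with n_steps bonds has n_steps + 1 monomers.
    """
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def extend(position, visited, remaining):
        if remaining == 0:
            return 1
        total = 0
        x, y = position
        for dx, dy in moves:
            nxt = (x + dx, y + dy)
            if nxt not in visited:
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)  # backtrack before trying the next move
        return total

    return extend((0, 0), {(0, 0)}, n_steps)


for n in range(1, 7):
    print(n, count_self_avoiding_walks(n))
```

The counts (4, 12, 36, 100, 284, 780 for one to six bonds) grow by a factor that stays between 2.7 and 3 per added bond, which is the exponential growth the abstract refers to.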
|
37
|
Szilágyi A. A mathematically related singularity and the maximum size of protein domains. Proteins 2008; 71:2086-8; discussion 2089-90. [DOI: 10.1002/prot.22000] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/06/2022]
|
38
|
Monsellier E, Chiti F. Prevention of amyloid-like aggregation as a driving force of protein evolution. EMBO Rep 2007; 8:737-42. [PMID: 17668004 PMCID: PMC1978086 DOI: 10.1038/sj.embor.7401034] [Citation(s) in RCA: 191] [Impact Index Per Article: 11.2] [Received: 01/02/2007] [Accepted: 06/18/2007] [Indexed: 12/16/2022] Open
Abstract
Uncontrolled protein aggregation is a constant challenge in all compartments of living organisms. The failure of a peptide or protein to remain soluble often results in pathology. So far, more than 40 human diseases have been associated with the formation of extracellular fibrillar aggregates, known as amyloid fibrils, or structurally related intracellular deposits. It is well known that molecular chaperones and elaborate quality control mechanisms exist in the cell to counteract aggregation. However, an increasing number of reports during the past few years indicate that proteins have also evolved structural and sequence-based strategies to prevent aggregation. This review describes these strategies and the selection pressures that exist on protein sequences to combat their uncontrolled aggregation. We will describe the different types of mechanism evolved by proteins that adopt different conformational states including normally folded proteins, intrinsically disordered polypeptide chains, elastomeric systems and multimodular proteins.
Affiliation(s)
- Elodie Monsellier
- Dipartimento di Scienze Biochimiche, Università di Firenze, Viale Morgagni 50, I-50134, Firenze, Italy
| | - Fabrizio Chiti
- Dipartimento di Scienze Biochimiche, Università di Firenze, Viale Morgagni 50, I-50134, Firenze, Italy
| |
|
39
|
Monsellier E, Ramazzotti M, de Laureto PP, Tartaglia GG, Taddei N, Fontana A, Vendruscolo M, Chiti F. The distribution of residues in a polypeptide sequence is a determinant of aggregation optimized by evolution. Biophys J 2007; 93:4382-91. [PMID: 17766358 PMCID: PMC2098718 DOI: 10.1529/biophysj.107.111336] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.9] [Indexed: 01/06/2023] Open
Abstract
It has been shown that the propensity of a protein to form amyloid-like fibrils can be predicted with high accuracy from the knowledge of its amino acid sequence. It has also been suggested, however, that some regions of the sequences are more important than others in determining the aggregation process. Here, we have addressed this issue by constructing a set of "sequence scrambled" variants of the first 29 residues of horse heart apomyoglobin (apoMb(1-29)), in which the sequence was modified while maintaining the same amino acid composition. The clustering of the most amyloidogenic residues in one region of the sequence was found to cause a marked increase of the elongation rate (k(agg)) and a remarkable shortening of the lag phase (t(lag)) of the fibril growth, as determined by far-UV circular dichroism and thioflavin T fluorescence. We also show that taking explicitly into consideration the presence of aggregation-promoting regions in the predictive methods results in a quantitative agreement between the theoretical and observed k(agg) and t(lag) values of the apoMb(1-29) variants. These results, together with a comparison between homologous segments from the family of globins, indicate the existence of a negative selection against the clustering of highly amyloidogenic residues in one or few regions of polypeptide sequences.
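The "sequence scrambled" design described here, in which residue order changes while amino acid composition is held fixed, amounts to permuting the sequence. A minimal sketch follows; the peptide string is a hypothetical 29-residue placeholder rather than the real apoMb(1-29) sequence, and the residue set used to measure clustering is an arbitrary assumption, not the amyloidogenicity scale used in the study.

```python
import random
from collections import Counter


def scramble(sequence, rng):
    """Return a random permutation of `sequence` with identical composition."""
    residues = list(sequence)
    rng.shuffle(residues)
    return "".join(residues)


def longest_run(sequence, residue_set):
    """Length of the longest stretch made only of residues in `residue_set`."""
    best = current = 0
    for aa in sequence:
        current = current + 1 if aa in residue_set else 0
        best = max(best, current)
    return best


# Hypothetical 29-residue peptide; NOT the real apoMb(1-29) sequence.
PEPTIDE = "GASDKEVLIFWTNQRHGASDKEVLIFWTN"
# Hypothetical stand-in for "most amyloidogenic" residues.
CLUSTER_SET = set("VLIFW")

rng = random.Random(0)
variant = scramble(PEPTIDE, rng)
print(variant, longest_run(variant, CLUSTER_SET))
assert Counter(variant) == Counter(PEPTIDE)  # composition is preserved
```

Generating many variants and ranking them by `longest_run` is one simple way to pick scrambles with more or less clustering of a chosen residue class.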
Affiliation(s)
- Elodie Monsellier
- Dipartimento di Scienze Biochimiche, Università degli studi di Firenze, Florence, Italy
| | | | | | | | | | | | | | | |
|
40
|
Rykunov D, Fiser A. Effects of amino acid composition, finite size of proteins, and sparse statistics on distance-dependent statistical pair potentials. Proteins 2007; 67:559-68. [PMID: 17335003 DOI: 10.1002/prot.21279] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.8] [Indexed: 11/06/2022]
Abstract
Statistical distance dependent pair potentials are frequently used in a variety of folding, threading, and modeling studies of proteins. The applicability of these types of potentials is tightly connected to the reliability of statistical observations. We explored the possible origin and extent of false positive signals in statistical potentials by analyzing their distance dependence in a variety of randomized protein-like models. While on average potentials derived from such models are expected to equal zero at any distance, we demonstrate that systematic and significant distortions exist. These distortions originate from the limited statistical counts in local environments of proteins and from the limited size of protein structures at large distances. We suggest that these systematic errors in statistical potentials are connected to the dependence of amino acid composition on protein size and to variation in protein sizes. Additionally, atom-based potentials are dominated by a false positive signal that is due to correlation among distances measured from atoms of one residue to atoms of another residue. The significance of residue-based pairwise potentials at various spatial pair separations was assessed in this study and it was found that as few as approximately 50% of potential values were statistically significant at distances below 4 Å, and only at most approximately 80% of them were significant at larger pair separations. A new definition for reference state, free of the observed systematic errors, is suggested. It has been demonstrated to generate statistical potentials that compare favorably to other publicly available ones.
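Potentials of this kind are commonly obtained by inverse-Boltzmann conversion of observed versus reference pair counts in each distance bin. The sketch below shows that generic conversion only; the function name, the count lists, and the choice of reference state are assumptions for illustration and do not reproduce the new reference state proposed in the paper.

```python
import math


def pair_potential(observed, expected, kT=1.0):
    """Inverse-Boltzmann energy, -kT * ln(observed / expected), per distance bin.

    `observed` and `expected` are pair counts in matching distance bins.
    Bins with non-positive counts are rejected because the log ratio is
    undefined there.
    """
    energies = []
    for obs, exp in zip(observed, expected):
        if obs <= 0 or exp <= 0:
            raise ValueError("counts must be positive in every bin")
        energies.append(-kT * math.log(obs / exp))
    return energies


# Hypothetical counts: the second bin is enriched, the third is depleted.
print(pair_potential([10, 20, 5], [10, 10, 10]))
```

When observed and reference counts agree, the energy is zero, which is the expectation for a randomized model. With only a handful of counts per bin, the log ratio is noisy, which is one way to picture the sparse-statistics distortions the abstract reports.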
Affiliation(s)
- Dmitry Rykunov
- Department of Biochemistry, Seaver Center for Bioinformatics, Albert Einstein College of Medicine, Bronx, New York 10461, USA
|
41
|
Destri C, Miccio C. Simple stochastic model for the evolution of protein lengths. Phys Rev E Stat Nonlin Soft Matter Phys 2007; 76:011924. [PMID: 17677511 DOI: 10.1103/physreve.76.011924] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2007] [Indexed: 05/16/2023]
Abstract
We analyze a simple discrete-time stochastic process for the theoretical modeling of the evolution of protein lengths. At every step of the process, a new protein is produced as a modification of one of the proteins already existing, and its length is assumed to be a random variable that depends only on the length of the originating protein. Thus a random recursive tree is produced over the natural numbers. If (quasi) scale invariance is assumed, the length distribution in a single history tends to a log-normal form with a specific signature of the deviations from exact Gaussianity. Comparison with the very large Similarity Matrix of Proteins database shows good agreement.
Affiliation(s)
- C Destri
- Dipartimento di Fisica G. Occhialini, Università di Milano-Bicocca and INFN, Sezione di Milano, Piazza della Scienza 3, I-20126 Milano, Italy
|
42
|
Kaplan N, Morpurgo N, Linial M. Novel families of toxin-like peptides in insects and mammals: a computational approach. J Mol Biol 2007; 369:553-66. [PMID: 17433819 DOI: 10.1016/j.jmb.2007.02.106] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.7] [Received: 12/17/2006] [Revised: 02/14/2007] [Accepted: 02/21/2007] [Indexed: 11/19/2022]
Abstract
Most animal toxins are short proteins that appear in venom and vary in sequence, structure and function. A common characteristic of many such toxins is their apparent structural stability. Sporadic instances of endogenous toxin-like proteins that function in non-venom contexts have been reported. We have utilized machine learning methodology, based on sequence-derived features and guided by the notion of structural stability, in order to conduct a large-scale search for toxin and toxin-like proteins. Application of the method to insect and mammalian sequences revealed novel families of toxin-like proteins. One of these proteins shows significant similarity to ion channel inhibitors that are expressed in cone snail and assassin bug venom, and is surprisingly expressed in the bee brain. A toxicity assay in which the protein was injected into fish induced a strong yet reversible paralytic effect. We suggest that the protein may function as an endogenous modulator of voltage-gated Ca²⁺ channels. Additionally, we have identified a novel mammalian cluster of toxin-like proteins that are expressed in the testis. We suggest that these proteins might be involved in regulation of nicotinic acetylcholine receptors that affect the acrosome reaction and sperm motility. Finally, we highlight a possible evolutionary link between venom toxins and antibacterial proteins. We expect our methodology to enhance the discovery of additional novel protein families.
Affiliation(s)
- Noam Kaplan
- Department of Biological Chemistry, Institute of Life Sciences, The Hebrew University, Jerusalem, Israel.
|
43
|
Arolas JL, Bronsoms S, Ventura S, Aviles FX, Calvete JJ. Characterizing the tick carboxypeptidase inhibitor: molecular basis for its two-domain nature. J Biol Chem 2006; 281:22906-16. [PMID: 16760476 DOI: 10.1074/jbc.m602301200] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/06/2022] Open
Abstract
Tick carboxypeptidase inhibitor (TCI) is a small, disulfide-rich protein that selectively inhibits metallocarboxypeptidases and strongly accelerates the fibrinolysis of blood clots. TCI consists of two domains that are structurally very similar, each containing three disulfide bonds arranged in an almost identical fashion. The oxidative folding and reductive unfolding pathways of TCI and its separated domains have been characterized by kinetic and structural analysis of the acid-trapped folding intermediates. TCI folding proceeds through a sequential formation of 1-, 2-, 3-, 4-, 5-, and 6-disulfide species to reach the native form. Folding intermediates of TCI comprise two predominant 3-disulfide species (named IIIa and IIIb) and a major 6-disulfide scrambled isomer (Xa) that consecutively accumulate along the reaction and are strongly prevented by the presence of protein disulfide isomerase. This study demonstrates that IIIa and IIIb are 3-disulfide species containing the native disulfide pairings of the N- and C-terminal domains of TCI, respectively, and explains why the two domains of TCI fold sequentially and independently. Also, we show that the reductive unfolding of TCI undergoes two main independent unfolding events through the formation of IIIa and IIIb intermediates. Together, the comparison of the folding, stability, and inhibitory activity of TCI with those of the isolated domains reveals the reasons behind the two-domain nature of this protein: both domains contribute to the specificity and high affinity of its double-headed binding to carboxypeptidases. The results obtained herein provide valuable information for the design of more potent and selective TCI molecules.
Affiliation(s)
- Joan L Arolas
- Institut de Biotecnologia i Biomedicina and Departament de Bioquímica i Biologia Molecular, Universitat Autònoma de Barcelona, 08193 Bellaterra, Barcelona
|
44
|
Bastolla U, Porto M, Roman HE, Vendruscolo M. A protein evolution model with independent sites that reproduces site-specific amino acid distributions from the Protein Data Bank. BMC Evol Biol 2006; 6:43. [PMID: 16737532 PMCID: PMC1570368 DOI: 10.1186/1471-2148-6-43] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.8] [Received: 11/17/2005] [Accepted: 05/31/2006] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Since thermodynamic stability is a global property of proteins that has to be conserved during evolution, the selective pressure at a given site of a protein sequence depends on the amino acids present at other sites. However, models of molecular evolution that aim at reconstructing the evolutionary history of macromolecules become computationally intractable if such correlations between sites are explicitly taken into account.
RESULTS: We introduce an evolutionary model with sites evolving independently under a global constraint on the conservation of structural stability. This model consists of a selection process, which depends on two hydrophobicity parameters that can be computed from protein sequences without any fit, and a mutation process for which we consider various models. It reproduces quantitatively the results of Structurally Constrained Neutral (SCN) simulations of protein evolution in which the stability of the native state is explicitly computed and conserved. We then compare the predicted site-specific amino acid distributions with those sampled from the Protein Data Bank (PDB). The parameters of the mutation model, whose number varies between zero and five, are fitted from the data. The mean correlation coefficient between predicted and observed site-specific amino acid distributions is larger than ⟨r⟩ = 0.70 for a mutation model with no free parameters and no genetic code. In contrast, considering only the mutation process with no selection yields a mean correlation coefficient of ⟨r⟩ = 0.56 with three fitted parameters. The mutation model that best fits the data takes into account increased mutation rate at CpG dinucleotides, yielding ⟨r⟩ = 0.90 with five parameters.
CONCLUSION: The effective selection process that we propose reproduces well amino acid distributions as observed in the protein sequences in the PDB. Its simplicity makes it very promising for likelihood calculations in phylogenetic studies. Interestingly, in this approach the mutation process influences the effective selection process, i.e. selection and mutation must be entangled in order to obtain effectively independent sites. This interdependence between mutation and selection reflects the deep influence that mutation has on the evolutionary process: the bias in the mutation influences the thermodynamic properties of the evolving proteins, in agreement with comparative studies of bacterial proteomes, and it also influences the rate of accepted mutations.
Affiliation(s)
- Ugo Bastolla
- Centro de Biología Molecular "Severo Ochoa" (CSIC-UAM), Cantoblanco, 28049 Madrid, Spain
- Markus Porto
- Institut für Festkörperphysik, Technische Universität Darmstadt, Hochschulstr. 8, 64289 Darmstadt, Germany
- H Eduardo Roman
- Dipartimento di Fisica, Università di Milano Bicocca, Piazza della Scienza 3, 20126 Milano, Italy
- Michele Vendruscolo
- Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
|