1
Yadahalli S, Jayanthi LP, Gosavi S. A Method for Assessing the Robustness of Protein Structures by Randomizing Packing Interactions. Front Mol Biosci 2022; 9:849272. [PMID: 35832734 PMCID: PMC9271847 DOI: 10.3389/fmolb.2022.849272] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2022] [Accepted: 04/27/2022] [Indexed: 12/02/2022] Open
Abstract
Many single-domain proteins are not only stable and water-soluble, but they also populate few to no intermediates during folding. This reduces interactions between partially folded proteins, misfolding, and aggregation, and makes the proteins tractable in biotechnological applications. Natural proteins fold thus, not necessarily only because their structures are well-suited for folding, but because their sequences optimize packing and fit their structures well. In contrast, folding experiments on the de novo designed Top7 suggest that it populates several intermediates. Additionally, in de novo protein design, where sequences are designed for natural and new non-natural structures, tens of sequences still need to be tested before success is achieved. Both these issues may be caused by the specific scaffolds used in design, i.e., some protein scaffolds may be more tolerant to packing perturbations and varied sequences. Here, we report a computational method for assessing the response of protein structures to packing perturbations. We then benchmark this method using designed proteins and find that it can identify scaffolds whose folding gets disrupted upon perturbing packing, leading to the population of intermediates. The method can also isolate regions of both natural and designed scaffolds that are sensitive to such perturbations and identify contacts which when present can rescue folding. Overall, this method can be used to identify protein scaffolds that are more amenable to whole protein design as well as to identify protein regions which are sensitive to perturbations and where further mutations should be avoided during protein engineering.
2
Kolodny R, Nepomnyachiy S, Tawfik DS, Ben-Tal N. Bridging Themes: Short Protein Segments Found in Different Architectures. Mol Biol Evol 2021; 38:2191-2208. [PMID: 33502503 PMCID: PMC8136508 DOI: 10.1093/molbev/msab017] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Indexed: 12/20/2022] Open
Abstract
The vast majority of theoretically possible polypeptide chains do not fold, let alone confer function. Hence, protein evolution from preexisting building blocks has clear potential advantages over ab initio emergence from random sequences. In support of this view, sequence similarity between different proteins is generally indicative of common ancestry, and we collectively refer to such homologous sequences as "themes." At the domain level, sequence homology is routinely detected. However, short themes which are segments, or fragments of intact domains, are particularly interesting because they may provide hints about the emergence of domains, as opposed to divergence of preexisting domains, or their mixing-and-matching to form multi-domain proteins. Here we identified 525 representative short themes, comprising 20-80 residues that are unexpectedly shared between domains considered to have emerged independently. Among these "bridging themes" are ones shared between the most ancient domains, for example, Rossmann, P-loop NTPase, TIM-barrel, flavodoxin, and ferredoxin-like. We elaborate on several particularly interesting cases, where the bridging themes mediate ligand binding. Ligand binding may have contributed to the stability and the plasticity of these building blocks, and to their ability to invade preexisting domains or serve as starting points for completely new domains.
Affiliation(s)
- Rachel Kolodny
- Department of Computer Science, University of Haifa, Haifa, Israel
- Dan S Tawfik
- Department of Biomolecular Sciences, Weizmann Institute of Science, Rehovot, Israel
- Nir Ben-Tal
- George S. Wise Faculty of Life Sciences, Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
3
Searching protein space for ancient sub-domain segments. Curr Opin Struct Biol 2021; 68:105-112. [PMID: 33476896 DOI: 10.1016/j.sbi.2020.11.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/11/2020] [Accepted: 11/29/2020] [Indexed: 01/08/2023]
Abstract
Evolutionary processes that formed the current protein universe left their traces, among them homologous segments that recur, or are 'reused,' in multiple proteins. These reused segments, called 'themes,' can be found at various scales, the best known of which is the domain. Yet, recent studies have begun to focus on the evolutionary insights that can be derived from sub-domain-scale themes, which are candidates for traces of more ancient events. Characterizing these may provide clues to the emergence of domains. Particularly interesting are themes that are reused across dissimilar contexts, that is, where the rest of the protein domain differs. We survey computational studies identifying reused themes within different contexts at the sub-domain level.
4
Abstract
Life on Earth is driven by electron transfer reactions catalyzed by a suite of enzymes that comprise the superfamily of oxidoreductases (Enzyme Classification EC1). Most modern oxidoreductases are complex in their structure and chemistry and must have evolved from a small set of ancient folds. Ancient oxidoreductases from the Archean Eon between ca. 3.5 and 2.5 billion years ago have been long extinct, making it challenging to retrace evolution by sequence-based phylogeny or ancestral sequence reconstruction. However, three-dimensional topologies of proteins change more slowly than sequences. Using comparative structure and sequence profile-profile alignments, we quantify the similarity between proximal cofactor-binding folds and show that they are derived from a common ancestor. We discovered that two recurring folds were central to the origin of metabolism: ferredoxin and Rossmann-like folds. In turn, these two folds likely shared a common ancestor that, through duplication, recruitment, and diversification, evolved to facilitate electron transfer and catalysis at a very early stage in the origin of metabolism.
5
Afanasieva E, Chaudhuri I, Martin J, Hertle E, Ursinus A, Alva V, Hartmann MD, Lupas AN. Structural diversity of oligomeric β-propellers with different numbers of identical blades. eLife 2019; 8:49853. [PMID: 31613220 PMCID: PMC6805158 DOI: 10.7554/elife.49853] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 07/02/2019] [Accepted: 09/25/2019] [Indexed: 12/29/2022] Open
Abstract
β-Propellers arise through the amplification of a supersecondary structure element called a blade. This process produces toroids of between four and twelve repeats, which are almost always arranged sequentially in a single polypeptide chain. We found that new propellers evolve continuously by amplification from single blades. We therefore investigated whether such nascent propellers can fold as homo-oligomers before they have been fully amplified within a single chain. One- to six-bladed building blocks derived from two seven-bladed WD40 propellers yielded stable homo-oligomers with six to nine blades, depending on the size of the building block. High-resolution structures for tetramers of two blades, trimers of three blades, and dimers of four and five blades, respectively, show structurally diverse propellers and include a novel fold, highlighting the inherent flexibility of the WD40 blade. Our data support the hypothesis that subdomain-sized fragments can provide structural versatility in the evolution of new proteins.
Affiliation(s)
- Evgenia Afanasieva
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Indronil Chaudhuri
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Jörg Martin
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Eva Hertle
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Astrid Ursinus
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Vikram Alva
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Marcus D Hartmann
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Andrei N Lupas
- Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
6
Zhang W, Du L, Li F, Zhang X, Qu Z, Han L, Li Z, Sun J, Qi F, Yao Q, Sun Y, Geng C, Li S. Mechanistic Insights into Interactions between Bacterial Class I P450 Enzymes and Redox Partners. ACS Catal 2018. [DOI: 10.1021/acscatal.8b02913] [Citation(s) in RCA: 52] [Impact Index Per Article: 8.7] [Indexed: 11/30/2022]
Affiliation(s)
- Wei Zhang
- State Key Laboratory of Microbial Technology, Shandong University, Qingdao, Shandong 266237, China
- Shandong Provincial Key Laboratory of Synthetic Biology, CAS Key Laboratory of Biofuels, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, No. 189 Songling Road, Qingdao, Shandong 266101, China
- Lei Du
- Shandong Provincial Key Laboratory of Synthetic Biology, CAS Key Laboratory of Biofuels, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, No. 189 Songling Road, Qingdao, Shandong 266101, China
- Fengwei Li
- Shandong Provincial Key Laboratory of Synthetic Biology, CAS Key Laboratory of Biofuels, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, No. 189 Songling Road, Qingdao, Shandong 266101, China
- Xingwang Zhang
- Shandong Provincial Key Laboratory of Synthetic Biology, CAS Key Laboratory of Biofuels, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, No. 189 Songling Road, Qingdao, Shandong 266101, China
- Zepeng Qu
- Shandong Provincial Key Laboratory of Synthetic Biology, CAS Key Laboratory of Biofuels, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, No. 189 Songling Road, Qingdao, Shandong 266101, China
- University of Chinese Academy of Sciences, No. 19(A) Yuquan Road, Beijing 100049, China
- Lei Han
- College of Chemistry and Pharmaceutical Sciences, Qingdao Agricultural University, Qingdao, Shandong 266109, China
- Zhong Li
- Shandong Provincial Key Laboratory of Synthetic Biology, CAS Key Laboratory of Biofuels, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, No. 189 Songling Road, Qingdao, Shandong 266101, China
- University of Chinese Academy of Sciences, No. 19(A) Yuquan Road, Beijing 100049, China
- Jingran Sun
- Shandong Provincial Key Laboratory of Synthetic Biology, CAS Key Laboratory of Biofuels, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, No. 189 Songling Road, Qingdao, Shandong 266101, China
- Fengxia Qi
- Shandong Provincial Key Laboratory of Synthetic Biology, CAS Key Laboratory of Biofuels, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, No. 189 Songling Road, Qingdao, Shandong 266101, China
- Qiuping Yao
- Shandong Provincial Key Laboratory of Synthetic Biology, CAS Key Laboratory of Biofuels, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, No. 189 Songling Road, Qingdao, Shandong 266101, China
- Yue Sun
- Shandong Provincial Key Laboratory of Synthetic Biology, CAS Key Laboratory of Biofuels, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, No. 189 Songling Road, Qingdao, Shandong 266101, China
- Ce Geng
- Shandong Provincial Key Laboratory of Synthetic Biology, CAS Key Laboratory of Biofuels, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, No. 189 Songling Road, Qingdao, Shandong 266101, China
- Shengying Li
- Shandong Provincial Key Laboratory of Synthetic Biology, CAS Key Laboratory of Biofuels, Qingdao Institute of Bioenergy and Bioprocess Technology, Chinese Academy of Sciences, No. 189 Songling Road, Qingdao, Shandong 266101, China
- University of Chinese Academy of Sciences, No. 19(A) Yuquan Road, Beijing 100049, China
7
Raanan H, Pike DH, Moore EK, Falkowski PG, Nanda V. Modular origins of biological electron transfer chains. Proc Natl Acad Sci U S A 2018; 115:1280-1285. [PMID: 29358375 PMCID: PMC5819401 DOI: 10.1073/pnas.1714225115] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Indexed: 01/07/2023] Open
Abstract
Oxidoreductases catalyze electron transfer reactions that ultimately provide the energy for life. A limited set of ancestral protein-metal modules are presumably the building blocks that evolved into this diverse protein family. However, the identity of these modules and their path to modern oxidoreductases is unknown. Using a comparative structural analysis approach, we identify a set of fundamental electron transfer modules that have evolved to form the extant oxidoreductases. Using transition metal-containing cofactors as fiducial markers, it is possible to cluster cofactor microenvironments into as few as four major modules: bacterial ferredoxin, cytochrome c, symerythrin, and plastocyanin-type folds. From structural alignments, it is challenging to ascertain whether modules evolved from a single common ancestor (homology) or arose by independent convergence on a limited set of structural forms (analogy). Additional insight into common origins is contained in the spatial adjacency network (SPAN), which is based on proximity of modules in oxidoreductases containing multiple cofactor electron transfer chains. Electron transfer chains within complex modern oxidoreductases likely evolved through repeated duplication and diversification of ancient modular units that arose in the Archean eon.
Affiliation(s)
- Hagai Raanan
- Environmental Biophysics and Molecular Ecology Program, Department of Marine and Coastal Sciences, Rutgers University, New Brunswick, NJ 08901
- Center for Advanced Biotechnology and Medicine, Rutgers University, Piscataway, NJ 08854
- Douglas H Pike
- Center for Advanced Biotechnology and Medicine, Rutgers University, Piscataway, NJ 08854
- Eli K Moore
- Environmental Biophysics and Molecular Ecology Program, Department of Marine and Coastal Sciences, Rutgers University, New Brunswick, NJ 08901
- Paul G Falkowski
- Environmental Biophysics and Molecular Ecology Program, Department of Marine and Coastal Sciences, Rutgers University, New Brunswick, NJ 08901
- Department of Earth and Planetary Sciences, Rutgers University, New Brunswick, NJ 08901
- Vikas Nanda
- Center for Advanced Biotechnology and Medicine, Rutgers University, Piscataway, NJ 08854
- Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Rutgers University, Piscataway, NJ 08854
8
Jelen BI, Giovannelli D, Falkowski PG. The Role of Microbial Electron Transfer in the Coevolution of the Biosphere and Geosphere. Annu Rev Microbiol 2016; 70:45-62. [PMID: 27297124 DOI: 10.1146/annurev-micro-102215-095521] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Indexed: 11/09/2022]
Abstract
All life on Earth is dependent on biologically mediated electron transfer (i.e., redox) reactions that are far from thermodynamic equilibrium. Biological redox reactions originally evolved in prokaryotes and ultimately, over the first ∼2.5 billion years of Earth's history, formed a global electronic circuit. To maintain the circuit on a global scale requires that oxidants and reductants be transported; the two major planetary wires that connect global metabolism are geophysical fluids: the atmosphere and the oceans. Because all organisms exchange gases with the environment, the evolution of redox reactions has been a major force in modifying the chemistry at Earth's surface. Here we briefly review the discovery and consequences of redox reactions in microbes with a specific focus on the coevolution of life and geochemical phenomena.
Affiliation(s)
- Benjamin I Jelen
- Environmental Biophysics and Molecular Ecology Program, Institute of Earth, Ocean and Atmospheric Sciences, Rutgers University, New Brunswick, New Jersey 08901
- Donato Giovannelli
- Environmental Biophysics and Molecular Ecology Program, Institute of Earth, Ocean and Atmospheric Sciences, Rutgers University, New Brunswick, New Jersey 08901
- Institute of Marine Science, National Research Council, 60125 Ancona, Italy
- Program in Interdisciplinary Studies, Institute for Advanced Studies, Princeton, New Jersey 08540
- Earth-Life Science Institute, Tokyo Institute of Technology, Tokyo, Japan 152-8550
- Paul G Falkowski
- Environmental Biophysics and Molecular Ecology Program, Institute of Earth, Ocean and Atmospheric Sciences, Rutgers University, New Brunswick, New Jersey 08901
- Department of Earth and Planetary Sciences, Rutgers University, New Brunswick, New Jersey 08854
9
Akram MS, Ur Rehman J, Hall EAH. Engineered proteins for bioelectrochemistry. Annu Rev Anal Chem (Palo Alto Calif) 2014; 7:257-274. [PMID: 24818813 DOI: 10.1146/annurev-anchem-071213-020143] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 06/03/2023]
Abstract
It is only in the past two decades that excellent protein engineering tools have begun to meet parallel advances in materials chemistry, nanofabrication, and electronics. This is revealing scenarios from which synthetic enzymes can emerge, which were previously impossible, as well as interfaces with novel electrode materials. That means the control of the protein structure, electron transport pathway, and electrode surface can usher us into a new era of bioelectrochemistry. This article reviews the principle of electron transfer (ET) and considers how its application at the electrode, within the protein, and at a redox group is directing key advances in the understanding of protein structure to create systems that exhibit better efficiency and unique bioelectrochemistry.
Affiliation(s)
- Muhammad Safwan Akram
- Institute of Biotechnology, Department of Chemical Engineering and Biotechnology, University of Cambridge, Cambridge CB2 1QT, United Kingdom
10
Senn S, Nanda V, Falkowski P, Bromberg Y. Function-based assessment of structural similarity measurements using metal co-factor orientation. Proteins 2013; 82:648-56. [PMID: 24127252 DOI: 10.1002/prot.24442] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 05/22/2013] [Revised: 09/17/2013] [Accepted: 09/26/2013] [Indexed: 12/20/2022]
Abstract
Structure comparison is widely used to quantify protein relationships. Although there are several approaches to calculate structural similarity, specifying significance thresholds for similarity metrics is difficult due to the inherent likeness of common secondary structure elements. In this study, metal co-factor location is used to assess the biological relevance of structural alignments. The distance between the centroids of bound co-factors adds a chemical and function-relevant constraint to the structural superimposition of two proteins. This additional dimension can be used to define cut-off values for discriminating valid and spurious alignments in large alignment sets. The hypothesis underlying our approach is that metal coordination sites constrain structural evolution, thus revealing functional relationships between distantly related proteins. A comparison of three related nitrogenases shows the sequence and fold constraints imposed on the protein structures up to 18 Å away from the centers of their bound metal clusters.
Affiliation(s)
- Stefan Senn
- Environmental Biophysics and Molecular Ecology Program, Institute of Marine and Coastal Sciences, Rutgers University, New Brunswick, New Jersey, 08901
11
Roy A, Sarrou I, Vaughn MD, Astashkin AV, Ghirlanda G. De Novo Design of an Artificial Bis[4Fe-4S] Binding Protein. Biochemistry 2013; 52:7586-94. [DOI: 10.1021/bi401199s] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Indexed: 11/28/2022]
Affiliation(s)
- Anindya Roy
- Department of Chemistry and Biochemistry, Arizona State University, Tempe, Arizona 85287-1604, United States
- Iosifina Sarrou
- Department of Chemistry and Biochemistry, Arizona State University, Tempe, Arizona 85287-1604, United States
- Michael D. Vaughn
- Department of Chemistry and Biochemistry, Arizona State University, Tempe, Arizona 85287-1604, United States
- Andrei V. Astashkin
- Department of Chemistry and Biochemistry, University of Arizona, Tucson, Arizona 85721, United States
- Giovanna Ghirlanda
- Department of Chemistry and Biochemistry, Arizona State University, Tempe, Arizona 85287-1604, United States
12
Dey F, Cliff Zhang Q, Petrey D, Honig B. Toward a "structural BLAST": using structural relationships to infer function. Protein Sci 2013; 22:359-66. [PMID: 23349097 DOI: 10.1002/pro.2225] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 12/07/2012] [Revised: 01/17/2013] [Accepted: 01/17/2013] [Indexed: 02/05/2023]
Abstract
We outline a set of strategies to infer protein function from structure. The overall approach depends on extensive use of homology modeling, the exploitation of a wide range of global and local geometric relationships between protein structures and the use of machine learning techniques. The combination of modeling with broad searches of protein structure space defines a "structural BLAST" approach to infer function with high genomic coverage. Applications are described to the prediction of protein-protein and protein-ligand interactions. In the context of protein-protein interactions, our structure-based prediction algorithm, PrePPI, has comparable accuracy to high-throughput experiments. An essential feature of PrePPI involves the use of Bayesian methods to combine structure-derived information with non-structural evidence (e.g. co-expression) to assign a likelihood for each predicted interaction. This, combined with a structural BLAST approach significantly expands the range of applications of protein structure in the annotation of protein function, including systems level biological applications where it has previously played little role.
Affiliation(s)
- Fabian Dey
- Department of Biochemistry and Molecular Biophysics, Howard Hughes Medical Institute, Center for Computational Biology and Bioinformatics and Initiative in Systems Biology, Columbia University, New York, New York 10032, USA
13
On the evolutionary origins of "Fold Space Continuity": a study of topological convergence and divergence in mixed alpha-beta domains. J Struct Biol 2010; 172:244-52. [PMID: 20691788 DOI: 10.1016/j.jsb.2010.07.016] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 03/17/2010] [Revised: 06/25/2010] [Accepted: 07/31/2010] [Indexed: 11/21/2022]
Abstract
Existing protein structure classifications group proteins by overall structural similarity at the highest level and by evolutionary relationships at the lowest level, deriving higher-level groups by pairwise structure comparison. For this to be successful requires that large changes in structure are relatively rare in evolution and that proteins with no detectable evolutionary relationship do not converge on similar global chain conformations since this creates conflicts between structural and evolutionary consistency. Analysis of global structural changes using core topological descriptions for 4261 domains from classes C and D of the SCOP database and new measures of topological distance and consistency of classification showed that the topological consistency of SCOP folds is highly variable with some folds having no consistent description and significant overlaps between groups including some members of separate folds with identical topological descriptions. Topological clustering shows that including sufficient indels to allow family members to be joined would also require joining several distinct folds. We conclude that evolutionary changes in the global topology of protein domains are the root cause of many difficulties for present approaches to structure classification using pairwise comparison. As a resolution we propose that a purely structural classification should be created using an approach similar to that adopted by the Gene Ontology in which proteins are assigned labels describing structure.
14
Grzyb J, Xu F, Weiner L, Reijerse EJ, Lubitz W, Nanda V, Noy D. De novo design of a non-natural fold for an iron–sulfur protein: Alpha-helical coiled-coil with a four-iron four-sulfur cluster binding site in its central core. Biochim Biophys Acta Bioenerg 2010; 1797:406-13. [DOI: 10.1016/j.bbabio.2009.12.012] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.0] [Received: 09/29/2009] [Revised: 12/11/2009] [Accepted: 12/16/2009] [Indexed: 01/09/2023]
15
Remmert M, Biegert A, Linke D, Lupas AN, Söding J. Evolution of outer membrane beta-barrels from an ancestral beta beta hairpin. Mol Biol Evol 2010; 27:1348-58. [PMID: 20106904 DOI: 10.1093/molbev/msq017] [Citation(s) in RCA: 84] [Impact Index Per Article: 6.0] [Indexed: 11/14/2022] Open
Abstract
Outer membrane beta-barrels (OMBBs) are the major class of outer membrane proteins from Gram-negative bacteria, mitochondria, and plastids. Their transmembrane domains consist of 8-24 beta-strands forming a closed, barrel-shaped beta-sheet around a central pore. Despite their obvious structural regularity, evidence for an origin by duplication or for a common ancestry has not been found. We use three complementary approaches to show that all OMBBs from Gram-negative bacteria evolved from a single, ancestral beta beta hairpin. First, we link almost all families of known single-chain bacterial OMBBs with each other through transitive profile searches. Second, we identify a clear repeat signature in the sequences of many OMBBs in which the repeating sequence unit coincides with the structural beta beta hairpin repeat. Third, we show that the observed sequence similarity between OMBB hairpins cannot be explained by structural or membrane constraints on their sequences. The third approach addresses a longstanding problem in protein evolution: how to distinguish between a very remotely homologous relationship and the opposing scenario of "sequence convergence." The origin of a diverse group of proteins from a single hairpin module supports the hypothesis that, around the time of transition from the RNA to the protein world, proteins arose by amplification and recombination of short peptide modules that had previously evolved as cofactors of RNAs.
Affiliation(s)
- M Remmert
- Department of Biochemistry, Gene Center Munich and Center for Integrated Protein Science (CIPSM), Ludwig-Maximilians-Universität München, Munich, Germany
16
17
Sadreyev RI, Kim BH, Grishin NV. Discrete-continuous duality of protein structure space. Curr Opin Struct Biol 2009; 19:321-8. [PMID: 19482467 PMCID: PMC3688466 DOI: 10.1016/j.sbi.2009.04.009] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.8] [Received: 04/08/2009] [Revised: 04/29/2009] [Accepted: 04/29/2009] [Indexed: 11/30/2022]
Abstract
Recently, the nature of protein structure space has been widely discussed in the literature. The traditional discrete view of protein universe as a set of separate folds has been criticized in the light of growing evidence that almost any arrangement of secondary structures is possible and the whole protein space can be traversed through a path of similar structures. Here we argue that the discrete and continuous descriptions are not mutually exclusive, but complementary: the space is largely discrete in evolutionary sense, but continuous geometrically when purely structural similarities are quantified. Evolutionary connections are mainly confined to separate structural prototypes corresponding to folds as islands of structural stability, with few remaining traceable links between the islands. However, for a geometric similarity measure, it is usually possible to find a reasonable cutoff that yields paths connecting any two structures through intermediates.
Affiliation(s)
- Ruslan I. Sadreyev
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390-9050, USA
- Bong-Hyun Kim
- Department of Biochemistry, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390-9050, USA
- Nick V. Grishin
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390-9050, USA
- Department of Biochemistry, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390-9050, USA
18
Petrey D, Honig B. Is protein classification necessary? Toward alternative approaches to function annotation. Curr Opin Struct Biol 2009; 19:363-8. [PMID: 19269161 PMCID: PMC2745633 DOI: 10.1016/j.sbi.2009.02.001] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Received: 01/28/2009] [Accepted: 02/02/2009] [Indexed: 11/16/2022]
Abstract
The current nonredundant protein sequence database contains over seven million entries and the number of individual functional domains is significantly larger than this value. The vast quantity of data associated with these proteins poses enormous challenges to any attempt at function annotation. Classification of proteins into sequence and structural groups has been widely used as an approach to simplifying the problem. In this article we question such strategies. We describe how the multifunctionality and structural diversity of even closely related proteins confounds efforts to assign function on the basis of overall sequence or structural similarity. Rather, we suggest that strategies that avoid classification may offer a more robust approach to protein function annotation.
Affiliation(s)
- Donald Petrey
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, Columbia University, New York, NY 10032, USA
|
19
|
Vinogradov SN, Hoogewijs D, Bailly X, Mizuguchi K, Dewilde S, Moens L, Vanfleteren JR. A model of globin evolution. Gene 2007; 398:132-42. [PMID: 17540514 DOI: 10.1016/j.gene.2007.02.041] [Citation(s) in RCA: 84] [Impact Index Per Article: 4.9] [Received: 12/06/2006] [Revised: 02/20/2007] [Accepted: 02/21/2007] [Indexed: 11/19/2022]
Abstract
Putative globins have been identified in 426 bacterial, 32 archaeal and 67 eukaryotic genomes. These sequences reveal the hitherto unsuspected presence of single-domain sensor globins in Bacteria, Fungi, and a Euryarchaeote. Bayesian phylogenetic trees suggest that their occurrence in the latter two groups could be the result of lateral gene transfer from Bacteria. Iterated PSI-BLAST searches based on groups of globin sequences indicate that bacterial flavohemoglobins are closer to metazoan globins than to the other two lineages, the 2-over-2 globins and the globin-coupled sensors. Since Bacteria is the only kingdom to have all the subgroups of the three globin lineages, we propose a working model of globin evolution based on the assumption that all three lineages originated and evolved only in Bacteria. Although the 2-over-2 globins and the globin-coupled sensors recognize flavohemoglobins, there is little recognition between them. Thus, in the first stage of globin evolution, we favor a flavohemoglobin-like single-domain protein as the ancestral globin. The next stage comprised the splitting off of single-domain 2-over-2 and sensor-like globins, followed by the covalent addition of C-terminal domains resulting in the chimeric flavohemoglobins and globin-coupled sensors. The last stage encompassed the lateral gene transfers of some members of the three globin lineages to specific groups of Archaea and Eukaryotes.
Affiliation(s)
- Serge N Vinogradov
- Department of Biochemistry and Molecular Biology, Wayne State University School of Medicine, Detroit, MI 48201, USA.
|
20
|
Sadreyev RI, Grishin NV. Exploring dynamics of protein structure determination and homology-based prediction to estimate the number of superfamilies and folds. BMC Struct Biol 2006; 6:6. [PMID: 16549009 PMCID: PMC1444916 DOI: 10.1186/1472-6807-6-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 12/27/2005] [Accepted: 03/20/2006] [Indexed: 11/10/2022]
Abstract
Background: As tertiary structure is currently available only for a fraction of known protein families, it is important to assess what parts of sequence space have been structurally characterized. We consider protein domains whose structure can be predicted by sequence similarity to proteins with solved structure and address the following questions. Do these domains represent an unbiased random sample of all sequence families? Do targets solved by structural genomic initiatives (SGI) provide such a sample? What are the approximate total numbers of structure-based superfamilies and folds among soluble globular domains? Results: To make these assessments, we combine two approaches: (i) sequence analysis and homology-based structure prediction for proteins from complete genomes; and (ii) monitoring the dynamics of the assigned structure set over time, with the accumulation of experimentally solved structures. In the Clusters of Orthologous Groups (COG) database, we map the growing population of structurally characterized domain families onto the network of sequence-based connections between domains. This mapping reveals a systematic bias suggesting that target families for structure determination tend to be located in highly populated areas of sequence space. In contrast, the subset of domains whose structure is initially inferred by SGI is similar to a random sample from the whole population. To account for the observed bias, we propose a new non-parametric approach to the estimation of the total numbers of structural superfamilies and folds, which does not rely on a specific model of the sampling process. Based on the dynamics of robust distribution-based parameters in the growing set of structure predictions, we estimate the total numbers of superfamilies and folds among soluble globular proteins in the COG database. Conclusion: The set of currently solved protein structures allows for structure prediction in approximately a third of sequence-based domain families. The choice of targets for structure determination is biased towards domains with many sequence-based homologs. The growing SGI output in the future should further contribute to the reduction of this bias. The total numbers of structural superfamilies and folds in the COG database are estimated as ~4000 and ~1700, respectively. These numbers are respectively four and three times higher than the numbers of superfamilies and folds that can currently be assigned to COG proteins.
Affiliation(s)
- Ruslan I Sadreyev
- Howard Hughes Medical Institute/Department of Biochemistry, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390-8816, USA
- Nick V Grishin
- Howard Hughes Medical Institute/Department of Biochemistry, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd, Dallas, TX 75390-8816, USA
|