1
Sidar A, Voshol GP, Arentshorst M, Ram AFJ, Vijgenboom E, Punt PJ. Deciphering domain structures of Aspergillus and Streptomyces GH3-β-Glucosidases: a screening system for enzyme engineering and biotechnological applications. BMC Res Notes 2024; 17:257. [PMID: 39256846 PMCID: PMC11389254 DOI: 10.1186/s13104-024-06896-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Accepted: 08/13/2024] [Indexed: 09/12/2024] Open
Abstract
The glycoside hydrolase family 3 (GH3) β-glucosidases from filamentous fungi are crucial industrial enzymes facilitating the complete degradation of lignocellulose by converting cello-oligosaccharides and cellobiose into glucose. Understanding the diverse domain organization is essential for elucidating their biological roles and potential biotechnological applications. This research delves into the variability of domain organization within GH3 β-glucosidases. Two distinct configurations were identified in fungal GH3 β-glucosidases, one comprising solely the GH3 catalytic domain, and another incorporating the GH3 domain with a C-terminal fibronectin type III (Fn3) domain. Notably, Streptomyces filamentous bacteria showcased a separate clade of GH3 proteins linking the GH3 domain to a carbohydrate binding module from family 2 (CBM2). As a first step towards exploring the role of accessory domains in β-glucosidase activity, a screening system utilizing the well-characterised Aspergillus niger β-glucosidase gene (bglA) in a bglA deletion mutant host was developed. In this screening system, the reintroduced native GH3-Fn3 gene was successfully expressed, allowing detection of the protein with different enzymatic assays. Further investigation into the role of the accessory domains in GH3 family proteins, including those from Streptomyces, will be required to design improved chimeric β-glucosidase enzymes for industrial application.
Affiliation(s)
- Andika Sidar
  - Institute of Biology Leiden, Fungal Genetics and Biotechnology, Leiden University, Leiden, The Netherlands
  - Department of Food and Agricultural Product Technology, Gadjah Mada University, Yogyakarta, Indonesia
- Gerben P Voshol
  - Institute of Biology Leiden, Fungal Genetics and Biotechnology, Leiden University, Leiden, The Netherlands
  - Genomescan, Leiden, The Netherlands
- Mark Arentshorst
  - Institute of Biology Leiden, Fungal Genetics and Biotechnology, Leiden University, Leiden, The Netherlands
- Arthur F J Ram
  - Institute of Biology Leiden, Fungal Genetics and Biotechnology, Leiden University, Leiden, The Netherlands
- Erik Vijgenboom
  - Institute of Biology Leiden, Fungal Genetics and Biotechnology, Leiden University, Leiden, The Netherlands
- Peter J Punt
  - Institute of Biology Leiden, Fungal Genetics and Biotechnology, Leiden University, Leiden, The Netherlands
  - Ginkgo Bioworks, Zeist, The Netherlands
2
Douglas J, Cui H, Perona JJ, Vargas-Rodriguez O, Tyynismaa H, Carreño CA, Ling J, Ribas de Pouplana L, Yang XL, Ibba M, Becker H, Fischer F, Sissler M, Carter CW, Wills PR. AARS Online: A collaborative database on the structure, function, and evolution of the aminoacyl-tRNA synthetases. IUBMB Life 2024. [PMID: 39247978 DOI: 10.1002/iub.2911] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/14/2024] [Accepted: 08/07/2024] [Indexed: 09/10/2024]
Abstract
The aminoacyl-tRNA synthetases (aaRS) are a large group of enzymes that implement the genetic code in all known biological systems. They attach amino acids to their cognate tRNAs, moonlight in various translational and non-translational activities beyond aminoacylation, and are linked to many genetic disorders. The aaRS have a subtle ontology characterized by structural and functional idiosyncrasies that vary from organism to organism, and protein to protein. Across the tree of life, the 22 coded amino acids are handled by 16 evolutionary families of Class I aaRS and 21 families of Class II aaRS. We introduce AARS Online, an interactive Wikipedia-like tool curated by an international consortium of field experts. This platform systematizes existing knowledge about the aaRS by showcasing a taxonomically diverse selection of aaRS sequences and structures. Through its graphical user interface, AARS Online facilitates a seamless exploration between protein sequence and structure, providing a friendly introduction to the material for non-experts and a useful resource for experts. Curated multiple sequence alignments can be extracted for downstream analyses. Accessible at www.aars.online, AARS Online is a free resource to delve into the world of the aaRS.
Affiliation(s)
- Jordan Douglas
  - Department of Physics, University of Auckland, New Zealand
  - Centre for Computational Evolution, University of Auckland, New Zealand
- Haissi Cui
  - Department of Chemistry, University of Toronto, Canada
- John J Perona
  - Department of Chemistry, Portland State University, Portland, Oregon, USA
- Oscar Vargas-Rodriguez
  - Department of Molecular Biology and Biophysics, University of Connecticut, Storrs, Connecticut, USA
- Henna Tyynismaa
  - Stem Cells and Metabolism Research Program, Faculty of Medicine, University of Helsinki, Finland
- Jiqiang Ling
  - Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, Maryland, USA
- Lluís Ribas de Pouplana
  - Institute for Research in Biomedicine, The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain
  - Catalan Institution for Research and Advanced Studies, Barcelona, Catalonia, Spain
- Xiang-Lei Yang
  - Department of Molecular Medicine, The Scripps Research Institute, La Jolla, California, USA
- Michael Ibba
  - Biological Sciences, Chapman University, Orange, California, USA
- Hubert Becker
  - Génétique Moléculaire, Génomique Microbiologique, University of Strasbourg, France
- Frédéric Fischer
  - Génétique Moléculaire, Génomique Microbiologique, University of Strasbourg, France
- Marie Sissler
  - Génétique Moléculaire, Génomique Microbiologique, University of Strasbourg, France
- Charles W Carter
  - Department of Biochemistry and Biophysics, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
- Peter R Wills
  - Department of Physics, University of Auckland, New Zealand
  - Centre for Computational Evolution, University of Auckland, New Zealand
3
Haikal Y, Blazeck J. Exploiting protein domain modularity to enable synthetic control of engineered cells. Current Opinion in Biomedical Engineering 2024; 31:100550. [PMID: 39430298 PMCID: PMC11486415 DOI: 10.1016/j.cobme.2024.100550] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2024]
Abstract
The ability to precisely control cellular function in response to external stimuli can enhance the function and safety of cell therapies. In this review, we will detail how the modularity of protein domains has been exploited for cellular control applications, specifically through design of multifunctional synthetic constructs and controllable split moieties. These advances, which build on techniques developed by biologists, protein chemists and drug developers, harness natural evolutionary tendencies of protein domain fusion and fission. In this light, we will highlight recent advances towards the development of novel immunoreceptors, base editors, and cytokines that have achieved intriguing therapeutic potential by taking advantage of well-known protein evolutionary phenomena and have helped cells learn new tricks via synthetic biology. In general, protein modularity, i.e., the relatively facile separation or (re)assembly of functional single protein domains or subdomains, is becoming an enabling phenomenon for cellular engineering by allowing enhanced control of phenotypic responses.
Affiliation(s)
- Yusef Haikal
  - School of Chemical and Biomolecular Engineering, Georgia Institute of Technology, Atlanta GA 30332, USA
- John Blazeck
  - School of Chemical and Biomolecular Engineering, Georgia Institute of Technology, Atlanta GA 30332, USA
4
Kaminskaya AN, Evpak AS, Belogurov AA, Kudriaeva AA. Tracking of Ubiquitin Signaling through 3.5 Billion Years of Combinatorial Conjugation. Int J Mol Sci 2024; 25:8671. [PMID: 39201358 PMCID: PMC11354881 DOI: 10.3390/ijms25168671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/11/2024] [Revised: 08/01/2024] [Accepted: 08/02/2024] [Indexed: 09/02/2024] Open
Abstract
Ubiquitination is an evolutionarily ancient system of post-translational modification of proteins that occurs through a cascade involving ubiquitin activation, transfer, and conjugation. The maturation of this system has followed two main pathways. The first is the conservation of a universal structural fold of ubiquitin and ubiquitin-like proteins, which are present in both Archaea and Bacteria, as well as in multicellular Eukaryotes. The second is the rise of the complexity of the superfamily of ligases, which conjugate ubiquitin-like proteins to substrates, in terms of an increase in the number of enzyme variants, greater variation in structural organization, and the diversification of their catalytic domains. Here, we examine the diversity of the ubiquitination system among different organisms, assessing the variety and conservation of the key domains of the ubiquitination enzymes and ubiquitin itself. Our data show that E2 ubiquitin-conjugating enzymes of metazoan phyla are highly conserved, whereas the homology of E3 ubiquitin ligases with human orthologues gradually decreases depending on "molecular clock" timing and evolutionary distance. Surprisingly, Chordata and Echinodermata, which diverged over 0.5 billion years ago during the Cambrian explosion, share almost the same homology with humans in the amino acid sequences of E3 ligases but not in their adaptor proteins. These observations may suggest that, firstly, the E2 superfamily already existed in its current form in the last common metazoan ancestor and was generally not affected by purifying selection in metazoans. Secondly, it may indicate convergent evolution of the ubiquitination system and highlight E3 adaptor proteins as the "upper deck" of the ubiquitination system, which plays a crucial role in chordate evolution.
Affiliation(s)
- Alena N. Kaminskaya
  - Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia
- Alena S. Evpak
  - Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia
- Alexey A. Belogurov
  - Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia
  - Department of Biological Chemistry, Russian University of Medicine, Ministry of Health of Russian Federation, 127473 Moscow, Russia
- Anna A. Kudriaeva
  - Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, 117997 Moscow, Russia
5
Windels A, Franceus J, Pleiss J, Desmet T. CANDy: Automated analysis of domain architectures in carbohydrate-active enzymes. PLoS One 2024; 19:e0306410. [PMID: 38990885 PMCID: PMC11238990 DOI: 10.1371/journal.pone.0306410] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Accepted: 06/17/2024] [Indexed: 07/13/2024] Open
Abstract
Carbohydrate-active enzymes (CAZymes) can be found in all domains of life and play a crucial role in metabolic and physiological processes. CAZymes often possess a modular structure, comprising not only catalytic domains but also associated domains such as carbohydrate-binding modules (CBMs) and linker domains. By exploring the modular diversity of CAZy families, catalysts with novel properties can be discovered and further insight into their biological functions and evolutionary relationships can be obtained. Here we present the carbohydrate-active enzyme domain analysis tool (CANDy), an assembly of several novel scripts, tools and databases that allows users to analyze the domain architecture of all protein sequences in a given CAZy family. CANDy's usability is shown on glycoside hydrolase family 48, a small yet underexplored family containing multi-domain enzymes. Our analysis reveals the existence of 35 distinct domain assemblies, including eight known architectures, with the remaining assemblies awaiting characterization. Moreover, we substantiate the occurrence of horizontal gene transfer from prokaryotes to insect orthologs and provide evidence for the subsequent removal of auxiliary domains, likely through a gene fission event. CANDy is available at https://github.com/PyEED/CANDy.
Affiliation(s)
- Alex Windels
  - Department of Biotechnology, Centre for Synthetic Biology (CSB), Ghent University, Ghent, Belgium
- Jorick Franceus
  - Department of Biotechnology, Centre for Synthetic Biology (CSB), Ghent University, Ghent, Belgium
- Jürgen Pleiss
  - Institute of Biochemistry and Technical Biochemistry, University of Stuttgart, Stuttgart, Germany
- Tom Desmet
  - Department of Biotechnology, Centre for Synthetic Biology (CSB), Ghent University, Ghent, Belgium
6
Mikhailova AA, Dohmen E, Harrison MC. Major changes in domain arrangements are associated with the evolution of termites. J Evol Biol 2024; 37:758-769. [PMID: 38630634 DOI: 10.1093/jeb/voae047] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/31/2023] [Revised: 12/18/2023] [Accepted: 04/12/2024] [Indexed: 04/19/2024]
Abstract
Domains as functional protein units and their rearrangements along the phylogeny can shed light on the functional changes of proteomes associated with the evolution of complex traits like eusociality. This complex trait is associated with sterile soldiers and workers, and long-lived, highly fecund reproductives. Unlike in Hymenoptera (ants, bees, and wasps), the evolution of eusociality within Blattodea, where termites evolved from within cockroaches, was accompanied by a reduction in proteome size, raising the question of whether functional novelty was achieved with existing rather than novel proteins. To address this, we investigated the role of domain rearrangements during the evolution of termite eusociality. Analysing domain rearrangements in the proteomes of three solitary cockroaches and five eusocial termites, we inferred more than 5,000 rearrangements over the phylogeny of Blattodea. The 90 novel domain arrangements that emerged at the origin of termites were enriched for several functions related to longevity, such as protein homeostasis, DNA repair, mitochondrial activity, and nutrient sensing. Many domain rearrangements were related to changes in developmental pathways, important for the emergence of novel castes. Along with the elaboration of social complexity, including permanently sterile workers and larger, foraging colonies, we found 110 further domain arrangements with functions related to protein glycosylation and ion transport. We found an enrichment of caste-biased expression and splicing within rearranged genes, highlighting their importance for the evolution of castes. Furthermore, we found increased levels of DNA methylation among rearranged compared to non-rearranged genes suggesting fundamental differences in their regulation. Our findings indicate the importance of domain rearrangements in the generation of functional novelty necessary for termite eusociality to evolve.
Affiliation(s)
- Alina A Mikhailova
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Elias Dohmen
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Mark C Harrison
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
7
Aubel M, Buchel F, Heames B, Jones A, Honc O, Bornberg-Bauer E, Hlouchova K. High-throughput Selection of Human de novo-emerged sORFs with High Folding Potential. Genome Biol Evol 2024; 16:evae069. [PMID: 38597156 PMCID: PMC11024478 DOI: 10.1093/gbe/evae069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Revised: 03/11/2024] [Accepted: 03/23/2024] [Indexed: 04/11/2024] Open
Abstract
De novo genes emerge from previously noncoding stretches of the genome. Their encoded de novo proteins are generally expected to be similar to random sequences and, accordingly, to have no stable tertiary fold and high predicted disorder. However, structural properties of de novo proteins and whether they differ during the stages of emergence and fixation have not been studied in depth and rely heavily on predictions. Here we generated a library of short human putative de novo proteins of varying lengths and ages and sorted the candidates according to their structural compactness and disorder propensity. Using Förster resonance energy transfer combined with fluorescence-activated cell sorting, we were able to screen the library for the most compact protein structures, as well as the most elongated and flexible structures. We find that compact de novo proteins are on average slightly shorter and contain lower predicted disorder than less compact ones. The predicted structures for most and least compact de novo proteins correspond to expectations in that they contain more secondary structure content or higher disorder content, respectively. Our experiments indicate that older de novo proteins have higher compactness and structural propensity compared with young ones. We discuss possible evolutionary scenarios and their implications underlying the age-dependencies of compactness and structural content of putative de novo proteins.
Affiliation(s)
- Margaux Aubel
  - Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
- Filip Buchel
  - Department of Cell Biology, Faculty of Science, Charles University, Prague, Czech Republic
  - Department of Biochemistry, Faculty of Science, Charles University, Prague, Czech Republic
- Brennen Heames
  - Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
- Alun Jones
  - Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
- Ondrej Honc
  - Imaging Methods Core Facility, BIOCEV, Prague, Czech Republic
- Erich Bornberg-Bauer
  - Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
  - Department of Protein Evolution, Max Planck-Institute for Biology Tuebingen, Tuebingen, Germany
- Klara Hlouchova
  - Department of Cell Biology, Faculty of Science, Charles University, Prague, Czech Republic
  - Institute of Organic Chemistry and Biochemistry, Czech Academy of Sciences, Prague, Czech Republic
8
Coleman T, Shin J, Silberg JJ, Shamoo Y, Atkinson JT. The Biochemical Impact of Extracting an Embedded Adenylate Kinase Domain Using Circular Permutation. Biochemistry 2024; 63:599-609. [PMID: 38357768 DOI: 10.1021/acs.biochem.3c00605] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2024]
Abstract
Adenylate kinases (AKs) have evolved AMP-binding and lid domains that are encoded as continuous polypeptides embedded at different locations within the discontinuous polypeptide encoding the core domain. A prior study showed that AK homologues of different stabilities consistently retain cellular activity following circular permutation that splits a region with high energetic frustration within the AMP-binding domain into discontinuous fragments. Herein, we show that mesophilic and thermophilic AKs having this topological restructuring retain activity and substrate-binding characteristics of the parental AK. While permutation decreased the activity of both AK homologues at physiological temperatures, the catalytic activity of the thermophilic AK increased upon permutation when assayed >30 °C below the melting temperature of the native AK. The thermostabilities of the permuted AKs were uniformly lower than those of native AKs, and they exhibited multiphasic unfolding transitions, unlike the native AKs, which presented cooperative thermal unfolding. In addition, proteolytic digestion revealed that permutation destabilized each AK in differing manners, and mass spectrometry suggested that the new termini within the AMP-binding domain were responsible for the increased proteolysis sensitivity. These findings illustrate how changes in contact order can be used to tune enzyme activity and alter folding dynamics in multidomain enzymes.
Affiliation(s)
- Tom Coleman
  - Department of BioSciences, Rice University, MS-140, 6100 Main Street, Houston, Texas 77005, United States
- John Shin
  - Department of BioSciences, Rice University, MS-140, 6100 Main Street, Houston, Texas 77005, United States
- Jonathan J Silberg
  - Department of BioSciences, Rice University, MS-140, 6100 Main Street, Houston, Texas 77005, United States
  - Department of Chemical and Biomolecular Engineering, Rice University, MS-362, 6100 Main Street, Houston, Texas 77005, United States
  - Department of Bioengineering, Rice University, MS-142, 6100 Main Street, Houston, Texas 77005, United States
- Yousif Shamoo
  - Department of BioSciences, Rice University, MS-140, 6100 Main Street, Houston, Texas 77005, United States
- Joshua T Atkinson
  - Department of BioSciences, Rice University, MS-140, 6100 Main Street, Houston, Texas 77005, United States
  - Department of Physics and Astronomy, University of Southern California, Los Angeles, California 90007, United States
  - Department of Civil and Environmental Engineering, Princeton University, Princeton, New Jersey 08544, United States
  - Omenn-Darling Bioengineering Institute, Princeton University, Princeton, New Jersey 08544, United States
9
Feldmeyer B, Bornberg-Bauer E, Dohmen E, Fouks B, Heckenhauer J, Huylmans AK, Jones ARC, Stolle E, Harrison MC. Comparative Evolutionary Genomics in Insects. Methods Mol Biol 2024; 2802:473-514. [PMID: 38819569 DOI: 10.1007/978-1-0716-3838-5_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/01/2024]
Abstract
Genome sequencing quality, in terms of both read length and accuracy, is constantly improving. By combining long-read sequencing technologies with various scaffolding techniques, chromosome-level genome assemblies are now achievable at an affordable price for non-model organisms. Insects represent an exciting taxon for studying the genomic underpinnings of evolutionary innovations, due to ancient origins, immense species-richness, and broad phenotypic diversity. Here we summarize some of the most important methods for carrying out a comparative genomics study on insects. We describe available tools and offer concrete tips on all stages of such an endeavor from DNA extraction through genome sequencing, annotation, and several evolutionary analyses. Along the way we describe important insect-specific aspects, such as DNA extraction difficulties or gene families that are particularly difficult to annotate, and offer solutions. We describe results from several examples of comparative genomics analyses on insects to illustrate the fascinating questions that can now be addressed in this new age of genomics research.
Affiliation(s)
- Barbara Feldmeyer
  - Senckenberg Biodiversity and Climate Research Centre (SBiK-F), Molecular Ecology, Frankfurt, Germany
- Erich Bornberg-Bauer
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
  - Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Elias Dohmen
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Bertrand Fouks
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Jacqueline Heckenhauer
  - LOEWE Centre for Translational Biodiversity Genomics (LOEWE-TBG), Frankfurt, Germany
  - Department of Terrestrial Zoology, Senckenberg Research Institute and Natural History Museum Frankfurt, Frankfurt, Germany
- Ann Kathrin Huylmans
  - Institute of Organismic and Molecular Evolution, Johannes Gutenberg University, Mainz, Germany
- Alun R C Jones
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Eckart Stolle
  - Museum Koenig, Leibniz Institute for the Analysis of Biodiversity Change (LIB), Bonn, Germany
- Mark C Harrison
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
10
Gollapalli P, Rudrappa S, Kumar V, Santosh Kumar HS. Domain Architecture Based Methods for Comparative Functional Genomics Toward Therapeutic Drug Target Discovery. J Mol Evol 2023; 91:598-615. [PMID: 37626222 DOI: 10.1007/s00239-023-10129-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2022] [Accepted: 08/06/2023] [Indexed: 08/27/2023]
Abstract
Novel functions arise during evolution when genes duplicate, mutate, recombine, fuse, or undergo fission to produce new genes, or when genes are formed de novo. Researchers have tried to quantify the causes of these molecular diversification processes to understand how they increase molecular complexity over time, for instance in protein domain organization. In contrast to global sequence similarity, protein domain architectures can capture key structural and functional characteristics, making them better proxies for describing functional equivalence. In both prokaryotes and eukaryotes, domain designs have been shown to be retained over significant evolutionary distances. Protein domain architectures are now being utilized to categorize and distinguish evolutionarily related proteins and to find homologs among species that are evolutionarily distant from one another. Additionally, structural information stored in domain structures has accelerated homology identification and sequence search methods. Tools for functional protein annotation have been developed to identify protein domain content, domain order, domain recurrence, and domain position, all of which contribute to the accuracy of protein function prediction. In this review, an attempt is made to summarise facts and speculations regarding the use of protein domain architecture and modularity to identify possible therapeutic targets among cellular activities, based on an understanding of their linked biological processes.
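The architecture properties named in this abstract (domain content, order, and recurrence) can be made concrete with a small example. The following is a minimal Python sketch, not taken from any of the reviewed tools: it assumes an architecture is simply a list of domain labels ordered from N- to C-terminus, and the function names and toy architectures are hypothetical.

```python
from collections import Counter

def domain_content_similarity(a, b):
    """Jaccard index over the sets of domain labels (ignores order and copy number)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)

def shared_order_length(a, b):
    """Length of the longest common subsequence: domains shared in the same N-to-C order."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]

def domain_recurrence(a):
    """Domains that occur more than once in an architecture, with their copy numbers."""
    return {d: n for d, n in Counter(a).items() if n > 1}

# Toy architectures, listed N- to C-terminus (labels are illustrative only).
arch_1 = ["GH3", "Fn3"]
arch_2 = ["GH3", "CBM2"]
arch_3 = ["Fn3", "GH3", "Fn3"]

print(domain_content_similarity(arch_1, arch_2))
print(shared_order_length(arch_1, arch_3))
print(domain_recurrence(arch_3))
```

Real annotation pipelines would also track domain positions and score thresholds; this sketch leaves those out to keep the three measures easy to compare.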
Affiliation(s)
- Pavan Gollapalli
  - Center for Bioinformatics and Biostatistics, Nitte (Deemed to be University), Mangalore, Karnataka, 575018, India
- Sushmitha Rudrappa
  - Department of Biotechnology and Bioinformatics, Jnana Sahyadri Campus, Kuvempu University, Shankaraghatta, Shivamogga, Karnataka, 577451, India
- Vadlapudi Kumar
  - Department of Biochemistry, Davangere University, Shivagangothri, Davangere, Karnataka, 577007, India
- Hulikal Shivashankara Santosh Kumar
  - Department of Biotechnology and Bioinformatics, Jnana Sahyadri Campus, Kuvempu University, Shankaraghatta, Shivamogga, Karnataka, 577451, India
11
Barrera-Redondo J, Lotharukpong JS, Drost HG, Coelho SM. Uncovering gene-family founder events during major evolutionary transitions in animals, plants and fungi using GenEra. Genome Biol 2023; 24:54. [PMID: 36964572 PMCID: PMC10037820 DOI: 10.1186/s13059-023-02895-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2022] [Accepted: 03/10/2023] [Indexed: 03/26/2023] Open
Abstract
We present GenEra ( https://github.com/josuebarrera/GenEra ), a DIAMOND-fueled gene-family founder inference framework that addresses previously raised limitations and biases in genomic phylostratigraphy, such as homology detection failure. GenEra also reduces computational time from several months to a few days for any genome of interest. We analyze the emergence of taxonomically restricted gene families during major evolutionary transitions in plants, animals, and fungi. Our results indicate that the impact of homology detection failure on inferred patterns of gene emergence is lineage-dependent, suggesting that plants are more prone to evolve novelty through the emergence of new genes compared to animals and fungi.
Affiliation(s)
- Josué Barrera-Redondo
  - Department of Algal Development and Evolution, Max Planck Institute for Biology, Max-Planck-Ring 5, 72076, Tübingen, Germany
- Jaruwatana Sodai Lotharukpong
  - Department of Algal Development and Evolution, Max Planck Institute for Biology, Max-Planck-Ring 5, 72076, Tübingen, Germany
- Hajk-Georg Drost
  - Computational Biology Group, Department of Molecular Biology, Max Planck Institute for Biology, Max-Planck-Ring 5, 72076, Tübingen, Germany
- Susana M Coelho
  - Department of Algal Development and Evolution, Max Planck Institute for Biology, Max-Planck-Ring 5, 72076, Tübingen, Germany
12
Del Amparo R, González-Vázquez LD, Rodríguez-Moure L, Bastolla U, Arenas M. Consequences of Genetic Recombination on Protein Folding Stability. J Mol Evol 2023; 91:33-45. [PMID: 36463317 PMCID: PMC9849154 DOI: 10.1007/s00239-022-10080-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 07/21/2022] [Accepted: 11/25/2022] [Indexed: 12/05/2022]
Abstract
Genetic recombination is a common evolutionary mechanism that produces molecular diversity. However, its consequences on protein folding stability have not attracted the same attention as in the case of point mutations. Here, we studied the effects of homologous recombination on the computationally predicted protein folding stability for several protein families, finding less detrimental effects than we previously expected. Although recombination can affect multiple protein sites, we found that the fraction of recombined proteins that are eliminated by negative selection because of insufficient stability is not significantly larger than the corresponding fraction of proteins produced by mutation events. Indeed, although recombination disrupts epistatic interactions, the mean stability of recombinant proteins is not lower than that of their parents. On the other hand, the difference of stability between recombined proteins is amplified with respect to the parents, promoting phenotypic diversity. As a result, at least one third of recombined proteins present stability between those of their parents, and a substantial fraction have higher or lower stability than those of both parents. As expected, we found that parents with similar sequences tend to produce recombined proteins with stability close to that of the parents. Finally, the simulation of protein evolution along the ancestral recombination graph with empirical substitution models commonly used in phylogenetics, which ignore constraints on protein folding stability, showed that recombination favors the decrease of folding stability, supporting the convenience of adopting structurally constrained models when possible for inferences of protein evolutionary histories with recombination.
Affiliation(s)
- Roberto Del Amparo
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain; Departamento de Bioquímica, Genética e Inmunología, Universidade de Vigo, 36310 Vigo, Spain
- Luis Daniel González-Vázquez
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain; Departamento de Bioquímica, Genética e Inmunología, Universidade de Vigo, 36310 Vigo, Spain
- Laura Rodríguez-Moure
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain; Departamento de Bioquímica, Genética e Inmunología, Universidade de Vigo, 36310 Vigo, Spain
- Ugo Bastolla
  - Centre for Molecular Biology Severo Ochoa (CSIC-UAM), 28049 Madrid, Spain
- Miguel Arenas
  - CINBIO, Universidade de Vigo, 36310 Vigo, Spain; Departamento de Bioquímica, Genética e Inmunología, Universidade de Vigo, 36310 Vigo, Spain; Galicia Sur Health Research Institute (IIS Galicia Sur), 36310 Vigo, Spain
|
13
|
Kress A, Poch O, Lecompte O, Thompson JD. Real or fake? Measuring the impact of protein annotation errors on estimates of domain gain and loss events. FRONTIERS IN BIOINFORMATICS 2023; 3:1178926. [PMID: 37151482 PMCID: PMC10158824 DOI: 10.3389/fbinf.2023.1178926] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/03/2023] [Accepted: 04/05/2023] [Indexed: 05/09/2023] Open
Abstract
Protein annotation errors can have significant consequences in a wide range of fields, ranging from protein structure and function prediction to biomedical research, drug discovery, and biotechnology. By comparing the domains of different proteins, scientists can identify common domains, classify proteins based on their domain architecture, and highlight proteins that have evolved differently in one or more species or clades. However, genome-wide identification of different protein domain architectures involves a complex error-prone pipeline that includes genome sequencing, prediction of gene exon/intron structures, and inference of protein sequences and domain annotations. Here we developed an automated fact-checking approach to distinguish true domain loss/gain events from false events caused by errors that occur during the annotation process. Using genome-wide ortholog sets and taking advantage of the high-quality human and Saccharomyces cerevisiae genome annotations, we analyzed the domain gain and loss events in the predicted proteomes of 9 non-human primates (NHP) and 20 non-S. cerevisiae fungi (NSF) as annotated in the UniProt and InterPro databases. Our approach allowed us to quantify the impact of errors on estimates of protein domain gains and losses, and we show that domain losses are over-estimated ten-fold and three-fold in the NHP and NSF proteins respectively. This is in line with previous studies of gene-level losses, where issues with genome sequencing or gene annotation led to genes being falsely inferred as absent. In addition, we show that inconsistent protein domain annotations are a major factor contributing to the false events. For the first time, to our knowledge, we show that domain gains are also over-estimated by three-fold and two-fold respectively in NHP and NSF proteins.
Based on our more accurate estimates, we infer that true domain losses and gains in NHP with respect to humans are observed at similar rates, while domain gains in the more divergent NSF are observed twice as frequently as domain losses with respect to S. cerevisiae. This study highlights the need to critically examine the scientific validity of protein annotations, and represents a significant step toward scalable computational fact-checking methods that may one day mitigate the propagation of wrong information in protein databases.
|
14
|
The Modular Architecture of Metallothioneins Facilitates Domain Rearrangements and Contributes to Their Evolvability in Metal-Accumulating Mollusks. Int J Mol Sci 2022; 23:ijms232415824. [PMID: 36555472 PMCID: PMC9781358 DOI: 10.3390/ijms232415824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Revised: 12/05/2022] [Accepted: 12/10/2022] [Indexed: 12/15/2022] Open
Abstract
Protein domains are independent structural and functional modules that can rearrange to create new proteins. While the evolution of multidomain proteins through the shuffling of different preexisting domains has been well documented, the evolution of domain repeat proteins and the origin of new domains are less understood. Metallothioneins (MTs) provide a good case study considering that they consist of metal-binding domain repeats, some of them with a likely de novo origin. In mollusks, for instance, most MTs are bidomain proteins that arose by lineage-specific rearrangements between six putative domains: α, β1, β2, β3, γ and δ. Some domains have been characterized in bivalves and gastropods, but nothing is known about the MTs and their domains of other Mollusca classes. To fill this gap, we investigated the metal-binding features of NpoMT1 of Nautilus pompilius (Cephalopoda class) and FcaMT1 of Falcidens caudatus (Caudofoveata class). Interestingly, whereas NpoMT1 consists of α and β1 domains and has a prototypical Cd²⁺ preference, FcaMT1 has a singular preference for Zn²⁺ ions and a distinct domain composition, including a new Caudofoveata-specific δ domain. Overall, our results suggest that the modular architecture of MTs has contributed to MT evolution during mollusk diversification, and exemplify how modularity increases MT evolvability.
|
15
|
Does Generic Cyclic Kinase Insert Domain of Receptor Tyrosine Kinase KIT Clone Its Native Homologue? Int J Mol Sci 2022; 23:ijms232112898. [PMID: 36361689 PMCID: PMC9656684 DOI: 10.3390/ijms232112898] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Revised: 10/12/2022] [Accepted: 10/18/2022] [Indexed: 11/23/2022] Open
Abstract
Receptor tyrosine kinases (RTKs) are modular membrane proteins possessing both well-folded and disordered domains acting together in ligand-induced activation and regulation of post-transduction processes that tightly couple extracellular and cytoplasmic events. They ensure the fine-tuning control of signal transmission by signal transduction. Deregulation of RTK KIT, including overexpression and gain of function mutations, has been detected in several human cancers. In this paper, we analysed by in silico techniques the Kinase Insert Domain (KID), a key platform of KIT transduction processes, as a generic macrocycle (KIDGC), a cleaved isolated polypeptide (KIDC), and a natively fused TKD domain (KIDD). We found that these KID species have similar structural and dynamic characteristics indicating the intrinsically disordered nature of this domain. This finding means that both polypeptides, cyclic KIDGC and linear KIDC, are valid models of KID integrated into the RTK KIT and will be helpful for further computational and empirical studies of post-transduction KIT events.
|
16
|
Jayaraman V, Toledo-Patiño S, Noda-García L, Laurino P. Mechanisms of protein evolution. Protein Sci 2022; 31:e4362. [PMID: 35762715 PMCID: PMC9214755 DOI: 10.1002/pro.4362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Revised: 05/11/2022] [Accepted: 05/14/2022] [Indexed: 11/06/2022]
Abstract
How do proteins evolve? How do changes in sequence mediate changes in protein structure, and in turn in function? This question has multiple angles, ranging from biochemistry and biophysics to evolutionary biology. This review provides a brief integrated view of some key mechanistic aspects of protein evolution. First, we explain how protein evolution is primarily driven by randomly acquired genetic mutations and selection for function, and how these mutations can even give rise to completely new folds. Then, we also comment on how phenotypic protein variability, including promiscuity, transcriptional and translational errors, may also accelerate this process, possibly via "plasticity-first" mechanisms. Finally, we highlight open questions in the field of protein evolution, with respect to the emergence of more sophisticated protein systems such as protein complexes, pathways, and the emergence of pre-LUCA enzymes.
Affiliation(s)
- Vijay Jayaraman
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Saacnicteh Toledo-Patiño
  - Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Lianet Noda-García
  - Department of Plant Pathology and Microbiology, Institute of Environmental Sciences, Robert H. Smith Faculty of Agriculture, Food and Environment, Hebrew University of Jerusalem, Rehovot, Israel
- Paola Laurino
  - Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
|
17
|
Cui X, Xue Y, McCormack C, Garces A, Rachman TW, Yi Y, Stolzer M, Durand D. Simulating domain architecture evolution. Bioinformatics 2022; 38:i134-i142. [PMID: 35758772 PMCID: PMC9236583 DOI: 10.1093/bioinformatics/btac242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Abstract
Motivation: Simulation is an essential technique for generating biomolecular data with a ‘known’ history for use in validating phylogenetic inference and other evolutionary methods. On longer time scales, simulation supports investigations of equilibrium behavior and provides a formal framework for testing competing evolutionary hypotheses. Twenty years of molecular evolution research have produced a rich repertoire of simulation methods. However, current models do not capture the stringent constraints acting on the domain insertions, duplications, and deletions by which multidomain architectures evolve. Although these processes have the potential to generate any combination of domains, only a tiny fraction of possible domain combinations are observed in nature. Modeling these stringent constraints on domain order and co-occurrence is a fundamental challenge in domain architecture simulation that does not arise with sequence and gene family simulation.
Results: Here, we introduce a stochastic model of domain architecture evolution to simulate evolutionary trajectories that reflect the constraints on domain order and co-occurrence observed in nature. This framework is implemented in a novel domain architecture simulator, DomArchov, using the Metropolis–Hastings algorithm with data-driven transition probabilities. The use of a data-driven event module enables quick and easy redeployment of the simulator for use in different taxonomic and protein function contexts. Using empirical evaluation with metazoan datasets, we demonstrate that domain architectures simulated by DomArchov recapitulate properties of genuine domain architectures that reflect the constraints on domain order and adjacency seen in nature. This work expands the realm of evolutionary processes that are amenable to simulation.
Availability and implementation: DomArchov is written in Python 3 and is available at http://www.cs.cmu.edu/~durand/DomArchov. The data underlying this article are available via the same link.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Xiaoyue Cui
  - Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Yifan Xue
  - Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Collin McCormack
  - Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Alejandro Garces
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Thomas W Rachman
  - Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Yang Yi
  - Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA; Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Maureen Stolzer
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Dannie Durand
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
|
18
|
New Genomic Signals Underlying the Emergence of Human Proto-Genes. Genes (Basel) 2022; 13:genes13020284. [PMID: 35205330 PMCID: PMC8871994 DOI: 10.3390/genes13020284] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/17/2021] [Revised: 01/20/2022] [Accepted: 01/24/2022] [Indexed: 12/04/2022] Open
Abstract
De novo genes are novel genes which emerge from non-coding DNA. So far, little is known about the properties of de novo genes in relation to their age and mechanisms of emergence. In this study, we investigate four related properties: introns, upstream regulatory motifs, 5′ untranslated regions (UTRs) and protein domains, in 23,135 human proto-genes. We found that proto-genes contain introns, whose number and position correlate with the genomic position of proto-gene emergence. The origin of these introns is debated, as our results suggest that 41% of proto-genes might have captured existing introns, and 13.7% of them do not splice the ORF. We show that proto-genes which emerged via overprinting tend to be more enriched in core promoter motifs, while intergenic and intronic genes are more enriched in enhancers, even if the TATA motif is most commonly found upstream in these genes. Intergenic and intronic 5′ UTRs of proto-genes have a lower potential to stabilise mRNA structures than exonic proto-genes and established human genes. Finally, we confirm that proteins expressed by proto-genes gain new putative domains with age. Overall, we find that regulatory motifs inducing transcription and translation of previously non-coding sequences may facilitate proto-gene emergence. Our study demonstrates that introns, 5′ UTRs, and domains have specific properties in proto-genes. We also emphasize that the genomic positions of de novo genes strongly impact these properties.
|
19
|
Murcia-Garzón J, Méndez-Tenorio A. Promiscuous Domains in Eukaryotes and HAT Proteins in FUNGI Have Followed Different Evolutionary Paths. J Mol Evol 2022; 90:124-138. [PMID: 35084521 DOI: 10.1007/s00239-021-10046-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2020] [Accepted: 12/27/2021] [Indexed: 10/19/2022]
Abstract
Diverse studies have shown that the content of genes present in sequenced genomes does not seem to correlate with the complexity of the organisms. However, various studies have shown that organism complexity and the size of the proteome are, indeed, significantly correlated. This characteristic allows us to postulate that some molecular mechanisms have permitted a greater functional diversity to some proteins to increase their participation in developing organisms with higher complexity. Among those mechanisms, domain promiscuity, defined as the ability of domains to organize in combination with other distinct domains, is of great importance for the evolution of organisms. Previous works have analyzed the degree of domain promiscuity of the proteomes, showing how it seems to have paralleled the evolution of eukaryotic organisms. This has motivated the present study, where we analyzed the domain promiscuity in a collection of 84 eukaryotic proteomes representative of all the taxonomic groups of the tree of life. Using a grammar definition approach, we determined the architecture of 1,223,227 proteins, made up of 2,296,371 domains, which established 839,184 bigram types. The phylogenetic reconstructions based on differences in the content of information from measures of proteome promiscuity confirm that the evolution of the promiscuity of domains in eukaryotic organisms resembles the evolutionary history of the species. However, a close analysis of the PHD and RING domains, the most promiscuous domains found in fungi and functional components of chromatin remodeling enzymes and important expression regulators, suggests an evolution according to their function.
Affiliation(s)
- Jazmín Murcia-Garzón
  - Laboratorio de Biotecnología Vegetal, Centro de Biotecnología Genómica, Instituto Politécnico Nacional, Boulevard del Maestro S/N esq. Elías Piña, Col. Narciso Mendoza, 88710, Reynosa, Tamaulipas, Mexico
- Alfonso Méndez-Tenorio
  - Laboratorio de Biotecnología y Bioinformática Genómica, Departamento de Bioquímica, Escuela Nacional de Ciencias Biológicas, Instituto Politécnico Nacional, Prol. de Carpio y Plan de Ayala s/n, Col. Santo Tomás, 11340, Mexico City, Mexico
|
20
|
Padilla-Mejia NE, Makarov AA, Barlow LD, Butterfield ER, Field MC. Evolution and diversification of the nuclear envelope. Nucleus 2021; 12:21-41. [PMID: 33435791 PMCID: PMC7889174 DOI: 10.1080/19491034.2021.1874135] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/08/2020] [Revised: 12/08/2020] [Accepted: 12/11/2020] [Indexed: 02/06/2023] Open
Abstract
Eukaryotic cells arose ~1.5 billion years ago, with the endomembrane system a central feature, facilitating evolution of intracellular compartments. Endomembranes include the nuclear envelope (NE) dividing the cytoplasm and nucleoplasm. The NE possesses universal features: a double lipid bilayer membrane, nuclear pore complexes (NPCs), and continuity with the endoplasmic reticulum, indicating common evolutionary origin. However, levels of specialization between lineages remain unclear, despite distinct mechanisms underpinning various nuclear activities. Several distinct modes of molecular evolution facilitate organellar diversification, and to understand which apply to the NE, we exploited proteomic datasets of purified nuclear envelopes from model systems for comparative analysis. We find the widely conserved proteins to be less numerous than lineage-specific cohorts, but enriched in core nuclear functions. This, together with consideration of additional evidence, suggests that, despite a common origin, the NE has evolved as a highly diverse organelle with significant lineage-specific functionality.
Affiliation(s)
- Norma E. Padilla-Mejia
  - Division of Biological Chemistry and Drug Discovery, School of Life Sciences, University of Dundee, Dundee, UK
- Alexandr A. Makarov
  - Division of Biological Chemistry and Drug Discovery, School of Life Sciences, University of Dundee, Dundee, UK
- Lael D. Barlow
  - Division of Biological Chemistry and Drug Discovery, School of Life Sciences, University of Dundee, Dundee, UK
- Erin R. Butterfield
  - Division of Biological Chemistry and Drug Discovery, School of Life Sciences, University of Dundee, Dundee, UK
- Mark C. Field
  - Division of Biological Chemistry and Drug Discovery, School of Life Sciences, University of Dundee, Dundee, UK
  - Institute of Parasitology, Biology Centre, Czech Academy of Sciences, České, Czech Republic
|
21
|
Yang Z, Liu M, Wang B, Wang B. Classification of protein domains based on their three-dimensional shapes (CPD3DS). Synth Syst Biotechnol 2021; 6:224-230. [PMID: 34541344 PMCID: PMC8429105 DOI: 10.1016/j.synbio.2021.08.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/28/2021] [Revised: 08/23/2021] [Accepted: 08/30/2021] [Indexed: 11/13/2022] Open
Abstract
Protein design has become a powerful method to expand the number of natural proteins and design customized proteins according to demands. Domain-based protein design spares the need to create novel elements from scratch, which makes it a more efficient strategy than scratch-based protein design in designing multi-domain proteins, protein complexes and biomaterials. As the surface shape plays a central role in domain-domain and protein-protein interactions, a global map of the surface shapes of all domains should be very beneficial for domain-based protein design. Therefore, in this study, we characterized the surface shapes of protein domains, collected from CATH and SCOP databases, with their 3D-Zernike descriptors (3DZDs). Then similarities of domain shape features were identified, and all domains were classified accordingly. The preferences of the combinations of domains between different clusters were analyzed in natural proteins from the Protein Data Bank. A user-friendly website, termed CPD3DS, was also developed for storage, retrieval, analyses and visualization of our results. This work not only provides an overall view of protein domain shapes by showing their variety and similarities, but also opens up a new avenue to understand the properties of protein structural domains, and design principles of protein architectures.
Affiliation(s)
- Zhaochang Yang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, China
- Mingkang Liu
  - School of Life Science and Technology, University of Electronic Science and Technology of China, China
- Bin Wang
  - School of Information and Software Engineering, University of Electronic Science and Technology of China, China
- Beibei Wang
  - School of Life Science and Technology, University of Electronic Science and Technology of China, China; Centre for Informational Biology, University of Electronic Science and Technology of China, 2006 Xiyuan Road, Chengdu, Sichuan, 611731, China
|
22
|
Function and regulation of corin in physiology and disease. Biochem Soc Trans 2021; 48:1905-1916. [PMID: 33125488 DOI: 10.1042/bst20190760] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Received: 08/07/2020] [Revised: 09/19/2020] [Accepted: 09/22/2020] [Indexed: 02/07/2023]
Abstract
Atrial natriuretic peptide (ANP) is of major importance in the maintenance of electrolyte balance and normal blood pressure. Reduced plasma ANP levels are associated with the increased risk of cardiovascular disease. Corin is a type II transmembrane serine protease that converts the ANP precursor to mature ANP. Corin deficiency prevents ANP generation and alters electrolyte and body fluid homeostasis. Corin is synthesized as a zymogen that is proteolytically activated on the cell surface. Factors that disrupt corin folding, intracellular trafficking, cell surface expression, and zymogen activation are expected to impair corin function. To date, CORIN variants that reduce corin activity have been identified in hypertensive patients. In addition to the heart, corin expression has been detected in non-cardiac tissues, where corin and ANP participate in diverse physiological processes. In this review, we summarize the current knowledge in corin biosynthesis and post-translational modifications. We also discuss tissue-specific corin expression and function in physiology and disease.
|
23
|
Calatayud S, Garcia-Risco M, Capdevila M, Cañestro C, Palacios Ò, Albalat R. Modular Evolution and Population Variability of Oikopleura dioica Metallothioneins. Front Cell Dev Biol 2021; 9:702688. [PMID: 34277643 PMCID: PMC8283569 DOI: 10.3389/fcell.2021.702688] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/29/2021] [Accepted: 06/09/2021] [Indexed: 01/29/2023] Open
Abstract
The chordate Oikopleura dioica is probably the fastest evolving metazoan reported so far, and thereby a suitable system in which to explore the limits of evolutionary processes. For this reason, and in order to gain new insights on the evolution of protein modularity, we have investigated the organization, function and evolution of multi-modular metallothionein (MT) proteins in O. dioica. MTs are a heterogeneous group of modular proteins defined by their cysteine (C)-rich domains, which confer the capacity of coordinating different transition metal ions. O. dioica has two MTs, a bi-modular OdiMT1 consisting of two domains (t-12C and 12C), and a multi-modular OdiMT2 with six t-12C/12C repeats. By means of mass spectrometry and spectroscopy of metal-protein complexes, we have shown that the 12C domain is able to autonomously bind four divalent metal ions, although the t-12C/12C pair, as it is found in OdiMT1, is the optimized unit for divalent metal binding. We have also shown a direct relationship between the number of the t-12C/12C repeats and the metal-binding capacity of the MTs, which implies a stepwise mode of functional and structural evolution for OdiMT2. Finally, after analyzing four different O. dioica populations distributed worldwide, we have detected several OdiMT2 variants with changes in their number of t-12C/12C domain repeats. This finding reveals that the number of repeats fluctuates between current O. dioica populations, which provides a new perspective on the evolution of domain repeat proteins.
Affiliation(s)
- Sara Calatayud
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona, Spain
- Mario Garcia-Risco
  - Departament de Química, Facultat de Ciències, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- Mercè Capdevila
  - Departament de Química, Facultat de Ciències, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- Cristian Cañestro
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona, Spain
- Òscar Palacios
  - Departament de Química, Facultat de Ciències, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- Ricard Albalat
  - Departament de Genètica, Microbiologia i Estadística, Facultat de Biologia, Institut de Recerca de la Biodiversitat (IRBio), Universitat de Barcelona, Barcelona, Spain
|
24
|
Ferruz N, Noske J, Höcker B. Protlego: A Python package for the analysis and design of chimeric proteins. Bioinformatics 2021; 37:3182-3189. [PMID: 33901273 PMCID: PMC8504633 DOI: 10.1093/bioinformatics/btab253] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 10/25/2020] [Revised: 03/05/2021] [Accepted: 04/19/2021] [Indexed: 01/03/2023] Open
Abstract
Motivation: Duplication and recombination of protein fragments have led to the highly diverse protein space that we observe today. By mimicking this natural process, the design of protein chimeras via fragment recombination has proven experimentally successful and has opened a new era for the design of customizable proteins. The in silico building of structural models for these chimeric proteins, however, remains a manual task that requires a considerable degree of expertise and is not amenable for high-throughput studies. Energetic and structural analysis of the designed proteins often require the use of several tools, each with their unique technical difficulties and available in different programming languages or web servers.
Results: We implemented a Python package that enables automated, high-throughput design of chimeras and their structural analysis. First, it fetches evolutionarily conserved fragments from a built-in database (also available at fuzzle.uni-bayreuth.de). These relationships can then be represented via networks or further selected for chimera construction via recombination. Designed chimeras or natural proteins are then scored and minimized with the Charmm and Amber forcefields and their diverse structural features can be analyzed at ease. Here, we showcase Protlego’s pipeline by exploring the relationships between the P-loop and Rossmann superfolds, building and characterizing their offspring chimeras. We believe that Protlego provides a powerful new tool for the protein design community.
Availability and implementation: Protlego runs on the Linux platform and is freely available at (https://hoecker-lab.github.io/protlego/) with tutorials and documentation.
Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Noelia Ferruz
  - Department of Biochemistry, University of Bayreuth, Bayreuth, Germany
- Jakob Noske
  - Department of Biochemistry, University of Bayreuth, Bayreuth, Germany
- Birte Höcker
  - Department of Biochemistry, University of Bayreuth, Bayreuth, Germany
|
25
|
Hernández-Fernández J, Pinzón-Velasco A, López EA, Rodríguez-Becerra P, Mariño-Ramírez L. Transcriptional Analyses of Acute Exposure to Methylmercury on Erythrocytes of Loggerhead Sea Turtle. TOXICS 2021; 9:70. [PMID: 33805397 PMCID: PMC8066450 DOI: 10.3390/toxics9040070] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 01/13/2021] [Revised: 03/11/2021] [Accepted: 03/17/2021] [Indexed: 01/09/2023]
Abstract
To understand changes in enzyme activity and gene expression as biomarkers of exposure to methylmercury, we exposed loggerhead turtle erythrocytes (RBCs) to concentrations of 0, 1, and 5 mg L⁻¹ of MeHg, and a de novo transcriptome was assembled using RNA-seq. The analysis of differentially expressed genes (DEGs) indicated that 79 unique genes were dysregulated (39 upregulated and 44 downregulated genes). The results showed that MeHg altered gene expression patterns as a response to the cellular stress produced, reflected in cell cycle regulation, lysosomal activity, autophagy, calcium regulation, mitochondrial regulation, apoptosis, and regulation of transcription and translation. The analysis of DEGs showed a low response of the antioxidant machinery to MeHg, evidenced by the fact that genes of early response to oxidative stress were not dysregulated. The RBCs maintained a constitutive expression of proteins that represented a good part of the defense against reactive oxygen species (ROS) induced by MeHg.
Affiliation(s)
- Javier Hernández-Fernández
  - Department of Natural and Environmental Science, Marine Biology Program, Faculty of Science and Engineering, Genetics, Molecular Biology and Bioinformatic Research Group–GENBIMOL, Jorge Tadeo Lozano University, Cra. 4 No 22-61, Bogotá 110311, Colombia
  - Faculty of Sciences, Department of Biology, Pontificia Universidad Javeriana, Calle 45, Cra. 7, Bogotá 110231, Colombia
- Andrés Pinzón-Velasco
  - Bioinformática y Biología de Sistemas, Universidad Nacional de Colombia, Calle 45, Cra. 30, Bogotá 111321, Colombia
- Ellie Anne López
  - IDEASA Research Group-Environment and Sustainability, Institute of Environmental Studies and Services, Sergio Arboleda University, Bogotá 111711, Colombia
- Pilar Rodríguez-Becerra
  - Department of Natural and Environmental Science, Marine Biology Program, Faculty of Science and Engineering, Genetics, Molecular Biology and Bioinformatic Research Group–GENBIMOL, Jorge Tadeo Lozano University, Cra. 4 No 22-61, Bogotá 110311, Colombia
- Leonardo Mariño-Ramírez
  - NCBI, NLM, NIH Computational Biology Branch, Building 38A, Room 6S614M 8600 Rockville Pike, MSC 6075, Bethesda, MD 20894-6075, USA
|
26
|
Structure and function of naturally evolved de novo proteins. Curr Opin Struct Biol 2021; 68:175-183. [PMID: 33567396 DOI: 10.1016/j.sbi.2020.11.010] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 10/01/2020] [Revised: 11/16/2020] [Accepted: 11/27/2020] [Indexed: 01/05/2023]
Abstract
Comparative evolutionary genomics has revealed that novel protein coding genes can emerge randomly from non-coding DNA. While most of the myriad of transcripts which continuously emerge vanish rapidly, some attain regulatory regions, become translated and survive. More surprisingly, sequence properties of de novo proteins are almost indistinguishable from randomly obtained sequences, yet de novo proteins may gain functions and integrate into eukaryotic cellular networks quite easily. We here discuss current knowledge on de novo proteins, their structures, functions and evolution. Since the existence of de novo proteins seems at odds with decade-long attempts to construct proteins with novel structures and functions from scratch, we suggest that a better understanding of de novo protein evolution may fuel new strategies for protein design.
|
27
|
Defosset A, Kress A, Nevers Y, Ripp R, Thompson JD, Poch O, Lecompte O. Proteome-Scale Detection of Differential Conservation Patterns at Protein and Subprotein Levels with BLUR. Genome Biol Evol 2020; 13:5991441. [PMID: 33211099 PMCID: PMC7851591 DOI: 10.1093/gbe/evaa248] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Accepted: 11/18/2020] [Indexed: 11/23/2022] Open
Abstract
In the multiomics era, comparative genomics studies based on gene repertoire comparison are increasingly used to investigate evolutionary histories of species, to study genotype–phenotype relations, species adaptation to various environments, or to predict gene function using phylogenetic profiling. However, comparisons of orthologs have highlighted the prevalence of sequence plasticity among species, showing the benefits of combining protein and subprotein levels of analysis to allow for a more comprehensive study of genotype/phenotype correlations. In this article, we introduce a new approach called BLUR (BLAST Unexpected Ranking), capable of detecting genotype divergence or specialization between two related clades at different levels: gain/loss of proteins but also of subprotein regions. These regions can correspond to known domains, uncharacterized regions, or even small motifs. Our method was created to allow two types of research strategies: 1) the comparison of two groups of species with no previous knowledge, with the aim of predicting phenotype differences or specializations between close species or 2) the study of specific phenotypes by comparing species that present the phenotype of interest with species that do not. We designed a website to facilitate the use of BLUR with a possibility of in-depth analysis of the results with various tools, such as functional enrichments, protein–protein interaction networks, and multiple sequence alignments. We applied our method to the study of two different biological pathways and to the comparison of several groups of close species, all with very promising results. BLUR is freely available at http://lbgi.fr/blur/.
Affiliation(s)
- Audrey Defosset
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, France
- Arnaud Kress
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, France
- Yannis Nevers
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, France
  - SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
  - Department of Computational Biology, University of Lausanne, Switzerland
  - Center for Integrative Genomics, University of Lausanne, Switzerland
- Raymond Ripp
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, France
- Julie D Thompson
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, France
- Olivier Poch
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, France
- Odile Lecompte
  - Complex Systems and Translational Bioinformatics, ICube UMR 7357, Université de Strasbourg, France
|
28
|
Paladin L, Necci M, Piovesan D, Mier P, Andrade-Navarro MA, Tosatto SCE. A novel approach to investigate the evolution of structured tandem repeat protein families by exon duplication. J Struct Biol 2020; 212:107608. [PMID: 32896658 DOI: 10.1016/j.jsb.2020.107608] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 04/21/2020] [Revised: 08/19/2020] [Accepted: 08/21/2020] [Indexed: 11/30/2022]
Abstract
Tandem Repeat Proteins (TRPs) are ubiquitous in cells and are enriched in eukaryotes. They contributed to the evolution of organism complexity, specializing for functions that require quick adaptability such as immunity-related functions. To investigate the hypothesis of repeat protein evolution through exon duplication and rearrangement, we designed a tool to analyze the relationships between exon/intron patterns and structural symmetries. The tool allows comparison of the structure fragments as defined by exon/intron boundaries from Ensembl against the structural element repetitions from RepeatsDB. The all-against-all pairwise structural alignment between fragments and comparison of the two definitions (structural units and exons) are visualized in a single matrix, the "repeat/exon plot". An analysis of different repeat protein families, including the solenoids Leucine-Rich, Ankyrin, Pumilio, HEAT repeats and the β propellers Kelch-like, WD40 and RCC1, shows different behaviors, illustrated here through examples. For each example, the analysis of the exon mapping in homologous proteins supports the conservation of their exon patterns. We propose that when a clear-cut relationship between exon and structural boundaries can be identified, it is possible to infer a specific "evolutionary pattern" which may improve TRPs detection and classification.
Affiliation(s)
- Marco Necci
  - Dept. of Biomedical Sciences, University of Padova, Italy
- Pablo Mier
  - Faculty of Biology, Johannes Gutenberg University of Mainz, Germany
|
29
|
The structures of two archaeal type IV pili illuminate evolutionary relationships. Nat Commun 2020; 11:3424. [PMID: 32647180 PMCID: PMC7347861 DOI: 10.1038/s41467-020-17268-4] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 04/09/2020] [Accepted: 06/22/2020] [Indexed: 12/14/2022] Open
Abstract
We have determined the cryo-electron microscopic (cryo-EM) structures of two archaeal type IV pili (T4P), from Pyrobaculum arsenaticum and Saccharolobus solfataricus, at 3.8 Å and 3.4 Å resolution, respectively. This triples the number of high resolution archaeal T4P structures, and allows us to pinpoint the evolutionary divergence of bacterial T4P, archaeal T4P and archaeal flagellar filaments. We suggest that extensive glycosylation previously observed in T4P of Sulfolobus islandicus is a response to an acidic environment, as at even higher temperatures in a neutral environment much less glycosylation is present for Pyrobaculum than for Sulfolobus and Saccharolobus pili. Consequently, the Pyrobaculum filaments do not display the remarkable stability of the Sulfolobus filaments in vitro. We identify the Saccharolobus and Pyrobaculum T4P as host receptors recognized by rudivirus SSRV1 and tristromavirus PFV2, respectively. Our results illuminate the evolutionary relationships among bacterial and archaeal T4P filaments and provide insights into archaeal virus-host interactions. Archaeal type IV pili (T4P) mediate adhesion to surfaces and are receptors for hyperthermophilic archaeal viruses. Here, the authors present the cryo-EM structures of two archaeal T4P from Pyrobaculum arsenaticum and Saccharolobus solfataricus and discuss evolutionary relationships between bacterial T4P, archaeal T4P and archaeal flagellar filaments.
|