1
Perlinska AP, Nguyen ML, Pilla SP, Staszor E, Lewandowska I, Bernat A, Purta E, Augustyniak R, Bujnicki JM, Sulkowska JI. Are there double knots in proteins? Prediction and in vitro verification based on TrmD-Tm1570 fusion from C. nitroreducens. Front Mol Biosci 2024; 10:1223830. [PMID: 38903539 PMCID: PMC11187310 DOI: 10.3389/fmolb.2023.1223830] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/16/2023] [Accepted: 10/04/2023] [Indexed: 06/22/2024] Open
Abstract
We have been aware of the existence of knotted proteins for over 30 years, but it is hard to predict the most complicated knot that can form in proteins. Here, we show new and the most complex knotted topologies recorded to date: double trefoil knots (3₁#3₁). We found five domain arrangements (architectures) that result in a doubly knotted structure in almost a thousand proteins. The double knot topology is found in knotted membrane proteins from the CaCA family, which function as ion transporters; in the group of carbonic anhydrases that catalyze the hydration of carbon dioxide; and in proteins from the SPOUT superfamily, which gathers 3₁-knotted methyltransferases with an active site-forming knot. For each family, we predict the presence of a double knot using AlphaFold and RoseTTAFold structure prediction. In the case of the TrmD-Tm1570 protein, a member of the SPOUT superfamily, we show that it folds in vitro and is biologically active. Our results show that this protein forms a homodimeric structure and retains the ability to modify tRNA, which is the function of the single-domain TrmD protein. However, how the protein folds and is degraded remains unknown.
Affiliation(s)
- Mai Lan Nguyen
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
  - Polish-Japanese Academy of Information Technology, Warsaw, Poland
- Smita P. Pilla
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
- Emilia Staszor
  - Centre of New Technologies, University of Warsaw, Warsaw, Poland
  - Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Agata Bernat
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Elżbieta Purta
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
- Janusz M. Bujnicki
  - Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology in Warsaw, Warsaw, Poland
2
Gidhi A, Jha SK, Kumar M, Mukhopadhyay K. The F-box protein encoding genes of the leaf-rust fungi Puccinia triticina: genome-wide identification, characterization and expression dynamics during pathogenesis. Arch Microbiol 2024; 206:209. [PMID: 38587657 DOI: 10.1007/s00203-024-03936-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2023] [Revised: 01/31/2024] [Accepted: 03/19/2024] [Indexed: 04/09/2024]
Abstract
The F-box proteins in fungi perform diverse functions, including regulation of the cell cycle, circadian clock, development, signal transduction and nutrient sensing. Genome-wide analysis revealed 10 F-box genes in Puccinia triticina, the causal organism of leaf rust disease in wheat. These genes were characterized using in silico approaches to reveal phylogenetic relationships, gene structures, gene ontology, protein properties, sequence features and gene expression. Domain analysis predicted functional domains such as WD40 and LRR at the C-terminus, along with the expected F-box motif at the N-terminus. Multiple sequence alignment (MSA) showed amino acid replacements, which might be due to nucleotide substitution during replication. Phylogenetic analysis revealed that F-box proteins with similar domains clustered together, while some sequences were spread across different clades, which might be due to functional diversity. The clustering of Puccinia triticina GG705409 with Triticum aestivum TaAFB4/TaAFB5 in a single clade suggested the possibility of horizontal gene transfer during the coevolution of P. triticina and wheat. Gene ontology annotation categorized the genes into three classes and indicated that they are functionally involved in protein degradation through the protein ubiquitination pathway. The protein-protein interaction network revealed that F-box proteins interact with other components of the SCF complex involved in protein ubiquitination. Relative expression analysis of five F-box genes in a time course experiment indicated their involvement in infection of leaf rust-susceptible wheat plants. This study provides information on the structure of F-box proteins of a basidiomycete plant-pathogenic fungus and their role during pathogenesis.
Affiliation(s)
- Anupama Gidhi
  - School of Genomics and Molecular Breeding, ICAR-Indian Institute of Agricultural Biotechnology, Garhkhatanga, Ranchi, Jharkhand, 834003, India
- Shailendra Kumar Jha
  - Division of Genetics, ICAR-Indian Agricultural Research Institute, New Delhi, 110012, India
- Manish Kumar
  - Department of Bioengineering and Biotechnology, Birla Institute of Technology, Mesra, Ranchi, Jharkhand, 835215, India
- Kunal Mukhopadhyay
  - Department of Bioengineering and Biotechnology, Birla Institute of Technology, Mesra, Ranchi, Jharkhand, 835215, India
3
Gollapalli P, Rudrappa S, Kumar V, Santosh Kumar HS. Domain Architecture Based Methods for Comparative Functional Genomics Toward Therapeutic Drug Target Discovery. J Mol Evol 2023; 91:598-615. [PMID: 37626222 DOI: 10.1007/s00239-023-10129-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/13/2022] [Accepted: 08/06/2023] [Indexed: 08/27/2023]
Abstract
Novel functions arise during evolution when genes duplicate, mutate, recombine, fuse or undergo fission to produce new genes, or when genes are formed de novo. Researchers have tried to quantify the causes of these molecular diversification processes to learn how such genes increase molecular complexity over time, for instance in protein domain organization. In contrast to global sequence similarity, protein domain architectures can capture key structural and functional characteristics, making them better proxies for describing functional equivalence. In prokaryotes and eukaryotes, it has been shown that domain architectures are retained over significant evolutionary distances. Protein domain architectures are now being used to categorize and distinguish evolutionarily related proteins and to find homologs among species that are evolutionarily distant from one another. Additionally, structural information stored in domain structures has accelerated homology identification and sequence search methods. Tools for functional protein annotation have been developed to discover protein domain content, domain order, domain recurrence, and domain position, as all of these contribute to the accuracy of protein function prediction. In this review, an attempt is made to summarise facts and speculations regarding the use of protein domain architecture and modularity to identify possible therapeutic targets among cellular activities, based on an understanding of their linked biological processes.
Affiliation(s)
- Pavan Gollapalli
  - Center for Bioinformatics and Biostatistics, Nitte (Deemed to be University), Mangalore, Karnataka, 575018, India
- Sushmitha Rudrappa
  - Department of Biotechnology and Bioinformatics, Jnana Sahyadri Campus, Kuvempu University, Shankaraghatta, Shivamogga, Karnataka, 577451, India
- Vadlapudi Kumar
  - Department of Biochemistry, Davangere University, Shivagangothri, Davangere, Karnataka, 577007, India
- Hulikal Shivashankara Santosh Kumar
  - Department of Biotechnology and Bioinformatics, Jnana Sahyadri Campus, Kuvempu University, Shankaraghatta, Shivamogga, Karnataka, 577451, India
4
Mohri M, Moghadam A, Burketova L, Ryšánek P. Genome-wide identification of the opsin protein in Leptosphaeria maculans and comparison with other fungi (pathogens of Brassica napus). Front Microbiol 2023; 14:1193892. [PMID: 37692395 PMCID: PMC10485269 DOI: 10.3389/fmicb.2023.1193892] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2023] [Accepted: 06/28/2023] [Indexed: 09/12/2023] Open
Abstract
The largest family of transmembrane receptors are G-protein-coupled receptors (GPCRs). These receptors enable fungi to respond to perceived environmental signals and to infect their host plants. Family A of the GPCRs includes opsin. However, little is known about the roles of GPCRs in phytopathogenic fungi. We studied opsin in Leptosphaeria maculans, an important pathogen of oilseed rape (Brassica napus) that causes blackleg disease, and compared it with six other fungal pathogens of oilseed rape. A phylogenetic tree analysis of 31 isoforms of the opsin protein showed six major groups and six subgroups. All three opsin isoforms of L. maculans are grouped in the same clade in the phylogenetic tree. Physicochemical analysis revealed that all studied opsin proteins are stable and hydrophobic. Subcellular localization analysis revealed that most isoforms were localized in the endoplasmic reticulum membrane, except for several isoforms in Verticillium species, which were localized in the mitochondrial membrane. Most isoforms comprise two conserved domains. One conserved motif was observed across all isoforms, consisting of the BACTERIAL_OPSIN_1 domain, which has been hypothesized to have an identical sensory function. Most studied isoforms showed seven transmembrane helices, except for one isoform of V. longisporum and four isoforms of Fusarium oxysporum. Tertiary structure prediction displayed a conformational change in four isoforms of F. oxysporum, which presumably reflects differences in binding to other proteins and in sensing signals, thereby resulting in various pathogenicity strategies. Protein-protein interaction and binding site analyses demonstrated varying numbers of ligands and pockets across all isoforms, ranging from 0 to 13 ligands and from 4 to 10 pockets. Based on the phylogenetic analysis in this study and the considerable physicochemical and structural differences among opsin proteins across the studied fungi, we hypothesize that this protein acts differently in the pathogenicity, growth, sporulation, and mating of these fungi.
Affiliation(s)
- Marzieh Mohri
  - Department of Plant Protection, Faculty of Agrobiology, Food, and Natural Resources, Czech University of Life Sciences, Prague, Czechia
- Ali Moghadam
  - Institute of Biotechnology, Shiraz University, Shiraz, Iran
- Lenka Burketova
  - Institute of Experimental Botany, Czech Academy of Sciences, Prague, Czechia
- Pavel Ryšánek
  - Department of Plant Protection, Faculty of Agrobiology, Food, and Natural Resources, Czech University of Life Sciences, Prague, Czechia
5
Intrinsically Disordered Proteins: An Overview. Int J Mol Sci 2022; 23:ijms232214050. [PMID: 36430530 PMCID: PMC9693201 DOI: 10.3390/ijms232214050] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 09/04/2022] [Revised: 11/07/2022] [Accepted: 11/08/2022] [Indexed: 11/16/2022] Open
Abstract
Many proteins and protein segments cannot attain a single stable three-dimensional structure under physiological conditions; instead, they adopt multiple interconverting conformational states. Such intrinsically disordered proteins or protein segments are highly abundant across proteomes, and are involved in various effector functions. This review focuses on different aspects of disordered proteins and disordered protein regions, which form the basis of the so-called "Disorder-function paradigm" of proteins. Additionally, various experimental approaches and computational tools used for characterizing disordered regions in proteins are discussed. Finally, the role of disordered proteins in diseases and their utility as potential drug targets are explored.
6
Sharma A, Gupta S, Patil AB, Vijay N. Birth and death in terminal complement pathway. Mol Immunol 2022; 149:174-187. [PMID: 35908437 DOI: 10.1016/j.molimm.2022.07.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/19/2022] [Revised: 06/15/2022] [Accepted: 07/18/2022] [Indexed: 10/16/2022]
Abstract
The cytolytic activity of the membrane attack complex (MAC) is pivotal in the complement-mediated elimination of pathogens. Terminal complement pathway (TCP) genes encode the proteins that form the MAC. Although the TCP genes are well conserved within most vertebrate species, the early evolution of the TCP genes is poorly understood. Based on the comparative genomic analysis of the early evolutionary history of the TCP homologs, we evaluated four possible scenarios that could have given rise to the vertebrate TCP. Currently available genomic data support a scheme of complex sequential protein domain gains that may be responsible for the birth of the vertebrate C6 gene. The subsequent duplication and divergence of this vertebrate C6 gene formed the C7, C8α, C8β, and C9 genes. Compared to the widespread conservation of TCP components within vertebrates, we discovered that C9 has disintegrated in the genomes of galliform birds. Publicly available genome and transcriptome sequencing datasets of chicken from Illumina short read, PacBio long read, and Optical mapping technologies support the validity of the genome assembly at the C9 locus. In this study, we have generated a > 120X coverage whole-genome Chromium 10x linked-read sequencing dataset for the chicken and used it to verify the loss of the C9 gene in the chicken. We find multiple CR1 (chicken repeat 1) element insertions within and near the remnant exons of C9 in several galliform bird genomes. The reconstructed chronology of events shows that the CR1 insertions occurred after C9 gene loss in an early galliform ancestor. Loss of C9 in galliform birds, in contrast to conservation in other vertebrates, may have implications for host-pathogen interactions. Our study of C6 gene birth in an early vertebrate ancestor and C9 gene death in galliform birds provides insights into the evolution of the TCP.
Affiliation(s)
- Ashutosh Sharma
  - Computational Evolutionary Genomics Lab, Department of Biological Sciences, IISER Bhopal, Bhauri, Madhya Pradesh, India
- Saumya Gupta
  - Computational Evolutionary Genomics Lab, Department of Biological Sciences, IISER Bhopal, Bhauri, Madhya Pradesh, India
- Ajinkya Bharatraj Patil
  - Computational Evolutionary Genomics Lab, Department of Biological Sciences, IISER Bhopal, Bhauri, Madhya Pradesh, India
- Nagarjun Vijay
  - Computational Evolutionary Genomics Lab, Department of Biological Sciences, IISER Bhopal, Bhauri, Madhya Pradesh, India
7
Cui X, Xue Y, McCormack C, Garces A, Rachman TW, Yi Y, Stolzer M, Durand D. Simulating domain architecture evolution. Bioinformatics 2022; 38:i134-i142. [PMID: 35758772 PMCID: PMC9236583 DOI: 10.1093/bioinformatics/btac242] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022] Open
Abstract
Motivation: Simulation is an essential technique for generating biomolecular data with a ‘known’ history for use in validating phylogenetic inference and other evolutionary methods. On longer time scales, simulation supports investigations of equilibrium behavior and provides a formal framework for testing competing evolutionary hypotheses. Twenty years of molecular evolution research have produced a rich repertoire of simulation methods. However, current models do not capture the stringent constraints acting on the domain insertions, duplications, and deletions by which multidomain architectures evolve. Although these processes have the potential to generate any combination of domains, only a tiny fraction of possible domain combinations are observed in nature. Modeling these stringent constraints on domain order and co-occurrence is a fundamental challenge in domain architecture simulation that does not arise with sequence and gene family simulation.
Results: Here, we introduce a stochastic model of domain architecture evolution to simulate evolutionary trajectories that reflect the constraints on domain order and co-occurrence observed in nature. This framework is implemented in a novel domain architecture simulator, DomArchov, using the Metropolis–Hastings algorithm with data-driven transition probabilities. The use of a data-driven event module enables quick and easy redeployment of the simulator for use in different taxonomic and protein function contexts. Using empirical evaluation with metazoan datasets, we demonstrate that domain architectures simulated by DomArchov recapitulate properties of genuine domain architectures that reflect the constraints on domain order and adjacency seen in nature. This work expands the realm of evolutionary processes that are amenable to simulation.
Availability and implementation: DomArchov is written in Python 3 and is available at http://www.cs.cmu.edu/~durand/DomArchov. The data underlying this article are available via the same link.
Supplementary information: Supplementary data are available at Bioinformatics online.
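As a rough illustration of the sampling idea named in this abstract, the sketch below runs a Metropolis–Hastings chain over toy domain architectures. It is not DomArchov's model: the adjacency weights, the pseudocount, the domain vocabulary, and the function names are all invented for illustration. The proposal also swaps one domain in place (a symmetric move, so the Hastings correction cancels), rather than using the insertion, duplication, and deletion events the abstract describes.

```python
import math
import random

# Hypothetical adjacency weights: how strongly domain a is favoured directly
# before domain b. These numbers are invented, not taken from any dataset.
ADJACENCY = {
    ("SH3", "SH2"): 8.0,
    ("SH2", "Kinase"): 10.0,
    ("SH3", "Kinase"): 1.0,
}
VOCAB = ["SH3", "SH2", "Kinase"]
PSEUDOCOUNT = 0.1  # weight for adjacent pairs missing from ADJACENCY


def log_score(arch):
    """Unnormalised log target: sum of log weights over adjacent pairs."""
    return sum(
        math.log(ADJACENCY.get(pair, PSEUDOCOUNT))
        for pair in zip(arch, arch[1:])
    )


def acceptance(current, proposal):
    """Acceptance probability for a symmetric proposal (Metropolis rule)."""
    return min(1.0, math.exp(log_score(proposal) - log_score(current)))


def propose(arch, rng):
    """Symmetric move: overwrite one random position with a random domain."""
    pos = rng.randrange(len(arch))
    new = list(arch)
    new[pos] = rng.choice(VOCAB)
    return tuple(new)


def run_chain(start, steps, seed=0):
    """Run the chain for a fixed number of steps and return the final state."""
    rng = random.Random(seed)
    state = tuple(start)
    for _ in range(steps):
        candidate = propose(state, rng)
        if rng.random() < acceptance(state, candidate):
            state = candidate
    return state
```

Because the move keeps the architecture length fixed, this toy chain only explores domain order and adjacency. Variable-length moves would need an explicit proposal-probability ratio in the acceptance step.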
Affiliation(s)
- Xiaoyue Cui
  - Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Yifan Xue
  - Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Collin McCormack
  - Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Alejandro Garces
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Thomas W Rachman
  - Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Yang Yi
  - Computational Biology, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Maureen Stolzer
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Dannie Durand
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
8
Zhou Y, Zhang C, Zhang L, Ye Q, Liu N, Wang M, Long G, Fan W, Long M, Wing RA. Gene fusion as an important mechanism to generate new genes in the genus Oryza. Genome Biol 2022; 23:130. [PMID: 35706016 PMCID: PMC9199173 DOI: 10.1186/s13059-022-02696-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 09/01/2021] [Accepted: 05/30/2022] [Indexed: 11/16/2022] Open
Abstract
Background: Events of gene fusion have been reported in several organisms. However, the general role of gene fusion as part of new gene origination remains unknown.
Results: We conduct genome-wide interrogations of four Oryza genomes by designing and implementing novel pipelines to detect fusion genes. Based on the phylogeny of ten plant species, we detect 310 fusion genes across four Oryza species. The estimated rate of origination of fusion genes in the Oryza genus is as high as 63 fusion genes per species per million years, of which 16 fusion genes per species per million years become fixed; this is much higher than the rate in flies. By RNA sequencing analysis, we find that more than 44% of the fusion genes are expressed and 90% of gene pairs show strong signals of purifying selection. Further analysis of CRISPR/Cas9 knockout lines indicates that newly formed fusion genes regulate phenotypic traits including seed germination, shoot length and root length, suggesting the functional significance of these genes.
Conclusions: We detect new fusion genes that may drive phenotype evolution in Oryza. This study provides novel insights into the genome evolution of Oryza.
Supplementary Information: The online version contains supplementary material available at 10.1186/s13059-022-02696-w.
Affiliation(s)
- Yanli Zhou
  - Germplasm Bank of Wild species, Kunming Institute of Botany, Chinese Academy of Science, Kunming, Yunnan, 650201, China
- Chengjun Zhang
  - Germplasm Bank of Wild species, Kunming Institute of Botany, Chinese Academy of Science, Kunming, Yunnan, 650201, China
  - Department of Ecology and Evolution, The University of Chicago, 1101 E. 57th Street, Chicago, IL, 60637, USA
- Li Zhang
  - Department of Ecology and Evolution, The University of Chicago, 1101 E. 57th Street, Chicago, IL, 60637, USA
  - Chinese Institute for Brain Research (CIBR), Beijing, 102206, China
- Qiannan Ye
  - Germplasm Bank of Wild species, Kunming Institute of Botany, Chinese Academy of Science, Kunming, Yunnan, 650201, China
- Ningyawen Liu
  - Germplasm Bank of Wild species, Kunming Institute of Botany, Chinese Academy of Science, Kunming, Yunnan, 650201, China
- Muhua Wang
  - Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
  - State Key Laboratory for Biocontrol, School of Marine Sciences, Sun Yat-sen University, Zhuhai, 519000, China
- Guangqiang Long
  - Key Laboratory of Medicinal Plant Biology of Yunnan Province, Yunnan Agricultural University, Kunming, Yunnan, 650201, China
- Wei Fan
  - Key Laboratory of Medicinal Plant Biology of Yunnan Province, Yunnan Agricultural University, Kunming, Yunnan, 650201, China
- Manyuan Long
  - Department of Ecology and Evolution, The University of Chicago, 1101 E. 57th Street, Chicago, IL, 60637, USA
- Rod A Wing
  - Arizona Genomics Institute, School of Plant Sciences, University of Arizona, Tucson, AZ, 85721, USA
  - Center for Desert Agriculture, King Abdullah University of Science & Technology, Thuwal, 23955-6900, Kingdom of Saudi Arabia
9
Murcia-Garzón J, Méndez-Tenorio A. Promiscuous Domains in Eukaryotes and HAT Proteins in FUNGI Have Followed Different Evolutionary Paths. J Mol Evol 2022; 90:124-138. [PMID: 35084521 DOI: 10.1007/s00239-021-10046-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/23/2020] [Accepted: 12/27/2021] [Indexed: 10/19/2022]
Abstract
Diverse studies have shown that the gene content of sequenced genomes does not seem to correlate with the complexity of the organisms. However, various studies have shown that organism complexity and proteome size are, indeed, significantly correlated. This characteristic allows us to postulate that some molecular mechanisms have given certain proteins greater functional diversity, increasing their participation in the development of organisms with higher complexity. Among those mechanisms, domain promiscuity, defined as the ability of domains to organize in combination with other distinct domains, is of great importance for the evolution of organisms. Previous works have analyzed the degree of domain promiscuity of proteomes, showing how it seems to have paralleled the evolution of eukaryotic organisms. This has motivated the present study, in which we analyzed domain promiscuity in a collection of 84 eukaryotic proteomes representative of all the taxonomic groups of the tree of life. Using a grammar definition approach, we determined the architecture of 1,223,227 proteins, comprising 2,296,371 domains, which established 839,184 bigram types. The phylogenetic reconstructions based on differences in the information content of proteome promiscuity measures confirm that the evolution of domain promiscuity in eukaryotic organisms resembles the evolutionary history of the species. However, a close analysis of the PHD and RING domains, the most promiscuous domains found in fungi and functional components of chromatin remodeling enzymes and important expression regulators, suggests an evolution according to their function.
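To make the "bigram" vocabulary in this abstract concrete, here is a minimal sketch that treats each protein as an ordered tuple of domain names and counts adjacent-domain pairs. It is not the authors' grammar-based pipeline. The toy proteome, the function names, and the use of distinct neighbouring domains as a crude promiscuity proxy are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

# Invented toy proteome: each protein is an ordered tuple of domain names.
TOY_PROTEOME = [
    ("PHD", "Bromo"),
    ("PHD", "SET"),
    ("RING", "PHD", "PHD"),
    ("Kinase",),  # single-domain protein: contributes no bigrams
]


def domain_bigrams(architecture):
    """Return the ordered pairs of adjacent domains in one architecture."""
    return list(zip(architecture, architecture[1:]))


def bigram_types(proteome):
    """Count every distinct adjacent-domain pair across the proteome."""
    counts = Counter()
    for arch in proteome:
        counts.update(domain_bigrams(arch))
    return counts


def distinct_partners(proteome):
    """For each domain, count the different domains found next to it."""
    partners = defaultdict(set)
    for arch in proteome:
        for left, right in domain_bigrams(arch):
            if left != right:  # a tandem repeat is not a new partner
                partners[left].add(right)
                partners[right].add(left)
    return {domain: len(found) for domain, found in partners.items()}
```

On the toy data, `bigram_types(TOY_PROTEOME)` holds four bigram types and `distinct_partners(TOY_PROTEOME)["PHD"]` is 3, which is the sense in which PHD would be called the most promiscuous domain here.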
Affiliation(s)
- Jazmín Murcia-Garzón
  - Laboratorio de Biotecnología Vegetal, Centro de Biotecnología Genómica, Instituto Politécnico Nacional, Boulevard del Maestro S/N esq. Elías Piña, Col. Narciso Mendoza, 88710, Reynosa, Tamaulipas, Mexico
- Alfonso Méndez-Tenorio
  - Laboratorio de Biotecnología y Bioinformática Genómica, Departamento de Bioquímica, Escuela Nacional de Ciencias Biológicas, Instituto Politécnico Nacional, Prol. de Carpio y Plan de Ayala s/n, Col. Santo Tomás, 11340, Mexico City, Mexico
10
Coyote-Maestas W, Nedrud D, Suma A, He Y, Matreyek KA, Fowler DM, Carnevale V, Myers CL, Schmidt D. Probing ion channel functional architecture and domain recombination compatibility by massively parallel domain insertion profiling. Nat Commun 2021; 12:7114. [PMID: 34880224 PMCID: PMC8654947 DOI: 10.1038/s41467-021-27342-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/28/2021] [Accepted: 11/16/2021] [Indexed: 11/10/2022] Open
Abstract
Protein domains are the basic units of protein structure and function. Comparative analysis of genomes and proteomes showed that domain recombination is a main driver of multidomain protein functional diversification, and some of the constraining genomic mechanisms are known. Much less is known about the biophysical mechanisms that determine whether protein domains can be combined into viable protein folds. Here, we use massively parallel insertional mutagenesis to determine the compatibility of over 300,000 domain recombination variants of the Inward Rectifier K+ channel Kir2.1 with channel surface expression. Our data suggest that genomic and biophysical mechanisms acted in concert to favor the gain of large, structured domains at protein termini during ion channel evolution. We use machine learning to build a quantitative biophysical model of domain compatibility in Kir2.1 that allows us to derive rudimentary rules for designing domain insertion variants that fold and traffic to the cell surface. Positional Kir2.1 responses to motif insertion cluster into distinct groups that correspond to contiguous structural regions of the channel with distinct biophysical properties tuned towards providing either folding stability or gating transitions. This suggests that insertional profiling is a high-throughput method to annotate the function of ion channel structural regions.
Affiliation(s)
- Willow Coyote-Maestas
  - Department of Biochemistry, Molecular Biology & Biophysics, University of Minnesota, Minneapolis, MN 55455 USA
- David Nedrud
  - Department of Biochemistry, Molecular Biology & Biophysics, University of Minnesota, Minneapolis, MN 55455 USA
- Antonio Suma
  - Department of Chemistry, Temple University, Philadelphia, PA 19122 USA
- Yungui He
  - Department of Genetics, Cell Biology & Development, University of Minnesota, Minneapolis, MN 55455 USA
- Kenneth A. Matreyek
  - Department of Pathology, Case Western Reserve University School of Medicine, Cleveland, OH 44106 USA
- Douglas M. Fowler
  - Department of Genome Sciences, University of Washington, Seattle, WA 98115 USA
  - Department of Bioengineering, University of Washington, Seattle, WA 98115 USA
- Vincenzo Carnevale
  - Department of Chemistry, Temple University, Philadelphia, PA 19122 USA
- Chad L. Myers
  - Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN 55455 USA
- Daniel Schmidt
  - Department of Genetics, Cell Biology & Development, University of Minnesota, Minneapolis, MN, 55455, USA
11
Trasviña-Arenas CH, Demir M, Lin WJ, David SS. Structure, function and evolution of the Helix-hairpin-Helix DNA glycosylase superfamily: Piecing together the evolutionary puzzle of DNA base damage repair mechanisms. DNA Repair (Amst) 2021; 108:103231. [PMID: 34649144 DOI: 10.1016/j.dnarep.2021.103231] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 06/25/2021] [Revised: 09/20/2021] [Accepted: 09/23/2021] [Indexed: 10/20/2022]
Abstract
The Base Excision Repair (BER) pathway is a highly conserved DNA repair system targeting chemical base modifications that arise from oxidation, deamination and alkylation reactions. BER features lesion-specific DNA glycosylases (DGs) which recognize and excise modified or inappropriate DNA bases to produce apurinic/apyrimidinic (AP) sites and coordinate AP-site hand-off to subsequent BER pathway enzymes. The DG superfamilies identified have evolved independently to cope with a wide variety of nucleobase chemical modifications. Most DG superfamilies recognize a distinct set of structurally related lesions. In contrast, the Helix-hairpin-Helix (HhH) DG superfamily has the remarkable ability to act upon structurally diverse sets of base modifications. The versatility in substrate recognition of the HhH-DG superfamily has been shaped by motif and domain acquisitions during evolution. In this paper, we review the structural features and catalytic mechanisms of the HhH-DG superfamily and draw a hypothetical reconstruction of the evolutionary path where these DGs developed diverse and unique enzymatic features.
Affiliation(s)
- Merve Demir
  - Department of Chemistry, University of California, Davis, CA 95616, U.S.A.
- Wen-Jen Lin
  - Department of Chemistry, University of California, Davis, CA 95616, U.S.A.
- Sheila S David
  - Department of Chemistry, University of California, Davis, CA 95616, U.S.A.
12
Gagné M, Deshaies JE, Sidibé H, Benchaar Y, Arbour D, Dubinski A, Litt G, Peyrard S, Robitaille R, Sephton CF, Vande Velde C. hnRNP A1B, a Splice Variant of HNRNPA1, Is Spatially and Temporally Regulated. Front Neurosci 2021; 15:724307. [PMID: 34630013 PMCID: PMC8498194 DOI: 10.3389/fnins.2021.724307] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/12/2021] [Accepted: 08/30/2021] [Indexed: 11/28/2022] Open
Abstract
RNA binding proteins (RBPs) play a key role in cellular growth, homoeostasis and survival and are tightly regulated. A deep understanding of their spatiotemporal regulation is needed to understand their contribution to physiology and pathology. Here, we have characterized the spatiotemporal expression pattern of hnRNP A1 and its splice variant hnRNP A1B in mice. We have found that hnRNP A1B expression is more restricted to the CNS compared to hnRNP A1, and that it can form an SDS-resistant dimer in the CNS. Also, hnRNP A1B expression becomes progressively restricted to motor neurons in the ventral horn of the spinal cord, compared to hnRNP A1 which is more broadly expressed. We also demonstrate that hnRNP A1B is present in neuronal processes, while hnRNP A1 is absent. This finding supports a hypothesis that hnRNP A1B may have a cytosolic function in neurons that is not shared with hnRNP A1. Our results demonstrate that both isoforms are differentially expressed across tissues and have distinct localization profiles, suggesting that the two isoforms may have specific subcellular functions that can uniquely contribute to disease progression.
Affiliation(s)
- Myriam Gagné
- Department of Biochemistry, Université de Montréal, Montréal, QC, Canada; Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CRCHUM), Montréal, QC, Canada
| | - Jade-Emmanuelle Deshaies
- Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CRCHUM), Montréal, QC, Canada
| | - Hadjara Sidibé
- Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CRCHUM), Montréal, QC, Canada; Department of Neurosciences, Université de Montréal, Montréal, QC, Canada
| | - Yousri Benchaar
- Department of Psychiatry and Neuroscience, CERVO Brain Research Centre, Laval University, Quebec City, QC, Canada
| | - Danielle Arbour
- Department of Neurosciences, Université de Montréal, Montréal, QC, Canada
| | - Alicia Dubinski
- Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CRCHUM), Montréal, QC, Canada; Department of Neurosciences, Université de Montréal, Montréal, QC, Canada
| | - Gurleen Litt
- Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CRCHUM), Montréal, QC, Canada
| | - Sarah Peyrard
- Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CRCHUM), Montréal, QC, Canada
| | - Richard Robitaille
- Department of Neurosciences, Université de Montréal, Montréal, QC, Canada
| | - Chantelle F Sephton
- Department of Psychiatry and Neuroscience, CERVO Brain Research Centre, Laval University, Quebec City, QC, Canada
| | - Christine Vande Velde
- Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CRCHUM), Montréal, QC, Canada; Department of Neurosciences, Université de Montréal, Montréal, QC, Canada
| |
|
13
|
Martinez Gomez L, Pozo F, Walsh TA, Abascal F, Tress ML. The clinical importance of tandem exon duplication-derived substitutions. Nucleic Acids Res 2021; 49:8232-8246. [PMID: 34302486 PMCID: PMC8373072 DOI: 10.1093/nar/gkab623] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 06/07/2021] [Accepted: 07/21/2021] [Indexed: 01/04/2023]
Abstract
Most coding genes in the human genome are annotated with multiple alternative transcripts. However, clear evidence for the functional relevance of the protein isoforms produced by these alternative transcripts is often hard to find. Alternative isoforms generated from tandem exon duplication-derived substitutions are an exception. These splice events are rare, but have important functional consequences. Here, we have catalogued the 236 tandem exon duplication-derived substitutions annotated in the GENCODE human reference set. We find that more than 90% of the events have a last common ancestor in teleost fish, so are at least 425 million years old, and twenty-one can be traced back to the Bilateria clade. Alternative isoforms generated from tandem exon duplication-derived substitutions also have significantly more clinical impact than other alternative isoforms. Tandem exon duplication-derived substitutions have >25 times as many pathogenic and likely pathogenic mutations as other alternative events. Tandem exon duplication-derived substitutions appear to have vital functional roles in the cell and may have played a prominent part in metazoan evolution.
Affiliation(s)
- Laura Martinez Gomez
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), C. Melchor Fernandez Almagro, 3, 28029 Madrid, Spain
| | - Fernando Pozo
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), C. Melchor Fernandez Almagro, 3, 28029 Madrid, Spain
| | - Thomas A Walsh
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), C. Melchor Fernandez Almagro, 3, 28029 Madrid, Spain; Eukaryotic Annotation Team, EMBL-EBI, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
| | - Federico Abascal
- Somatic Evolution Group, Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SA, UK
| | - Michael L Tress
- Bioinformatics Unit, Spanish National Cancer Research Centre (CNIO), C. Melchor Fernandez Almagro, 3, 28029 Madrid, Spain
| |
|
14
|
Sadat MA, Ullah MW, Bashar KK, Hossen QMM, Tareq MZ, Islam MS. Genome-wide identification of F-box proteins in Macrophomina phaseolina and comparison with other fungus. J Genet Eng Biotechnol 2021; 19:46. [PMID: 33761027 PMCID: PMC7991009 DOI: 10.1186/s43141-021-00143-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 12/10/2020] [Accepted: 03/11/2021] [Indexed: 01/01/2023]
Abstract
Background: In fungi, as in other eukaryotes, protein turnover is an important cellular process that controls various cellular functions. The ubiquitin-proteasome pathway degrades selected intracellular proteins, and F-box proteins are among the important components controlling protein degradation. F-box proteins are well studied in different model plants; however, their functions in fungi are not yet clear. This study aimed to identify the genes involved in protein degradation for disease development in the fungus Macrophomina phaseolina. Results: In this research, in silico studies were done to understand the distribution of F-box proteins in pathogenic fungi, including M. phaseolina. Genome-wide analysis indicates that M. phaseolina contains thirty-one F-box proteins throughout its chromosomes. In addition, 17, 37, 16, and 21 F-box proteins were identified in Puccinia graminis, Colletotrichum graminicola, Ustilago maydis, and Phytophthora infestans, respectively. Analyses revealed that the selected fungal genomes contain several additional functional domains along with the F-box domain. Sequence alignment showed amino acid substitutions in several F-box proteins; however, gene duplication was not found among these proteins. Phylogenetic analysis revealed that F-box proteins with similar functional domains were highly diverse from each other, suggesting the possibility of various functions. Analysis also found that MPH_00568 and MPH_05531 were closely related to the rice blast fungus F-box proteins MGG_00768 and MGG_13065, respectively, and may play an important role in blast disease development. Conclusion: This genome-wide analysis of F-box proteins will be useful for characterizing candidate F-box proteins to understand the molecular mechanisms leading to disease development of M. phaseolina in host plants.
Supplementary Information: The online version contains supplementary material available at 10.1186/s43141-021-00143-0.
Affiliation(s)
- Md Abu Sadat
- Basic and Applied Research on Jute Project, Bangladesh Jute Research Institute, Manik Mia Avenue, Dhaka, 1207, Bangladesh
| | - Md Wali Ullah
- Basic and Applied Research on Jute Project, Bangladesh Jute Research Institute, Manik Mia Avenue, Dhaka, 1207, Bangladesh
| | - Kazi Khayrul Bashar
- Basic and Applied Research on Jute Project, Bangladesh Jute Research Institute, Manik Mia Avenue, Dhaka, 1207, Bangladesh
| | - Quazi Md Mosaddeque Hossen
- Basic and Applied Research on Jute Project, Bangladesh Jute Research Institute, Manik Mia Avenue, Dhaka, 1207, Bangladesh
| | - Md Zablul Tareq
- Basic and Applied Research on Jute Project, Bangladesh Jute Research Institute, Manik Mia Avenue, Dhaka, 1207, Bangladesh
| | - Md Shahidul Islam
- Basic and Applied Research on Jute Project, Bangladesh Jute Research Institute, Manik Mia Avenue, Dhaka, 1207, Bangladesh
| |
|
15
|
Maturana P, Tobar-Calfucoy E, Fuentealba M, Roversi P, Garratt R, Cabrera R. Crystal structure of the 6-phosphogluconate dehydrogenase from Gluconobacter oxydans reveals tetrameric 6PGDHs as the crucial intermediate in the evolution of structure and cofactor preference in the 6PGDH family. Wellcome Open Res 2021. [DOI: 10.12688/wellcomeopenres.16572.1] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 11/20/2022]
Abstract
Background: The enzyme 6-phosphogluconate dehydrogenase (6PGDH) is the central enzyme of the oxidative pentose phosphate pathway. Members of the 6PGDH family belong to different classes: either homodimeric enzymes assembled from long-chain subunits or homotetrameric ones assembled from short-chain subunits. Dimeric 6PGDHs bear an internal duplication absent in tetrameric 6PGDHs and distant homologues of the β-hydroxyacid dehydrogenase (βHADH) superfamily. Methods: We use X-ray crystallography to determine the structure of the apo form of the 6PGDH from Gluconobacter oxydans (Go6PGDH). We carried out a structural and phylogenetic analysis of short and long-chain 6PGDHs. We put forward an evolutionary hypothesis explaining the differences seen in oligomeric state vs. dinucleotide preference of the 6PGDH family. We determined the cofactor preference of Go6PGDH at different 6-phosphogluconate concentrations, characterizing the wild-type enzyme and three-point mutants of residues in the cofactor binding site of Go6PGDH. Results: The structural comparison suggests that the 6PG binding site initially evolved by exchanging C-terminal α-helices between subunits. An internal duplication event changed the quaternary structure of the enzyme from a tetrameric to a dimeric arrangement. The phylogenetic analysis suggests that 6PGDHs have spread from Bacteria to Archaea and Eukarya on multiple occasions by lateral gene transfer. Sequence motifs consistent with NAD+- and NADP+-specificity are found in the β2-α2 loop of dimeric and tetrameric 6PGDHs. Site-directed mutagenesis of Go6PGDH inspired by this analysis fully reverses dinucleotide preference. One of the mutants we engineered has the highest efficiency and specificity for NAD+ so far described for a 6PGDH. Conclusions: The 6PGDH family comprises dimeric and tetrameric members whose active sites are formed by a C-terminal α-helix contributed from adjacent subunits. Dimeric 6PGDHs have evolved from the duplication-fusion of the tetrameric C-terminal domain before independent transitions of cofactor specificity. Changes in the conserved β2-α2 loop are crucial to modulate the cofactor specificity in Go6PGDH.
|
16
|
Han X, Guo J, Pang E, Song H, Lin K. Ab Initio Construction and Evolutionary Analysis of Protein-Coding Gene Families with Partially Homologous Relationships: Closely Related Drosophila Genomes as a Case Study. Genome Biol Evol 2021; 12:185-202. [PMID: 32108239 PMCID: PMC7144356 DOI: 10.1093/gbe/evaa041] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 02/18/2020] [Indexed: 01/05/2023]
Abstract
How have genes evolved within a well-known genome phylogeny? Many protein-coding genes should have evolved as a whole at the gene level, and some should have evolved partly through fragments at the subgene level. To comprehensively explore such complex homologous relationships and better understand gene family evolution, here, with de novo-identified modules, the subgene units which could consecutively cover proteins within a set of closely related species, we applied a new phylogeny-based approach that considers evolutionary models with partial homology to classify all protein-coding genes in nine Drosophila genomes. Compared with two other popular methods for gene family construction, our approach improved practical gene family classifications with a more reasonable view of homology and provided a much more complete landscape of gene family evolution at the gene and subgene levels. In the case study, we found that most expanded gene families might have evolved mainly through module rearrangements rather than gene duplications and mainly generated single-module genes through partial gene duplication, suggesting that there might be pervasive subgene rearrangement in the evolution of protein-coding gene families. The use of a phylogeny-based approach with partial homology to classify and analyze protein-coding gene families may provide us with a more comprehensive landscape depicting how genes evolve within a well-known genome phylogeny.
Affiliation(s)
- Xia Han
- State Key Laboratory of Earth Surface Processes and Resource Ecology, Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, China
| | - Jindan Guo
- State Key Laboratory of Earth Surface Processes and Resource Ecology, Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, China
| | - Erli Pang
- State Key Laboratory of Earth Surface Processes and Resource Ecology, Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, China
| | - Hongtao Song
- State Key Laboratory of Earth Surface Processes and Resource Ecology, Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, China
| | - Kui Lin
- State Key Laboratory of Earth Surface Processes and Resource Ecology, Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, China
| |
|
17
|
Yadav A, Fernández-Baca D, Cannon SB. Family-Specific Gains and Losses of Protein Domains in the Legume and Grass Plant Families. Evol Bioinform Online 2020; 16:1176934320939943. [PMID: 32694909 PMCID: PMC7350399 DOI: 10.1177/1176934320939943] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2020] [Accepted: 06/15/2020] [Indexed: 11/27/2022]
Abstract
Protein domains can be regarded as sections of protein sequences capable of folding independently and performing specific functions. In addition to amino-acid level changes, protein sequences can also evolve through domain shuffling events such as domain insertion, deletion, or duplication. The evolution of protein domains can be studied by tracking domain changes in a selected set of species with known phylogenetic relationships. Here, we conduct such an analysis by defining domains as “features” or “descriptors,” and considering the species (target + outgroup) as instances or data-points in a data matrix. We then look for features (domains) that are significantly different between the target species and the outgroup species. We study the domain changes in 2 large, distinct groups of plant species: legumes (Fabaceae) and grasses (Poaceae), with respect to selected outgroup species. We evaluate 4 types of domain feature matrices: domain content, domain duplication, domain abundance, and domain versatility. The 4 types of domain feature matrices attempt to capture different aspects of domain changes through which the protein sequences may evolve: gain or loss of domains, increase or decrease in the copy number of domains along the sequences, expansion or contraction of domains, or changes in the number of adjacent domain partners. All the feature matrices were analyzed using feature selection techniques and statistical tests to select protein domains that have significantly different feature values in legumes and grasses. We report the biological functions of the top selected domains from the analysis of all the feature matrices. In addition, we perform domain-centric gene ontology (dcGO) enrichment analysis on all selected domains from all 4 feature matrices to study the gene ontology terms associated with the significantly evolving domains in legumes and grasses. Domain content analysis revealed a striking loss of protein domains from the Fanconi anemia (FA) pathway, the pathway responsible for the repair of interstrand DNA crosslinks. The abundance analysis of domains found in legumes revealed an increase in glutathione synthase enzyme, an antioxidant required for nitrogen fixation, and a decrease in xanthine oxidizing enzymes, a phenomenon confirmed by previous studies. In grasses, the abundance analysis showed increases in domains related to gene silencing, which could be due to polyploidy or to an enhanced response to viral infection. We provide a docker container that can be used to perform this analysis workflow on any user-defined sets of species, available at https://cloud.docker.com/u/akshayayadav/repository/docker/akshayayadav/protein-domain-evolution-project.
Affiliation(s)
- Akshay Yadav
- Bioinformatics and Computational Biology Graduate Program, Iowa State University, Ames, IA, USA
| | | | - Steven B Cannon
- Corn Insects and Crop Genetics Research Unit, USDA-Agricultural Research Service, Ames, IA, USA
| |
|
18
|
Xiao X, Xue GF, Stamatovic B, Qiu WR. Using Cellular Automata to Simulate Domain Evolution in Proteins. Front Genet 2020; 11:515. [PMID: 32582278 PMCID: PMC7296063 DOI: 10.3389/fgene.2020.00515] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/24/2019] [Accepted: 04/28/2020] [Indexed: 11/26/2022]
Abstract
Proteins play primary roles in important biological processes such as catalysis, physiological functions, and immune system functions. Thus, how proteins evolved has been a central question in the field of evolutionary biology. General models of protein evolution help to determine the baseline expectations for evolution of sequences, and these models have been extensively useful in sequence analysis as well as for the computer simulation of artificial sequence data sets. We have developed a new method of simulating multi-domain protein evolution, including fusions of domains, insertion, and deletion. It has been observed via the simulation test that the success rates achieved by the proposed predictor are remarkably high. For the convenience of most experimental scientists, a user-friendly web server has been established at http://jci-bioinfo.cn/domainevo, by which users can easily get their desired results without having to go through the detailed mathematics. Through the simulation results of this website, users can predict the evolution trend of the protein domain architecture.
Affiliation(s)
- Xuan Xiao
- Computer Department, Jing-De-Zhen Ceramic Institute, Jingdezhen, China
| | - Guang-Fu Xue
- Computer Department, Jing-De-Zhen Ceramic Institute, Jingdezhen, China
| | - Biljana Stamatovic
- Faculty of Information Systems and Technologies, University of Donja Gorica, Podgorica, Montenegro
| | - Wang-Ren Qiu
- Computer Department, Jing-De-Zhen Ceramic Institute, Jingdezhen, China
| |
|
19
|
Brennan CJ, Zhou B, Benbow HR, Ajaz S, Karki SJ, Hehir JG, O’Driscoll A, Feechan A, Mullins E, Doohan FM. Taxonomically Restricted Wheat Genes Interact With Small Secreted Fungal Proteins and Enhance Resistance to Septoria Tritici Blotch Disease. Front Plant Sci 2020; 11:433. [PMID: 32477375 PMCID: PMC7236048 DOI: 10.3389/fpls.2020.00433] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 12/24/2019] [Accepted: 03/24/2020] [Indexed: 05/12/2023]
Abstract
Understanding the nuances of host/pathogen interactions is paramount if we wish to effectively control cereal diseases. In the case of the wheat/Zymoseptoria tritici interaction that leads to Septoria tritici blotch (STB) disease, a 10,000-year-old conflict has led to considerable armaments being developed on both sides which are not reflected in conventional model systems. Taxonomically restricted genes (TRGs) have evolved in wheat to better allow it to cope with stress caused by fungal pathogens, and Z. tritici has evolved specialized effectors which allow it to manipulate its host. A microarray focused on the latent phase response of a resistant wheat cultivar (cv. Stigg) and susceptible wheat cultivar (cv. Gallant) to Z. tritici infection was mined for TRGs within the Poaceae. From this analysis, we identified two TRGs that were significantly upregulated in response to Z. tritici infection, Septoria-responsive TRG6 and 7 (TaSRTRG6 and TaSRTRG7). Virus induced silencing of these genes resulted in an increased susceptibility to STB disease in cvs. Gallant and Stigg, and significantly so in the latter (2.5-fold increase in STB disease). In silico and localization studies categorized TaSRTRG6 as a secreted protein and TaSRTRG7 as an intracellular protein. Yeast two-hybrid analysis and biofluorescent complementation studies demonstrated that both TaSRTRG6 and TaSRTRG7 can interact with small proteins secreted by Z. tritici (potential effector candidates). Thus we conclude that TRGs are an important part of the wheat-Z. tritici co-evolution story and potential candidates for modulating STB resistance.
Affiliation(s)
- Ciarán J. Brennan
- UCD School of Biology and Environmental Science and UCD Earth Institute, UCD O’Brien Centre for Science (East), University College Dublin, Belfield, Ireland
| | - Binbin Zhou
- UCD School of Biology and Environmental Science and UCD Earth Institute, UCD O’Brien Centre for Science (East), University College Dublin, Belfield, Ireland
| | - Harriet R. Benbow
- UCD School of Biology and Environmental Science and UCD Earth Institute, UCD O’Brien Centre for Science (East), University College Dublin, Belfield, Ireland
| | - Sobia Ajaz
- UCD School of Biology and Environmental Science and UCD Earth Institute, UCD O’Brien Centre for Science (East), University College Dublin, Belfield, Ireland
| | - Sujit J. Karki
- School of Agriculture and Food Science, University College Dublin, Belfield, Ireland
| | | | | | - Angela Feechan
- School of Agriculture and Food Science, University College Dublin, Belfield, Ireland
| | - Ewen Mullins
- Department of Crop Science, Teagasc, Carlow, Ireland
| | - Fiona M. Doohan
- UCD School of Biology and Environmental Science and UCD Earth Institute, UCD O’Brien Centre for Science (East), University College Dublin, Belfield, Ireland
- *Correspondence: Fiona M. Doohan,
| |
|
20
|
Abascal F, Juan D, Jungreis I, Kellis M, Martinez L, Rigau M, Rodriguez JM, Vazquez J, Tress ML. Loose ends: almost one in five human genes still have unresolved coding status. Nucleic Acids Res 2019; 46:7070-7084. [PMID: 29982784 PMCID: PMC6101605 DOI: 10.1093/nar/gky587] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Received: 05/15/2018] [Accepted: 06/18/2018] [Indexed: 12/16/2022]
Abstract
Seventeen years after the sequencing of the human genome, the human proteome is still under revision. One in eight of the 22 210 coding genes listed by the Ensembl/GENCODE, RefSeq and UniProtKB reference databases are annotated differently across the three sets. We have carried out an in-depth investigation on the 2764 genes classified as coding by one or more sets of manual curators and not coding by others. Data from large-scale genetic variation analyses suggest that most are not under protein-like purifying selection and so are unlikely to code for functional proteins. A further 1470 genes annotated as coding in all three reference sets have characteristics that are typical of non-coding genes or pseudogenes. These potential non-coding genes also appear to be undergoing neutral evolution and have considerably less supporting transcript and protein evidence than other coding genes. We believe that the three reference databases currently overestimate the number of human coding genes by at least 2000, complicating and adding noise to large-scale biomedical experiments. Determining which potential non-coding genes do not code for proteins is a difficult but vitally important task since the human reference proteome is a fundamental pillar of most basic research and supports almost all large-scale biomedical projects.
Affiliation(s)
- Federico Abascal
- Wellcome Trust Sanger Institute, Hinxton CB10 1SA, Cambridgeshire, UK
| | - David Juan
- Comparative Genomics Lab, Instituto de Biologica Evolutiva, Universitat Pompeu Fabra, Barcelona, Spain
| | - Irwin Jungreis
- MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA and Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | | | - Laura Martinez
- Bioinformatics Unit, Spanish National Cancer Research Centre, Madrid, Spain
| | - Maria Rigau
- Computational Biology Life Sciences Group, Barcelona Supercomputing Center, Barcelona, Spain
| | - Jose Manuel Rodriguez
- Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares, Madrid, Spain
| | - Jesus Vazquez
- Cardiovascular Proteomics Laboratory, Centro Nacional de Investigaciones Cardiovasculares, Madrid, Spain
| | - Michael L Tress
- Bioinformatics Unit, Spanish National Cancer Research Centre, Madrid, Spain
| |
|
21
|
Raimundo J, Sobral R, Laranjeira S, Costa MMR. Successive Domain Rearrangements Underlie the Evolution of a Regulatory Module Controlled by a Small Interfering Peptide. Mol Biol Evol 2019; 35:2873-2885. [PMID: 30203071 PMCID: PMC6278869 DOI: 10.1093/molbev/msy178] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/13/2022]
Abstract
The establishment of new interactions between transcriptional regulators increases the regulatory diversity that drives phenotypic novelty. To understand how such interactions evolve, we have studied a regulatory module (DDR) composed of three MYB-like proteins: DIVARICATA (DIV), RADIALIS (RAD), and DIV-and-RAD-Interacting Factor (DRIF). The DIV and DRIF proteins form a transcriptional complex that is disrupted in the presence of RAD, a small interfering peptide, due to the formation of RAD–DRIF dimers. This dynamic interaction results in a molecular switch mechanism responsible for the control of distinct developmental processes in plants. Here, we have determined how the DDR regulatory module was established by analyzing the origin and evolution of the DIV, DRIF, and RAD protein families and the evolutionary history of their interactions. We show that duplications of a pre-existing MYB domain originated the DIV and DRIF protein families in the ancestral lineage of green algae, and, later, the RAD family in seed plants. Intraspecies interactions between the MYB domains of DIV and DRIF proteins are detected in green algae, whereas the earliest evidence of an interaction between DRIF and RAD proteins occurs in the gymnosperms, coincident with the establishment of the RAD family. Therefore, the DDR module evolved in a stepwise progression with the DIV–DRIF transcription complex evolving prior to the antagonistic RAD–DRIF interaction that established the molecular switch mechanism. Our results suggest that the successive rearrangement and divergence of a single protein domain can be an effective evolutionary mechanism driving new protein interactions and the establishment of novel regulatory modules.
Affiliation(s)
- João Raimundo
- Biosystems and Integrative Sciences Institute (BioISI), Plant Functional Biology Center, University of Minho, Braga, Portugal; Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ
| | - Rómulo Sobral
- Biosystems and Integrative Sciences Institute (BioISI), Plant Functional Biology Center, University of Minho, Braga, Portugal
| | - Sara Laranjeira
- Biosystems and Integrative Sciences Institute (BioISI), Plant Functional Biology Center, University of Minho, Braga, Portugal
| | - Maria Manuela R Costa
- Biosystems and Integrative Sciences Institute (BioISI), Plant Functional Biology Center, University of Minho, Braga, Portugal
| |
|
22
|
Gerhardt MJ, Marsh JA, Morrison M, Kazlauskas A, Khadka A, Rosenkranz S, DeAngelis MM, Saint-Geniez M, Jacobo SMP. ER stress-induced aggresome trafficking of HtrA1 protects against proteotoxicity. J Mol Cell Biol 2019; 9:516-532. [PMID: 28992183 PMCID: PMC5823240 DOI: 10.1093/jmcb/mjx024] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 11/28/2016] [Accepted: 07/08/2017] [Indexed: 01/13/2023]
Abstract
High temperature requirement A1 (HtrA1) belongs to an ancient protein family that is linked to various human disorders. The precise role of exon 1-encoded N-terminal domains and how these influence the biological functions of human HtrA1 remain elusive. In this study, we traced the evolutionary origins of these N-terminal domains to a single gene fusion event in the most recent common ancestor of vertebrates. We hypothesized that human HtrA1 is implicated in the unfolded protein response. In highly secretory cells of the retinal pigmented epithelia, endoplasmic reticulum (ER) stress upregulated HtrA1. HtrA1 co-localized with vimentin intermediate filaments in a highly arborized fashion. Upon ER stress, HtrA1 tracked along intermediate filaments, which collapsed and bundled in an aggresome at the microtubule organizing center. Gene silencing of HtrA1 altered the schedule and amplitude of adaptive signaling and concomitantly resulted in apoptosis. Restoration of wild-type HtrA1, but not its protease inactive mutant, was necessary and sufficient to protect from apoptosis. A variant of HtrA1 that harbored exon 1 substitutions displayed reduced efficacy in rescuing cells from proteotoxicity. Our results illuminate the integration of HtrA1 in the toolkit of mammalian cells against protein misfolding and the implications of defects in HtrA1 in proteostasis.
Affiliation(s)
- Maximilian J Gerhardt
- Department of Ophthalmology, Harvard Medical School, The Schepens Eye Research Institute and Massachusetts Eye and Ear Infirmary, Boston, MA 02114, USA; Department III of Internal Medicine, Cologne University Heart Center, Center for Molecular Medicine, University of Cologne, 50931 Cologne, Germany
| | - Joseph A Marsh
- MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh EH4 2XU, UK
| | - Margaux Morrison
- Department of Ophthalmology and Visual Sciences, University of Utah and John A. Moran Eye Center, Salt Lake City, UT 84132, USA
| | - Andrius Kazlauskas
- Department of Ophthalmology, Harvard Medical School, The Schepens Eye Research Institute and Massachusetts Eye and Ear Infirmary, Boston, MA 02114, USA
| | - Arogya Khadka
- Department of Ophthalmology, Harvard Medical School, The Schepens Eye Research Institute and Massachusetts Eye and Ear Infirmary, Boston, MA 02114, USA
| | - Stephan Rosenkranz
- Department III of Internal Medicine, Cologne University Heart Center, Center for Molecular Medicine, University of Cologne, 50931 Cologne, Germany
| | - Margaret M DeAngelis
- Department of Ophthalmology and Visual Sciences, University of Utah and John A. Moran Eye Center, Salt Lake City, UT 84132, USA
| | - Magali Saint-Geniez
- Department of Ophthalmology, Harvard Medical School, The Schepens Eye Research Institute and Massachusetts Eye and Ear Infirmary, Boston, MA 02114, USA
| | - Sarah Melissa P Jacobo
- Department of Ophthalmology, Harvard Medical School, The Schepens Eye Research Institute and Massachusetts Eye and Ear Infirmary, Boston, MA 02114, USA
| |
|
23
|
Li Y, Zhang Y, Li X, Yi S, Xu J. Gain-of-Function Mutations: An Emerging Advantage for Cancer Biology. Trends Biochem Sci 2019; 44:659-674. [PMID: 31047772 DOI: 10.1016/j.tibs.2019.03.009] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 11/28/2018] [Revised: 03/21/2019] [Accepted: 03/26/2019] [Indexed: 02/08/2023]
Abstract
Advances in next-generation sequencing have identified thousands of genomic variants that perturb the normal functions of proteins, further contributing to diverse phenotypic consequences in cancer. Elucidating the functional pathways altered by loss-of-function (LOF) or gain-of-function (GOF) mutations will be crucial for prioritizing cancer-causing variants and their resultant therapeutic liabilities. In this review, we highlight the fundamental function of GOF mutations and discuss the potential mechanistic effects in the context of signaling networks. We also summarize advances in experimental and computational resources, which will dramatically help with studies on the functional and phenotypic consequences of mutations. Together, systematic investigations of the function of GOF mutations will provide an important missing piece for cancer biology and precision therapy.
Affiliation(s)
- Yongsheng Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China; Department of Oncology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA
| | - Yunpeng Zhang
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
| | - Xia Li
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China; College of Bioinformatics, Hainan Medical University, Haikou 570100, China.
| | - Song Yi
- Department of Oncology, Dell Medical School, The University of Texas at Austin, Austin, TX 78712, USA; Department of Biomedical Engineering, Cockrell School of Engineering, The University of Texas at Austin, Austin, TX 78712, USA.
| | - Juan Xu
- College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China.
| |
|
24
|
Ratcliffe LE, Asiedu EK, Pickett CJ, Warburton MA, Izzi SA, Meedel TH. The Ciona myogenic regulatory factor functions as a typical MRF but possesses a novel N-terminus that is essential for activity. Dev Biol 2019; 448:210-225. [PMID: 30365920 PMCID: PMC6478573 DOI: 10.1016/j.ydbio.2018.10.010] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/11/2018] [Revised: 08/28/2018] [Accepted: 10/16/2018] [Indexed: 11/26/2022]
Abstract
Electroporation-based assays were used to test whether the myogenic regulatory factor (MRF) of Ciona intestinalis (CiMRF) interferes with endogenous developmental programs, and to evaluate the importance of its unusual N-terminus for muscle development. We found that CiMRF suppresses both notochord and endoderm development when it is expressed in these tissues by a mechanism that may involve activation of muscle-specific microRNAs. Because these results add to a large body of evidence demonstrating the exceptionally high degree of functional conservation among MRFs, we were surprised to discover that non-ascidian MRFs were not myogenic in Ciona unless they formed part of a chimeric protein containing the CiMRF N-terminus. Equally surprising, we found that despite their widely differing primary sequences, the N-termini of MRFs of other ascidian species could form chimeric MRFs that were also myogenic in Ciona. This domain did not rescue the activity of a Brachyury protein whose transcriptional activation domain had been deleted, and so does not appear to constitute such a domain. Our results indicate that ascidians have previously unrecognized and potentially novel requirements for MRF-directed myogenesis. Moreover, they provide the first example of a domain that is essential to the core function of an important family of gene regulatory proteins, one that, to date, has been found in only a single branch of the family.
Affiliation(s)
- Lindsay E Ratcliffe
- Department of Biology, Rhode Island College, 600 Mt. Pleasant Ave., Providence, RI 02908, USA.
| | - Emmanuel K Asiedu
- Department of Biology, Rhode Island College, 600 Mt. Pleasant Ave., Providence, RI 02908, USA.
| | - C J Pickett
- Department of Biology, Rhode Island College, 600 Mt. Pleasant Ave., Providence, RI 02908, USA.
| | - Megan A Warburton
- Department of Biology, Rhode Island College, 600 Mt. Pleasant Ave., Providence, RI 02908, USA.
| | - Stephanie A Izzi
- Department of Biology, Rhode Island College, 600 Mt. Pleasant Ave., Providence, RI 02908, USA.
| | - Thomas H Meedel
- Department of Biology, Rhode Island College, 600 Mt. Pleasant Ave., Providence, RI 02908, USA.
| |
|
25
|
Abstract
This chapter reviews current research on how protein domain architectures evolve. We begin by summarizing work on the phylogenetic distribution of proteins, as this will directly impact which domain architectures can be formed in different species. Studies relating domain family size to occurrence have shown that they generally follow power law distributions, both within genomes and larger evolutionary groups. These findings were subsequently extended to multi-domain architectures. Genome evolution models that have been suggested to explain the shape of these distributions are reviewed, as well as evidence for selective pressure to expand certain domain families more than others. Each domain has an intrinsic combinatorial propensity, and the effects of this have been studied using measures of domain versatility or promiscuity. Next, we study the principles of protein domain architecture evolution and how these have been inferred from distributions of extant domain arrangements. Following this, we review inferences of ancestral domain architecture and the conclusions concerning domain architecture evolution mechanisms that can be drawn from these. Finally, we examine whether all known cases of a given domain architecture can be assumed to have a single common origin (monophyly) or have evolved convergently (polyphyly). We end with a discussion of some available tools for computational analysis or exploitation of protein domain architectures and their evolution.
|
26
|
Gómez-Fernández P, Urtasun A, Paton AW, Paton JC, Borrego F, Dersh D, Argon Y, Alloza I, Vandenbroeck K. Long Interleukin-22 Binding Protein Isoform-1 Is an Intracellular Activator of the Unfolded Protein Response. Front Immunol 2018; 9:2934. [PMID: 30619294 PMCID: PMC6302113 DOI: 10.3389/fimmu.2018.02934] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 09/21/2018] [Accepted: 11/29/2018] [Indexed: 12/26/2022] Open
Abstract
The human IL22RA2 gene co-produces three protein isoforms in dendritic cells [IL-22 binding protein isoform-1 (IL-22BPi1), IL-22BPi2, and IL-22BPi3]. Two of these, IL-22BPi2 and IL-22BPi3, are capable of neutralizing the biological activity of IL-22. The function of IL-22BPi1, which differs from IL-22BPi2 through an in-frame 32-amino acid insertion provided by an alternatively spliced exon, remains unknown. Using transfected human cell lines, we demonstrate that IL-22BPi1 is secreted detectably, but at much lower levels than IL-22BPi2, and unlike IL-22BPi2 and IL-22BPi3, is largely retained in the endoplasmic reticulum (ER). As opposed to IL-22BPi2 and IL-22BPi3, IL-22BPi1 is incapable of neutralizing or binding to IL-22, as measured in a bioassay or an assembly-induced IL-22 co-folding assay. We performed interactome analysis to disclose the mechanism underlying the poor secretion of IL-22BPi1 and identified GRP78, GRP94, GRP170, and calnexin as main interactors. Structure-function analysis revealed that, like IL-22BPi2, IL-22BPi1 binds to the substrate-binding domain of GRP78 as well as to the middle domain of GRP94. Ectopic expression of wild-type GRP78 enhanced, and ATPase-defective GRP94 mutant decreased, secretion of both IL-22BPi1 and IL-22BPi2, while neither affected IL-22BPi3 secretion. Thus, IL-22BPi1 and IL-22BPi2 are bona fide clients of the ER chaperones GRP78 and GRP94. However, only IL-22BPi1 activates an unfolded protein response (UPR) resulting in increased protein levels of GRP78 and GRP94. Cloning of the IL22RA2 alternatively spliced exon into an unrelated cytokine, IL-2, bestowed similar characteristics on the resulting protein. We also found that CD14++/CD16+ intermediate monocytes produced a higher level of IL22RA2 mRNA than classical and non-classical monocytes, but this difference disappeared in immature dendritic cells (moDC) derived thereof. Upon silencing of IL22RA2 expression in moDC, GRP78 levels were significantly reduced, suggesting that native IL22RA2 expression naturally contributes to upregulating GRP78 levels in these cells. The IL22RA2 alternatively spliced exon was reported to be recruited through a single mutation in the proto-splice site of a Long Terminal Repeat retrotransposon sequence in the ape lineage. Our work suggests that positive selection of IL-22BPi1 was not driven by IL-22 antagonism as in the case of IL-22BPi2 and IL-22BPi3, but by capacity for induction of a UPR.
Affiliation(s)
- Paloma Gómez-Fernández
- Neurogenomiks Group, Department of Neuroscience, University of the Basque Country (UPV/EHU), Leioa, Spain
- Achucarro Basque Center for Neuroscience, Leioa, Spain
| | - Andoni Urtasun
- Neurogenomiks Group, Department of Neuroscience, University of the Basque Country (UPV/EHU), Leioa, Spain
- Achucarro Basque Center for Neuroscience, Leioa, Spain
| | - Adrienne W. Paton
- Research for Infectious Diseases, Department of Molecular and Biomedical Science, University of Adelaide, Adelaide, SA, Australia
| | - James C. Paton
- Research for Infectious Diseases, Department of Molecular and Biomedical Science, University of Adelaide, Adelaide, SA, Australia
| | - Francisco Borrego
- Biocruces Bizkaia Health Research Institute, Barakaldo, Spain
- Basque Center for Transfusion and Human Tissues, Galdakao, Spain
- IKERBASQUE, Basque Foundation for Science, Bilbao, Spain
| | - Devin Dersh
- Division of Cell Pathology, Children's Hospital of Philadelphia and Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Yair Argon
- Division of Cell Pathology, Children's Hospital of Philadelphia and Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, United States
| | - Iraide Alloza
- Neurogenomiks Group, Department of Neuroscience, University of the Basque Country (UPV/EHU), Leioa, Spain
- Achucarro Basque Center for Neuroscience, Leioa, Spain
| | - Koen Vandenbroeck
- Neurogenomiks Group, Department of Neuroscience, University of the Basque Country (UPV/EHU), Leioa, Spain
- Achucarro Basque Center for Neuroscience, Leioa, Spain
- IKERBASQUE, Basque Foundation for Science, Bilbao, Spain
| |
|
27
|
Willis S, Masel J. Gene Birth Contributes to Structural Disorder Encoded by Overlapping Genes. Genetics 2018; 210:303-313. [PMID: 30026186 PMCID: PMC6116962 DOI: 10.1534/genetics.118.301249] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 06/11/2018] [Accepted: 07/18/2018] [Indexed: 11/18/2022] Open
Abstract
The same nucleotide sequence can encode two protein products in different reading frames. Overlapping gene regions encode higher levels of intrinsic structural disorder (ISD) than nonoverlapping genes (39% vs. 25% in our viral dataset). This might be because of the intrinsic properties of the genetic code, because one member per pair was recently born de novo in a process that favors high ISD, or because high ISD relieves increased evolutionary constraint imposed by dual-coding. Here, we quantify the relative contributions of these three alternative hypotheses. We estimate that the recency of de novo gene birth explains [Formula: see text] or more of the elevation in ISD in overlapping regions of viral genes. While the two reading frames within a same-strand overlapping gene pair have markedly different ISD tendencies that must be controlled for, their effects cancel out to make no net contribution to ISD. The remaining elevation of ISD in the older members of overlapping gene pairs, presumed due to the need to alleviate evolutionary constraint, was already present prior to the origin of the overlap. Same-strand overlapping gene birth events can occur in two different frames, favoring high ISD either in the ancestral gene or in the novel gene; surprisingly, most de novo gene birth events contained completely within the body of an ancestral gene favor high ISD in the ancestral gene (23 phylogenetically independent events vs. 1). This can be explained by mutation bias favoring the frame with more start codons and fewer stop codons.
Affiliation(s)
- Sara Willis
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona 85721
| | - Joanna Masel
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, Arizona 85721
| |
|
28
|
Klasberg S, Bitard-Feildel T, Callebaut I, Bornberg-Bauer E. Origins and structural properties of novel and de novo protein domains during insect evolution. FEBS J 2018; 285:2605-2625. [PMID: 29802682 DOI: 10.1111/febs.14504] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 06/25/2017] [Revised: 04/12/2018] [Accepted: 05/11/2018] [Indexed: 12/11/2022]
Abstract
Over long time scales, protein evolution is characterized by modular rearrangements of protein domains. Such rearrangements are mainly caused by gene duplication, fusion and terminal losses. To better understand domain emergence mechanisms we investigated 32 insect genomes covering a speciation gradient ranging from ~2 to ~390 mya. We use established domain models and foldable domains delineated by hydrophobic cluster analysis (HCA), which does not require homologous sequences, to also identify domains which have likely arisen de novo, that is, from previously noncoding DNA. Our results indicate that most novel domains emerge terminally as they originate from ORF extensions while fewer arise in middle arrangements, resulting from exonization of intronic or intergenic regions. Many novel domains rapidly migrate between terminal or middle positions and single- and multidomain arrangements. Young domains, such as most HCA-defined domains, are under strong selection pressure as they show signals of purifying selection. De novo domains, linked to ancient domains or defined by HCA, have higher degrees of intrinsic disorder and disorder-to-order transition upon binding than ancient domains. However, the corresponding DNA sequences of the novel domains of de novo origins could only rarely be found in sister genomes. We conclude that novel domains are often recruited by other proteins and undergo important structural modifications shortly after their emergence, but evolve too fast to be characterized by cross-species comparisons alone.
Affiliation(s)
- Steffen Klasberg
- Institute for Evolution and Biodiversity, Westfalian Wilhelms University Muenster, Germany
| | - Tristan Bitard-Feildel
- Sorbonne Université, CNRS, IBPS, Laboratoire de Biologie Computationnelle et Quantitative (LCQB), Paris, France
| | - Isabelle Callebaut
- Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, IRD, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, Paris, France
| | - Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, Westfalian Wilhelms University Muenster, Germany
| |
|
29
|
Khor JM, Ettensohn CA. Functional divergence of paralogous transcription factors supported the evolution of biomineralization in echinoderms. eLife 2017; 6:e32728. [PMID: 29154754 PMCID: PMC5758115 DOI: 10.7554/elife.32728] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 10/12/2017] [Accepted: 11/16/2017] [Indexed: 12/12/2022] Open
Abstract
Alx1 is a pivotal transcription factor in a gene regulatory network that controls skeletogenesis throughout the echinoderm phylum. We performed a structure-function analysis of sea urchin Alx1 using a rescue assay and identified a novel, conserved motif (Domain 2) essential for skeletogenic function. The paralogue of Alx1, Alx4, was not functionally interchangeable with Alx1, but insertion of Domain 2 conferred robust skeletogenic function on Alx4. We used cross-species expression experiments to show that Alx1 proteins from distantly related echinoderms are not interchangeable, although the sequence and function of Domain 2 are highly conserved. We also found that Domain 2 is subject to alternative splicing and provide evidence that this domain was originally gained through exonization. Our findings show that a gene duplication event permitted the functional specialization of a transcription factor through changes in exon-intron organization and thereby supported the evolution of a major morphological novelty.
Affiliation(s)
- Jian Ming Khor
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, United States
| | - Charles A Ettensohn
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, United States
| |
|
30
|
Jia B, Tang K, Chun BH, Jeon CO. Large-scale examination of functional and sequence diversity of 2-oxoglutarate/Fe(II)-dependent oxygenases in Metazoa. Biochim Biophys Acta Gen Subj 2017; 1861:2922-2933. [DOI: 10.1016/j.bbagen.2017.08.019] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 05/04/2017] [Revised: 08/22/2017] [Accepted: 08/23/2017] [Indexed: 12/25/2022]
|
31
|
Ahrens JB, Nunez-Castilla J, Siltberg-Liberles J. Evolution of intrinsic disorder in eukaryotic proteins. Cell Mol Life Sci 2017; 74:3163-3174. [PMID: 28597295 PMCID: PMC11107722 DOI: 10.1007/s00018-017-2559-0] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 05/17/2017] [Accepted: 06/01/2017] [Indexed: 12/23/2022]
Abstract
Conformational flexibility conferred through regions of intrinsic structural disorder allows proteins to behave as dynamic molecules. While it is well-known that intrinsically disordered regions can undergo disorder-to-order transitions in real-time as part of their function, we also are beginning to learn more about the dynamics of disorder-to-order transitions along evolutionary time-scales. Intrinsically disordered regions endow proteins with functional promiscuity, which is further enhanced by the ability of some of these regions to undergo real-time disorder-to-order transitions. Disorder content affects gene retention after whole genome duplication, but it is not necessarily conserved. Altered patterns of disorder resulting from evolutionary disorder-to-order transitions indicate that disorder evolves to modify function through refining stability, regulation, and interactions. Here, we review the evolution of intrinsically disordered regions in eukaryotic proteins. We discuss the interplay between secondary structure and disorder on evolutionary time-scales, the importance of disorder for eukaryotic proteome expansion and functional divergence, and the evolutionary dynamics of disorder.
Affiliation(s)
- Joseph B Ahrens
- Department of Biological Sciences, Biomolecular Sciences Institute, Florida International University, 11200 SW 8th St, Miami, FL, 33199, USA
| | - Janelle Nunez-Castilla
- Department of Biological Sciences, Biomolecular Sciences Institute, Florida International University, 11200 SW 8th St, Miami, FL, 33199, USA
| | - Jessica Siltberg-Liberles
- Department of Biological Sciences, Biomolecular Sciences Institute, Florida International University, 11200 SW 8th St, Miami, FL, 33199, USA.
| |
|
32
|
Villanueva-Cañas JL, Ruiz-Orera J, Agea MI, Gallo M, Andreu D, Albà MM. New Genes and Functional Innovation in Mammals. Genome Biol Evol 2017; 9:1886-1900. [PMID: 28854603 PMCID: PMC5554394 DOI: 10.1093/gbe/evx136] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Accepted: 07/20/2017] [Indexed: 12/22/2022] Open
Abstract
The birth of genes that encode new protein sequences is a major source of evolutionary innovation. However, we still understand relatively little about how these genes come into being and which functions they are selected for. To address these questions, we have obtained a large collection of mammalian-specific gene families that lack homologues in other eukaryotic groups. We have combined gene annotations and de novo transcript assemblies from 30 different mammalian species, obtaining ∼6,000 gene families. In general, the proteins in mammalian-specific gene families tend to be short and depleted in aromatic and negatively charged residues. Proteins which arose early in mammalian evolution include milk and skin polypeptides, immune response components, and proteins involved in reproduction. In contrast, the functions of proteins which have a more recent origin remain largely unknown, despite the fact that these proteins also have extensive proteomics support. We identify several previously described cases of genes originated de novo from noncoding genomic regions, supporting the idea that this mechanism frequently underlies the evolution of new protein-coding genes in mammals. Finally, we show that most young mammalian genes are preferentially expressed in testis, suggesting that sexual selection plays an important role in the emergence of new functional genes.
Affiliation(s)
- José Luis Villanueva-Cañas
- Evolutionary Genomics Group, Research Programme in Biomedical Informatics, Hospital del Mar Research Institute (IMIM), Barcelona, Spain
- Present address: Institute of Evolutionary Biology (CSIC-Universitat Pompeu Fabra), Barcelona, Spain
| | - Jorge Ruiz-Orera
- Evolutionary Genomics Group, Research Programme in Biomedical Informatics, Hospital del Mar Research Institute (IMIM), Barcelona, Spain
| | - M. Isabel Agea
- Evolutionary Genomics Group, Research Programme in Biomedical Informatics, Hospital del Mar Research Institute (IMIM), Barcelona, Spain
| | - Maria Gallo
- Department of Experimental and Health Sciences, Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - David Andreu
- Department of Experimental and Health Sciences, Universitat Pompeu Fabra (UPF), Barcelona, Spain
| | - M. Mar Albà
- Evolutionary Genomics Group, Research Programme in Biomedical Informatics, Hospital del Mar Research Institute (IMIM), Barcelona, Spain
- Department of Experimental and Health Sciences, Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Catalan Institution for Research and Advanced Studies (ICREA), Barcelona, Spain
| |
|
33
|
Johnson KL, Cassin AM, Lonsdale A, Bacic A, Doblin MS, Schultz CJ. Pipeline to Identify Hydroxyproline-Rich Glycoproteins. Plant Physiol 2017; 174:886-903. [PMID: 28446635 PMCID: PMC5462032 DOI: 10.1104/pp.17.00294] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.1] [Received: 03/06/2017] [Accepted: 04/21/2017] [Indexed: 05/14/2023]
Abstract
Intrinsically disordered proteins (IDPs) are functional proteins that lack a well-defined three-dimensional structure. The study of IDPs is a rapidly growing area as the crucial biological functions of more of these proteins are uncovered. In plants, IDPs are implicated in plant stress responses, signaling, and regulatory processes. A superfamily of cell wall proteins, the hydroxyproline-rich glycoproteins (HRGPs), have characteristic features of IDPs. Their protein backbones are rich in the disordering amino acid proline, they contain repeated sequence motifs and extensive posttranslational modifications (glycosylation), and they have been implicated in many biological functions. HRGPs are evolutionarily ancient, having been isolated from the protein-rich walls of chlorophyte algae to the cellulose-rich walls of embryophytes. Examination of HRGPs in a range of plant species should provide valuable insights into how they have evolved. Commonly divided into the arabinogalactan proteins, extensins, and proline-rich proteins, in reality, a continuum of structures exists within this diverse and heterogenous superfamily. An inability to accurately classify HRGPs leads to inconsistent gene ontologies limiting the identification of HRGP classes in existing and emerging omics data sets. We present a novel and robust motif and amino acid bias (MAAB) bioinformatics pipeline to classify HRGPs into 23 descriptive subclasses. Validation of MAAB was achieved using available genomic resources and then applied to the 1000 Plants transcriptome project (www.onekp.com) data set. Significant improvement in the detection of HRGPs using multiple-k-mer transcriptome assembly methodology was observed. The MAAB pipeline is readily adaptable and can be modified to optimize the recovery of IDPs from other organisms.
Affiliation(s)
- Kim L Johnson
- Australian Research Council Centre of Excellence in Plant Cell Walls, School of BioSciences, University of Melbourne, Parkville, Victoria 3010, Australia (K.L.J., A.M.C., A.L., A.B., M.S.D.); and
- School of Agriculture, Food, and Wine, University of Adelaide, Waite Research Institute, Glen Osmond, South Australia 5064, Australia (C.J.S.)
| | - Andrew M Cassin
- Australian Research Council Centre of Excellence in Plant Cell Walls, School of BioSciences, University of Melbourne, Parkville, Victoria 3010, Australia (K.L.J., A.M.C., A.L., A.B., M.S.D.); and
- School of Agriculture, Food, and Wine, University of Adelaide, Waite Research Institute, Glen Osmond, South Australia 5064, Australia (C.J.S.)
| | - Andrew Lonsdale
- Australian Research Council Centre of Excellence in Plant Cell Walls, School of BioSciences, University of Melbourne, Parkville, Victoria 3010, Australia (K.L.J., A.M.C., A.L., A.B., M.S.D.); and
- School of Agriculture, Food, and Wine, University of Adelaide, Waite Research Institute, Glen Osmond, South Australia 5064, Australia (C.J.S.)
| | - Antony Bacic
- Australian Research Council Centre of Excellence in Plant Cell Walls, School of BioSciences, University of Melbourne, Parkville, Victoria 3010, Australia (K.L.J., A.M.C., A.L., A.B., M.S.D.); and
- School of Agriculture, Food, and Wine, University of Adelaide, Waite Research Institute, Glen Osmond, South Australia 5064, Australia (C.J.S.)
| | - Monika S Doblin
- Australian Research Council Centre of Excellence in Plant Cell Walls, School of BioSciences, University of Melbourne, Parkville, Victoria 3010, Australia (K.L.J., A.M.C., A.L., A.B., M.S.D.); and
- School of Agriculture, Food, and Wine, University of Adelaide, Waite Research Institute, Glen Osmond, South Australia 5064, Australia (C.J.S.)
| | - Carolyn J Schultz
- Australian Research Council Centre of Excellence in Plant Cell Walls, School of BioSciences, University of Melbourne, Parkville, Victoria 3010, Australia (K.L.J., A.M.C., A.L., A.B., M.S.D.); and
- School of Agriculture, Food, and Wine, University of Adelaide, Waite Research Institute, Glen Osmond, South Australia 5064, Australia (C.J.S.)
| |
|
34
|
Abstract
The phenomenon of de novo gene birth from junk DNA is surprising, because random polypeptides are expected to be toxic. There are two conflicting views about how de novo gene birth is nevertheless possible: the continuum hypothesis invokes a gradual gene birth process, while the preadaptation hypothesis predicts that young genes will show extreme levels of gene-like traits. We show that intrinsic structural disorder conforms to the predictions of the preadaptation hypothesis and falsifies the continuum hypothesis, with all genes having higher levels than translated junk DNA, but young genes having the highest level of all. Results are robust to homology detection bias, to the non-independence of multiple members of the same gene family, and to the false positive annotation of protein-coding genes.
|
35
|
Badet T, Peyraud R, Mbengue M, Navaud O, Derbyshire M, Oliver RP, Barbacci A, Raffaele S. Codon optimization underpins generalist parasitism in fungi. eLife 2017; 6:e22472. [PMID: 28157073 PMCID: PMC5315462 DOI: 10.7554/elife.22472] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 10/18/2016] [Accepted: 01/28/2017] [Indexed: 01/04/2023] Open
Abstract
The range of hosts that parasites can infect is a key determinant of the emergence and spread of disease. Yet, the impact of host range variation on the evolution of parasite genomes remains unknown. Here, we show that codon optimization underlies genome adaptation in broad host range parasites. We found that the longer proteins encoded by broad host range fungi likely increase natural selection on codon optimization in these species. Accordingly, codon optimization correlates with host range across the fungal kingdom. At the species level, biased patterns of synonymous substitutions underpin increased codon optimization in a generalist but not a specialist fungal pathogen. Virulence genes were consistently enriched in highly codon-optimized genes of generalist but not specialist species. We conclude that codon optimization is related to the capacity of parasites to colonize multiple hosts. Our results link genome evolution and translational regulation to the long-term persistence of generalist parasitism.
Affiliation(s)
- Thomas Badet
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
| | - Remi Peyraud
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
| | - Malick Mbengue
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
| | - Olivier Navaud
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
| | - Mark Derbyshire
- Centre for Crop and Disease Management, Department of Environment and Agriculture, Curtin University, Perth, Australia
| | - Richard P Oliver
- Centre for Crop and Disease Management, Department of Environment and Agriculture, Curtin University, Perth, Australia
| | - Adelin Barbacci
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
| | - Sylvain Raffaele
- LIPM, Université de Toulouse, INRA, CNRS, Castanet-Tolosan, France
| |
|
36
|
Integrative view of 2-oxoglutarate/Fe(II)-dependent oxygenase diversity and functions in bacteria. Biochim Biophys Acta Gen Subj 2017; 1861:323-334. [DOI: 10.1016/j.bbagen.2016.12.001] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 08/09/2016] [Revised: 11/09/2016] [Accepted: 12/01/2016] [Indexed: 12/11/2022]
|
37
|
Pancsa R, Tompa P. Coding Regions of Intrinsic Disorder Accommodate Parallel Functions. Trends Biochem Sci 2016; 41:898-906. [DOI: 10.1016/j.tibs.2016.08.009] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.4] [Received: 06/19/2016] [Revised: 08/16/2016] [Accepted: 08/19/2016] [Indexed: 02/01/2023]
|
38
|
Zhong Y, Cheng ZMM. A unique RPW8-encoding class of genes that originated in early land plants and evolved through domain fission, fusion, and duplication. Sci Rep 2016; 6:32923. [PMID: 27678195 PMCID: PMC5039405 DOI: 10.1038/srep32923] [Citation(s) in RCA: 39] [Impact Index Per Article: 4.9] [Received: 12/18/2015] [Accepted: 08/16/2016] [Indexed: 01/17/2023] Open
Abstract
Duplication, lateral gene transfer, domain fusion/fission and de novo domain creation play a key role in formation of initial common ancestral protein. Abundant protein diversities are produced by domain rearrangements, including fusions, fissions, duplications, and terminal domain losses. In this report, we explored the origin of the RPW8 domain and examined the domain rearrangements that have driven the evolution of RPW8-encoding genes in land plants. The RPW8 domain first emerged in the early land plant, Physcomitrella patens, and it likely originated de novo from a non-coding sequence or domain divergence after duplication. It was then incorporated into the NBS-LRR protein to create a main sub-class of RPW8-encoding genes, the RPW8-NBS-encoding genes. They evolved by a series of genetic events of domain fissions, fusions, and duplications. Many species-specific duplication events and tandemly duplicated clusters clearly demonstrated that species-specific and tandem duplications played important roles in expansion of RPW8-encoding genes, especially in gymnosperms and species of the Rosaceae. RPW8 domains with greater Ka/Ks values than those of the NBS domains indicated that they evolved faster than the NBS domains in RPW8-NBSs.
Affiliation(s)
- Yan Zhong
- College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, China
- Zong-Ming Max Cheng
- College of Horticulture, Nanjing Agricultural University, Nanjing, 210095, China; Department of Plant Science, University of Tennessee, Knoxville, 37996, USA
|
39
|
Klasberg S, Bitard-Feildel T, Mallet L. Computational Identification of Novel Genes: Current and Future Perspectives. Bioinform Biol Insights 2016; 10:121-31. [PMID: 27493475 PMCID: PMC4970615 DOI: 10.4137/bbi.s39950] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 04/15/2016] [Revised: 05/31/2016] [Accepted: 06/05/2016] [Indexed: 12/31/2022] Open
Abstract
While it has long been thought that all genomic novelties are derived from the existing material, many genes lacking homology to known genes were found in recent genome projects. Some of these novel genes were proposed to have evolved de novo, ie, out of noncoding sequences, whereas some have been shown to follow a duplication and divergence process. Their discovery called for an extension of the historical hypotheses about gene origination. Besides the theoretical breakthrough, increasing evidence accumulated that novel genes play important roles in evolutionary processes, including adaptation and speciation events. Different techniques are available to identify genes and classify them as novel. Their classification as novel is usually based on their similarity to known genes, or lack thereof, detected by comparative genomics or against databases. Computational approaches are further prime methods that can be based on existing models or leveraging biological evidences from experiments. Identification of novel genes remains however a challenging task. With the constant software and technologies updates, no gold standard, and no available benchmark, evaluation and characterization of genomic novelty is a vibrant field. In this review, the classical and state-of-the-art tools for gene prediction are introduced. The current methods for novel gene detection are presented; the methodological strategies and their limits are discussed along with perspective approaches for further studies.
Affiliation(s)
- Steffen Klasberg
- Institute for Evolution and Biodiversity, Westfalian Wilhelms University Muenster, Huefferstrasse 1, Muenster, Germany
- Tristan Bitard-Feildel
- Institute for Evolution and Biodiversity, Westfalian Wilhelms University Muenster, Huefferstrasse 1, Muenster, Germany
- Ludovic Mallet
- Institute for Evolution and Biodiversity, Westfalian Wilhelms University Muenster, Huefferstrasse 1, Muenster, Germany
|
40
|
Wu GL, Kuo TH, Tsay TT, Tsai IJ, Chen PJ. Glycoside Hydrolase (GH) 45 and 5 Candidate Cellulases in Aphelenchoides besseyi Isolated from Bird's-Nest Fern. PLoS One 2016; 11:e0158663. [PMID: 27391812 PMCID: PMC4938546 DOI: 10.1371/journal.pone.0158663] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 12/03/2015] [Accepted: 06/20/2016] [Indexed: 11/18/2022] Open
Abstract
Five Aphelenchoides besseyi isolates collected from bird's-nest ferns or rice possess different parasitic capacities in bird's-nest fern. Two different glycoside hydrolase (GH) 45 genes were identified in the fern isolates, and only one was found in the rice isolates. A Abe GH5-1 gene containing an SCP-like family domain was found only in the fern isolates. Abe GH5-1 gene has five introns suggesting a eukaryotic origin. A maximum likelihood phylogeny revealed that Abe GH5-1 is part of the nematode monophyletic group that can be clearly distinguished from those of other eukaryotic and bacterial GH5 sequences with high bootstrap support values. The fern A. besseyi isolates were the first parasitic plant nematode found to possess both GH5 and GH45 genes. Surveying the genome of the five A. besseyi isolates by Southern blotting using an 834 bp probe targeting the GH5 domain suggests the presence of at least two copies in the fern-origin isolates but none in the rice-origin isolates. The in situ hybridization shows that the Abe GH5-1 gene is expressed in the nematode ovary and testis. Our study provides insights into the diversity of GH in isolates of plant parasitic nematodes of different host origins.
Affiliation(s)
- Guan-Long Wu
- Department of Plant Pathology, National Chung Hsing University, Taichung, Taiwan
- Tzu-Hao Kuo
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Tung-Tsuan Tsay
- Department of Plant Pathology, National Chung Hsing University, Taichung, Taiwan
- Isheng J. Tsai
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
- Peichen J. Chen
- Department of Plant Pathology, National Chung Hsing University, Taichung, Taiwan
|
41
|
Lees JG, Dawson NL, Sillitoe I, Orengo CA. Functional innovation from changes in protein domains and their combinations. Curr Opin Struct Biol 2016; 38:44-52. [DOI: 10.1016/j.sbi.2016.05.016] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Received: 03/22/2016] [Revised: 05/17/2016] [Accepted: 05/24/2016] [Indexed: 10/21/2022]
|
42
|
Annibalini G, Bielli P, De Santi M, Agostini D, Guescini M, Sisti D, Contarelli S, Brandi G, Villarini A, Stocchi V, Sette C, Barbieri E. MIR retroposon exonization promotes evolutionary variability and generates species-specific expression of IGF-1 splice variants. Biochim Biophys Acta Gene Regul Mech 2016; 1859:757-68. [DOI: 10.1016/j.bbagrm.2016.03.014] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 12/03/2015] [Revised: 03/07/2016] [Accepted: 03/23/2016] [Indexed: 12/18/2022]
|
43
|
Nagulapalli M, Maji S, Dwivedi N, Dahiya P, Thakur JK. Evolution of disorder in Mediator complex and its functional relevance. Nucleic Acids Res 2015; 44:1591-612. [PMID: 26590257 PMCID: PMC4770211 DOI: 10.1093/nar/gkv1135] [Citation(s) in RCA: 45] [Impact Index Per Article: 5.0] [Received: 03/16/2015] [Accepted: 10/18/2015] [Indexed: 12/27/2022] Open
Abstract
Mediator, an important component of eukaryotic transcriptional machinery, is a huge multisubunit complex. Though the complex is known to be conserved across all the eukaryotic kingdoms, the evolutionary topology of its subunits has never been studied. In this study, we profiled disorder in the Mediator subunits of 146 eukaryotes belonging to three kingdoms viz., metazoans, plants and fungi, and attempted to find correlation between the evolution of Mediator complex and its disorder. Our analysis suggests that disorder in Mediator complex have played a crucial role in the evolutionary diversification of complexity of eukaryotic organisms. Conserved intrinsic disordered regions (IDRs) were identified in only six subunits in the three kingdoms whereas unique patterns of IDRs were identified in other Mediator subunits. Acquisition of novel molecular recognition features (MoRFs) through evolution of new subunits or through elongation of the existing subunits was evident in metazoans and plants. A new concept of ‘junction-MoRF’ has been introduced. Evolutionary link between CBP and Med15 has been provided which explain the evolution of extended-IDR in CBP from Med15 KIX-IDR junction-MoRF suggesting role of junction-MoRF in evolution and modulation of protein–protein interaction repertoire. This study can be informative and helpful in understanding the conserved and flexible nature of Mediator complex across eukaryotic kingdoms.
Affiliation(s)
- Malini Nagulapalli
- Plant Mediator Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi 110067, India
- Sourobh Maji
- Plant Mediator Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi 110067, India
- Nidhi Dwivedi
- Plant Mediator Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi 110067, India
- Pradeep Dahiya
- Plant Mediator Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi 110067, India
- Jitendra K Thakur
- Plant Mediator Lab, National Institute of Plant Genome Research, Aruna Asaf Ali Marg, New Delhi 110067, India
|
44
|
Scaiewicz A, Levitt M. The language of the protein universe. Curr Opin Genet Dev 2015; 35:50-6. [PMID: 26451980 DOI: 10.1016/j.gde.2015.08.010] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.4] [Received: 06/30/2015] [Revised: 08/20/2015] [Accepted: 08/25/2015] [Indexed: 11/17/2022]
Abstract
Proteins, the main cell machinery which play a major role in nearly every cellular process, have always been a central focus in biology. We live in the post-genomic era, and inferring information from massive data sets is a steadily growing universal challenge. The increasing availability of fully sequenced genomes can be regarded as the 'Rosetta Stone' of the protein universe, allowing the understanding of genomes and their evolution, just as the original Rosetta Stone allowed Champollion to decipher the ancient Egyptian hieroglyphics. In this review, we consider aspects of the protein domain architectures repertoire that are closely related to those of human languages and aim to provide some insights about the language of proteins.
Affiliation(s)
- Andrea Scaiewicz
- Department of Structural Biology, Stanford University, Stanford, CA 94305-5126, United States
- Michael Levitt
- Department of Structural Biology, Stanford University, Stanford, CA 94305-5126, United States
|
45
|
Bhargav SP, Vahokoski J, Kallio JP, Torda AE, Kursula P, Kursula I. Two independently folding units of Plasmodium profilin suggest evolution via gene fusion. Cell Mol Life Sci 2015; 72:4193-203. [PMID: 26012696 PMCID: PMC11113795 DOI: 10.1007/s00018-015-1932-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/15/2015] [Revised: 05/13/2015] [Accepted: 05/18/2015] [Indexed: 10/23/2022]
Abstract
Gene fusion is a common mechanism of protein evolution that has mainly been discussed in the context of multidomain or symmetric proteins. Less is known about fusion of ancestral genes to produce small single-domain proteins. Here, we show with a domain-swapped mutant Plasmodium profilin that this small, globular, apparently single-domain protein consists of two foldons. The separation of binding sites for different protein ligands in the two halves suggests evolution via an ancient gene fusion event, analogous to the formation of multidomain proteins. Finally, the two fragments can be assembled together after expression as two separate gene products. The possibility to engineer both domain-swapped dimers and half-profilins that can be assembled back to a full profilin provides perspectives for engineering of novel protein folds, e.g., with different scaffolding functions.
Affiliation(s)
- Juha Vahokoski
- Faculty of Biochemistry and Molecular Medicine, University of Oulu, P.O. Box 5400, 90014, Oulu, Finland
- Juha Pekka Kallio
- Helmholtz Centre for Infection Research, Notkestrasse 85, 22607, Hamburg, Germany
- German Electron Synchrotron (DESY), Notkestrasse 85, 22607, Hamburg, Germany
- Andrew E Torda
- Centre for Bioinformatics, University of Hamburg, Bundesstrasse 43, 20146, Hamburg, Germany
- Petri Kursula
- Faculty of Biochemistry and Molecular Medicine, University of Oulu, P.O. Box 5400, 90014, Oulu, Finland
- Biocenter Oulu, University of Oulu, P.O. Box 5000, 90014, Oulu, Finland
- Department of Biomedicine, University of Bergen, Jonas Lies vei 91, 5009, Bergen, Norway
- Inari Kursula
- Faculty of Biochemistry and Molecular Medicine, University of Oulu, P.O. Box 5400, 90014, Oulu, Finland
- Helmholtz Centre for Infection Research, Notkestrasse 85, 22607, Hamburg, Germany
- German Electron Synchrotron (DESY), Notkestrasse 85, 22607, Hamburg, Germany
- Department of Biomedicine, University of Bergen, Jonas Lies vei 91, 5009, Bergen, Norway
|
46
|
Stolzer M, Siewert K, Lai H, Xu M, Durand D. Event inference in multidomain families with phylogenetic reconciliation. BMC Bioinformatics 2015; 16 Suppl 14:S8. [PMID: 26451642 PMCID: PMC4610023 DOI: 10.1186/1471-2105-16-s14-s8] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Reconstructing evolution provides valuable insights into the processes of gene evolution and function. However, while there have been great advances in algorithms and software to reconstruct the history of gene families, these tools do not model the domain shuffling events (domain duplication, insertion, transfer, and deletion) that drive the evolution of multidomain protein families. Protein evolution through domain shuffling events allows for rapid exploration of functions by introducing new combinations of existing folds. This powerful mechanism was key to some significant evolutionary innovations, such as multicellularity and the vertebrate immune system. A method for reconstructing this important evolutionary process is urgently needed. RESULTS Here, we introduce a novel, event-based framework for studying multidomain evolution by reconciling a domain tree with a gene tree, with additional information provided by the species tree. In the context of this framework, we present the first reconciliation algorithms to infer domain shuffling events, while addressing the challenges inherent in the inference of evolution across three levels of organization. CONCLUSIONS We apply these methods to the evolution of domains in the Membrane associated Guanylate Kinase family. These case studies reveal a more vivid and detailed evolutionary history than previously provided. Our algorithms have been implemented in software, freely available at http://www.cs.cmu.edu/~durand/Notung.
|
47
|
Zhang ZN, Wu QY, Zhang GZ, Zhu YY, Murphy RW, Liu Z, Zou CG. Systematic analyses reveal uniqueness and origin of the CFEM domain in fungi. Sci Rep 2015; 5:13032. [PMID: 26255557 PMCID: PMC4530338 DOI: 10.1038/srep13032] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.6] [Received: 03/26/2015] [Accepted: 07/16/2015] [Indexed: 11/25/2022] Open
Abstract
CFEM domain commonly occurs in fungal extracellular membrane proteins. To provide insights for understanding putative functions of CFEM, we investigate the evolutionary dynamics of CFEM domains by systematic comparative genomic analyses among diverse animals, plants, and more than 100 fungal species, which are representative across the entire group of fungi. We here show that CFEM domain is unique to fungi. Experiments using tissue culture demonstrate that the CFEM-containing ESTs in some plants originate from endophytic fungi. We also find that CFEM domain does not occur in all fungi. Its single origin dates to the most recent common ancestors of Ascomycota and Basidiomycota, instead of multiple origins. Although the length and architecture of CFEM domains are relatively conserved, the domain-number varies significantly among different fungal species. In general, pathogenic fungi have a larger number of domains compared to other species. Domain-expansion across fungal genomes appears to be driven by domain duplication and gene duplication via recombination. These findings generate a clear evolutionary trajectory of CFEM domains and provide novel insights into the functional exchange of CFEM-containing proteins from cell-surface components to mediators in host-pathogen interactions.
Affiliation(s)
- Zhen-Na Zhang
- [1] Laboratory for Conservation and Utilization of Bio-Resources, Yunnan University, Kunming, China [2] Xiamen Tobacco Industrial CO., LTD, Xiamen, China
- Qin-Yi Wu
- Laboratory for Conservation and Utilization of Bio-Resources, Yunnan University, Kunming, China
- Yue-Yan Zhu
- Laboratory for Conservation and Utilization of Bio-Resources, Yunnan University, Kunming, China
- Robert W Murphy
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Zhen Liu
- State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming, China
- Cheng-Gang Zou
- Laboratory for Conservation and Utilization of Bio-Resources, Yunnan University, Kunming, China
|
48
|
Kersting AR, Mizrachi E, Bornberg-Bauer E, Myburg AA. Protein domain evolution is associated with reproductive diversification and adaptive radiation in the genus Eucalyptus. New Phytol 2015; 206:1328-36. [PMID: 25494981 DOI: 10.1111/nph.13211] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 10/15/2014] [Accepted: 11/04/2014] [Indexed: 05/04/2023]
Abstract
Eucalyptus is a pivotal genus within the rosid order Myrtales with distinct geographic history and adaptations. Comparative analysis of protein domain evolution in the newly sequenced Eucalyptus grandis genome and other rosid lineages sheds light on the adaptive mechanisms integral to the success of this genus of woody perennials. We reconstructed the ancestral domain content to elucidate the gain, loss and expansion of protein domains and domain arrangements in Eucalyptus in the context of rosid phylogeny. We used functional gene ontology (GO) annotation of genes to investigate the possible biological and evolutionary consequences of protein domain expansion. We found that protein modulation within the angiosperms occurred primarily on the level of expansion of certain domains and arrangements. Using RNA-Seq data from E. grandis, we showed that domain expansions have contributed to tissue-specific expression of tandemly duplicated genes. Our results indicate that tandem duplication of genes, a key feature of the Eucalyptus genome, has played an important role in the expansion of domains, particularly in proteins related to the specialization of reproduction and biotic and abiotic interactions affecting root and floral biology, and that tissue-specific expression of proteins with expanded domains has facilitated subfunctionalization in domain families.
Affiliation(s)
- Anna R Kersting
- Evolutionary Bioinformatics Group, Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
- Bioinformatics Group, Institute for Computer Science, Heinrich-Heine-University, Duesseldorf, Germany
- Eshchar Mizrachi
- Department of Genetics, Forestry and Agricultural Biotechnology Institute (FABI), Genomics Research Institute, University of Pretoria, Private Bag X20, Pretoria, 0028, South Africa
- Erich Bornberg-Bauer
- Evolutionary Bioinformatics Group, Institute for Evolution and Biodiversity, University of Muenster, Muenster, Germany
- Alexander A Myburg
- Department of Genetics, Forestry and Agricultural Biotechnology Institute (FABI), Genomics Research Institute, University of Pretoria, Private Bag X20, Pretoria, 0028, South Africa
|
49
|
Thieulin-Pardo G, Avilan L, Kojadinovic M, Gontero B. Fairy "tails": flexibility and function of intrinsically disordered extensions in the photosynthetic world. Front Mol Biosci 2015; 2:23. [PMID: 26042223 PMCID: PMC4436894 DOI: 10.3389/fmolb.2015.00023] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Received: 02/25/2015] [Accepted: 05/04/2015] [Indexed: 12/22/2022] Open
Abstract
Intrinsically Disordered Proteins (IDPs), or protein fragments also called Intrinsically Disordered Regions (IDRs), display high flexibility as the result of their amino acid composition. They can adopt multiple roles. In globular proteins, IDRs are usually found as loops and linkers between secondary structure elements. However, not all disordered fragments are loops: some proteins bear an intrinsically disordered extension at their C- or N-terminus, and this flexibility can affect the protein as a whole. In this review, we focus on the disordered N- and C-terminal extensions of globular proteins from photosynthetic organisms. Using the examples of the A2B2-GAPDH and the α Rubisco activase isoform, we show that intrinsically disordered extensions can help regulate their “host” protein in response to changes in light, thereby participating in photosynthesis regulation. As IDPs are famous for their large number of protein partners, we used the examples of the NAC, bZIP, TCP, and GRAS transcription factor families to illustrate the fact that intrinsically disordered extremities can allow a protein to have an increased number of partners, which directly affects its regulation. Finally, for proteins from the cryptochrome light receptor family, we describe how a new role for the photolyase proteins may emerge by the addition of an intrinsically disordered extension, while still allowing the protein to absorb blue light. This review has highlighted the diverse repercussions of the disordered extension on the regulation and function of their host protein and outlined possible future research avenues.
Affiliation(s)
- Gabriel Thieulin-Pardo
- UMR 7281, Centre National de la Recherche Scientifique, Aix-Marseille Université Marseille, France
- Luisana Avilan
- UMR 7281, Centre National de la Recherche Scientifique, Aix-Marseille Université Marseille, France
- Mila Kojadinovic
- UMR 7281, Centre National de la Recherche Scientifique, Aix-Marseille Université Marseille, France
- Brigitte Gontero
- UMR 7281, Centre National de la Recherche Scientifique, Aix-Marseille Université Marseille, France
|
50
|
Prakash A, Bateman A. Domain atrophy creates rare cases of functional partial protein domains. Genome Biol 2015; 16:88. [PMID: 25924720 PMCID: PMC4432964 DOI: 10.1186/s13059-015-0655-8] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 11/20/2014] [Accepted: 04/15/2015] [Indexed: 01/12/2023] Open
Abstract
BACKGROUND Protein domains display a range of structural diversity, with numerous additions and deletions of secondary structural elements between related domains. We have observed a small number of cases of surprising large-scale deletions of core elements of structural domains. We propose a new concept called domain atrophy, where protein domains lose a significant number of core structural elements. RESULTS Here, we implement a new pipeline to systematically identify new cases of domain atrophy across all known protein sequences. The output of this pipeline was carefully checked by hand, which filtered out partial domain instances that were unlikely to represent true domain atrophy due to misannotations or un-annotated sequence fragments. We identify 75 cases of domain atrophy, of which eight cases are found in a three-dimensional protein structure and 67 cases have been inferred based on mapping to a known homologous structure. Domains with structural variations include ancient folds such as the TIM-barrel and Rossmann folds. Most of these domains are observed to show structural loss that does not affect their functional sites. CONCLUSION Our analysis has significantly increased the known cases of domain atrophy. We discuss specific instances of domain atrophy and see that there has often been a compensatory mechanism that helps to maintain the stability of the partial domain. Our study indicates that although domain atrophy is an extremely rare phenomenon, protein domains under certain circumstances can tolerate extreme mutations giving rise to partial, but functional, domains.
Affiliation(s)
- Ananth Prakash
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
- Alex Bateman
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, UK
|