101
Pérez-Castaño R, Bastida-Martínez E, Fernández Zapata J, Polanco MDC, Galbis-Martínez ML, Iniesta AA, Fontes M, Padmanabhan S, Elías-Arnanz M. Coenzyme B12-dependent and independent photoregulation of carotenogenesis across Myxococcales. Environ Microbiol 2022; 24:1865-1886. [PMID: 35005822 PMCID: PMC9304148 DOI: 10.1111/1462-2920.15895] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 10/14/2021] [Revised: 12/27/2021] [Accepted: 01/01/2022] [Indexed: 11/28/2022]
Abstract
Light-induced carotenogenesis in Myxococcus xanthus is controlled by the B12-based CarH repressor and photoreceptor, and by a separate intricate pathway involving singlet oxygen, the B12-independent CarH paralog CarA and various other proteins, some eukaryotic-like. Whether other myxobacteria conserve these pathways and undergo photoregulated carotenogenesis is unknown. Here, comparative analyses across 27 Myxococcales genomes identified carotenogenic genes, albeit arranged differently, with carH often in their genomic vicinity, in all three Myxococcales suborders. However, CarA and its associated factors were found exclusively in suborder Cystobacterineae, with carA-carH invariably in tandem in a syntenic carotenogenic operon, except for Cystobacter/Melittangium, which lack CarA but retain all other factors. We experimentally show B12-mediated photoregulated carotenogenesis in representative myxobacteria, and a remarkably plastic CarH operator design and DNA binding across Myxococcales. Unlike the two characterized CarH from other phyla, which are tetrameric, Cystobacter CarH (the first myxobacterial homolog amenable to analysis in vitro) is a dimer that combines direct CarH-like B12-based photoregulation with CarA-like DNA-binding and inhibition by an antirepressor. This study provides new molecular insights into B12-dependent photoreceptors. It further establishes the B12-dependent pathway for photoregulated carotenogenesis as broadly prevalent across myxobacteria and its evolution, exclusively in one suborder, into a parallel complex B12-independent circuit.
Affiliation(s)
- Ricardo Pérez-Castaño
  - Departamento de Genética y Microbiología, Área de Genética (Unidad Asociada al IQFR-CSIC), Facultad de Biología, Universidad de Murcia, 30100, Murcia, Spain
- Eva Bastida-Martínez
  - Departamento de Genética y Microbiología, Área de Genética (Unidad Asociada al IQFR-CSIC), Facultad de Biología, Universidad de Murcia, 30100, Murcia, Spain
- Jesús Fernández Zapata
  - Instituto de Química Física "Rocasolano", Consejo Superior de Investigaciones Científicas, 28006, Madrid, Spain
- María Del Carmen Polanco
  - Departamento de Genética y Microbiología, Área de Genética (Unidad Asociada al IQFR-CSIC), Facultad de Biología, Universidad de Murcia, 30100, Murcia, Spain
- María Luisa Galbis-Martínez
  - Departamento de Genética y Microbiología, Área de Genética (Unidad Asociada al IQFR-CSIC), Facultad de Biología, Universidad de Murcia, 30100, Murcia, Spain
- Antonio A Iniesta
  - Departamento de Genética y Microbiología, Área de Genética (Unidad Asociada al IQFR-CSIC), Facultad de Biología, Universidad de Murcia, 30100, Murcia, Spain
- Marta Fontes
  - Departamento de Genética y Microbiología, Área de Genética (Unidad Asociada al IQFR-CSIC), Facultad de Biología, Universidad de Murcia, 30100, Murcia, Spain
- S Padmanabhan
  - Instituto de Química Física "Rocasolano", Consejo Superior de Investigaciones Científicas, 28006, Madrid, Spain
- Montserrat Elías-Arnanz
  - Departamento de Genética y Microbiología, Área de Genética (Unidad Asociada al IQFR-CSIC), Facultad de Biología, Universidad de Murcia, 30100, Murcia, Spain
102
Cridland JM, Majane AC, Zhao L, Begun DJ. Population biology of accessory gland-expressed de novo genes in Drosophila melanogaster. Genetics 2022; 220:iyab207. [PMID: 34791207 PMCID: PMC8733444 DOI: 10.1093/genetics/iyab207] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 09/14/2021] [Accepted: 11/08/2021] [Indexed: 12/20/2022] Open
Abstract
Early work on de novo gene discovery in Drosophila was consistent with the idea that many such genes have male-biased patterns of expression, including a large number expressed in the testis. However, there has been little formal analysis of variation in the abundance and properties of de novo genes expressed in different tissues. Here, we investigate the population biology of recently evolved de novo genes expressed in the Drosophila melanogaster accessory gland, a somatic male tissue that plays an important role in male and female fertility and the post-mating response of females. We use the same collection of inbred lines used previously to identify testis-expressed de novo genes, thus allowing for direct cross-tissue comparisons of these genes in two tissues of male reproduction. Using RNA-seq data, we identify candidate de novo genes located in annotated intergenic and intronic sequence and determine the properties of these genes, including chromosomal location, expression, abundance, and coding capacity. Generally, we find major differences between the tissues in terms of gene abundance and expression, though other properties such as transcript length and chromosomal distribution are more similar. We also explore differences between regulatory mechanisms of de novo genes in the two tissues and how such differences may interact with selection to produce differences in D. melanogaster de novo genes expressed in the two tissues.
Affiliation(s)
- Julie M Cridland
  - Department of Evolution and Ecology, University of California, Davis, Davis, CA 95616, USA
- Alex C Majane
  - Department of Evolution and Ecology, University of California, Davis, Davis, CA 95616, USA
- Li Zhao
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
- David J Begun
  - Department of Evolution and Ecology, University of California, Davis, Davis, CA 95616, USA
103
Abstract
Modern genome-scale methods that identify new genes, such as proteogenomics and ribosome profiling, have revealed, to the surprise of many, that overlap in genes, open reading frames and even coding sequences is widespread and functionally integrated into prokaryotic, eukaryotic and viral genomes. In parallel, the constraints that overlapping regions place on genome sequences and their evolution can be harnessed in bioengineering to build more robust synthetic strains and constructs. With a focus on overlapping protein-coding and RNA-coding genes, this Review examines their discovery, topology and biogenesis in the context of their genome biology. We highlight exciting new uses for sequence overlap to control translation, compress synthetic genetic constructs, and protect against mutation.
104
Bhave D, Tautz D. Effects of the Expression of Random Sequence Clones on Growth and Transcriptome Regulation in Escherichia coli. Genes (Basel) 2021; 13:genes13010053. [PMID: 35052392 PMCID: PMC8775113 DOI: 10.3390/genes13010053] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 11/23/2021] [Revised: 12/21/2021] [Accepted: 12/21/2021] [Indexed: 02/04/2023] Open
Abstract
Comparative genomic analyses have provided evidence that new genetic functions can emerge out of random nucleotide sequences. Here, we apply a direct experimental approach to study the effects of plasmids harboring random sequence inserts under the control of an inducible promoter. Based on data from previously described experiments dealing with the growth of clones within whole libraries, we extracted specific clones that had shown either negative, neutral or positive effects on relative cell growth. We analyzed these individually with respect to growth characteristics and the impact on the transcriptome. We find that candidate clones for negative peptides lead to growth arrest by eliciting a general stress response. Overexpression of positive clones, on the other hand, does not change the exponential growth rates of hosts, and they show a growth advantage over a neutral clone when tested in direct competition experiments. Transcriptomic changes in positive clones are relatively moderate and specific to each clone. We conclude from our experiments that random sequence peptides are indeed a suitable source for the de novo evolution of genetic functions.
105
Barua A, Koludarov I, Mikheyev AS. Co-option of the same ancestral gene family gave rise to mammalian and reptilian toxins. BMC Biol 2021; 19:268. [PMID: 34949191 PMCID: PMC8705180 DOI: 10.1186/s12915-021-01191-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/26/2021] [Accepted: 11/11/2021] [Indexed: 12/03/2022] Open
Abstract
Background: Evolution can occur with surprising predictability when organisms face similar ecological challenges. For most traits, it is difficult to ascertain whether this occurs due to constraints imposed by the number of possible phenotypic solutions or because of parallel responses by shared genetic and regulatory architecture. Exceptionally, oral venoms are a tractable model of trait evolution, being largely composed of proteinaceous toxins that have evolved in many tetrapods, ranging from reptiles to mammals. Given the diversity of venomous lineages, they are believed to have evolved convergently, even though biochemically similar toxins occur in all taxa.
Results: Here, we investigate whether ancestral genes harbouring similar biochemical activity may have primed venom evolution, focusing on the origins of kallikrein-like serine proteases that form the core of most vertebrate oral venoms. Using syntenic relationships between genes flanking known toxins, we traced the origin of kallikreins to a single locus containing one or more nearby paralogous kallikrein-like clusters. Additionally, phylogenetic analysis of vertebrate serine proteases revealed that kallikrein-like toxins in mammals and reptiles are genetically distinct from non-toxin ones.
Conclusions: Given the shared regulatory and genetic machinery, these findings suggest that tetrapod venoms evolved by co-option of proteins that were likely already present in saliva. We term such genes ‘toxipotent’: in the case of salivary kallikreins, they already had potent vasodilatory activity that was weaponized by venomous lineages. Furthermore, the ubiquitous distribution of kallikreins across vertebrates suggests that the evolution of envenomation may be more common than previously recognized, blurring the line between venomous and non-venomous animals.
Affiliation(s)
- Agneesh Barua
  - Ecology and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Ivan Koludarov
  - Animal Venomics Group, Justus Liebig University, Giessen, Germany
- Alexander S Mikheyev
  - Research School of Biology, Australian National University, Canberra, ACT, Australia
106
Li J, Singh U, Bhandary P, Campbell J, Arendsee Z, Seetharam AS, Wurtele ES. Foster thy young: enhanced prediction of orphan genes in assembled genomes. Nucleic Acids Res 2021; 50:e37. [PMID: 34928390 PMCID: PMC9023268 DOI: 10.1093/nar/gkab1238] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 08/27/2021] [Revised: 10/22/2021] [Accepted: 12/02/2021] [Indexed: 02/06/2023] Open
Abstract
Proteins encoded by newly emerged genes ('orphan genes') share no sequence similarity with proteins in any other species. They provide organisms with a reservoir of genetic elements to quickly respond to changing selection pressures. Here, we systematically assess the ability of five gene prediction pipelines to accurately predict genes in genomes according to phylostratal origin. BRAKER and MAKER are existing, popular ab initio tools that infer gene structures by machine learning. Direct Inference is an evidence-based pipeline we developed to predict gene structures from alignments of RNA-Seq data. The BIND pipeline integrates ab initio predictions of BRAKER and Direct Inference; MIND combines Direct Inference and MAKER predictions. We use highly curated Arabidopsis and yeast annotations as gold-standard benchmarks, and cross-validate in rice. Each pipeline under-predicts orphan genes (as few as 11 percent, under one prediction scenario). Increasing RNA-Seq diversity greatly improves prediction efficacy. The combined methods (BIND and MIND) yield the best predictions overall, with BIND identifying 68% of annotated orphan genes and 99% of ancient genes, and giving the highest sensitivity score regardless of dataset in Arabidopsis. We provide a lightweight, flexible, reproducible, and well-documented solution to improve gene prediction.
Affiliation(s)
- Jing Li
  - Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50014, USA
  - Center for Metabolic Biology, Iowa State University, Ames, IA 50014, USA
  - Genetics and Genomics Graduate Program, Iowa State University, Ames, IA 50014, USA
- Urminder Singh
  - Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50014, USA
  - Center for Metabolic Biology, Iowa State University, Ames, IA 50014, USA
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA 50014, USA
- Priyanka Bhandary
  - Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50014, USA
  - Center for Metabolic Biology, Iowa State University, Ames, IA 50014, USA
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA 50014, USA
- Jacqueline Campbell
  - Corn Insects and Crop Genetics Research Unit, US Department of Agriculture, Agricultural Research Service, Ames, IA 50014, USA
- Zebulun Arendsee
  - Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50014, USA
  - Center for Metabolic Biology, Iowa State University, Ames, IA 50014, USA
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA 50014, USA
- Arun S Seetharam
  - Genome Informatics Facility, Iowa State University, Ames, IA 50014, USA
- Eve Syrkin Wurtele
  - Department of Genetics, Development and Cell Biology, Iowa State University, Ames, IA 50014, USA
  - Center for Metabolic Biology, Iowa State University, Ames, IA 50014, USA
  - Genetics and Genomics Graduate Program, Iowa State University, Ames, IA 50014, USA
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA 50014, USA
107
Cherezov RO, Vorontsova JE, Simonova OB. The Phenomenon of Evolutionary “De Novo Generation” of Genes. Russ J Dev Biol 2021. [DOI: 10.1134/s1062360421060035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
108
Klein B, Holmér L, Smith KM, Johnson MM, Swain A, Stolp L, Teufel AI, Kleppe AS. A computational exploration of resilience and evolvability of protein-protein interaction networks. Commun Biol 2021; 4:1352. [PMID: 34857859 PMCID: PMC8639913 DOI: 10.1038/s42003-021-02867-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/09/2020] [Accepted: 11/03/2021] [Indexed: 11/09/2022] Open
Abstract
Protein-protein interaction (PPI) networks represent complex intra-cellular protein interactions, and the presence or absence of such interactions can lead to biological changes in an organism. Recent network-based approaches have shown that a phenotype's PPI network's resilience to environmental perturbations is related to its placement in the tree of life, though we still do not know how or why certain intra-cellular factors can bring about this resilience. Here, we explore the influence of gene expression and network properties on PPI networks' resilience. We use publicly available data of PPIs for E. coli, S. cerevisiae, and H. sapiens, where we compute changes in network resilience as new nodes (proteins) are added to the networks under three node addition mechanisms: random, degree-based, and gene-expression-based attachment. By calculating the resilience of the resulting networks, we estimate the effectiveness of these node addition mechanisms. We demonstrate that adding nodes with gene-expression-based preferential attachment (as opposed to random or degree-based) preserves and can increase the original resilience of the PPI network in all three species, regardless of gene expression distribution or network structure. These findings introduce a general notion of prospective resilience, which highlights the key role of network structures in understanding the evolvability of phenotypic traits.
Affiliation(s)
- Brennan Klein
  - Network Science Institute, Northeastern University, Boston, MA, USA
  - Laboratory for the Modeling of Biological and Socio-Technical Systems, Northeastern University, Boston, MA, USA
- Ludvig Holmér
  - Center for Data Analytics, Stockholm School of Economics, Stockholm, Sweden
- Keith M. Smith
  - Department of Physics and Mathematics, Nottingham Trent University, Nottingham, UK
- Mackenzie M. Johnson
  - Department of Integrative Biology, University of Texas at Austin, Austin, TX, USA
- Anshuman Swain
  - Department of Biology, University of Maryland, College Park, MD, USA
- Laura Stolp
  - Graduate School of Science, University of Amsterdam, Amsterdam, The Netherlands
- Ashley I. Teufel
  - Department of Integrative Biology, University of Texas at Austin, Austin, TX, USA
  - Santa Fe Institute, Santa Fe, NM, USA
  - Texas A&M University, San Antonio, San Antonio, TX, USA
- April S. Kleppe
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
  - Department of Clinical Medicine (MOMA), Aarhus University, Aarhus, Denmark
109
Papadopoulos C, Callebaut I, Gelly JC, Hatin I, Namy O, Renard M, Lespinet O, Lopes A. Intergenic ORFs as elementary structural modules of de novo gene birth and protein evolution. Genome Res 2021; 31:2303-2315. [PMID: 34810219 PMCID: PMC8647833 DOI: 10.1101/gr.275638.121] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 04/13/2021] [Accepted: 09/23/2021] [Indexed: 01/08/2023]
Abstract
The noncoding genome plays an important role in de novo gene birth and in the emergence of genetic novelty. Nevertheless, how noncoding sequences' properties could promote the birth of novel genes and shape the evolution and the structural diversity of proteins remains unclear. Therefore, by combining different bioinformatic approaches, we characterized the fold potential diversity of the amino acid sequences encoded by all intergenic open reading frames (ORFs) of S. cerevisiae with the aim of (1) exploring whether the structural states' diversity of proteomes is already present in noncoding sequences, and (2) estimating the potential of the noncoding genome to produce novel protein bricks that could either give rise to novel genes or be integrated into pre-existing proteins, thus participating in protein structure diversity and evolution. We showed that amino acid sequences encoded by most yeast intergenic ORFs contain the elementary building blocks of protein structures. Moreover, they encompass the large structural state diversity of canonical proteins, with the majority predicted as foldable. Then, we investigated the early stages of de novo gene birth by reconstructing the ancestral sequences of 70 yeast de novo genes and characterized the sequence and structural properties of intergenic ORFs with a strong translation signal. This enabled us to highlight sequence and structural factors determining de novo gene emergence. Finally, we showed a strong correlation between the fold potential of de novo proteins and one of their ancestral amino acid sequences, reflecting the relationship between the noncoding genome and the protein structure universe.
Affiliation(s)
- Chris Papadopoulos
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Isabelle Callebaut
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005 Paris, France
- Jean-Christophe Gelly
  - Université de Paris, Biologie Intégrée du Globule Rouge, UMR_S1134, BIGR, INSERM, F-75015 Paris, France
  - Laboratoire d'Excellence GR-Ex, 75015 Paris, France
  - Institut National de la Transfusion Sanguine, F-75015 Paris, France
- Isabelle Hatin
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Olivier Namy
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Maxime Renard
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Olivier Lespinet
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Anne Lopes
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
110
Castro JF, Tautz D. The Effects of Sequence Length and Composition of Random Sequence Peptides on the Growth of E. coli Cells. Genes (Basel) 2021; 12:1913. [PMID: 34946861 PMCID: PMC8702183 DOI: 10.3390/genes12121913] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 10/17/2021] [Revised: 11/22/2021] [Accepted: 11/26/2021] [Indexed: 12/21/2022] Open
Abstract
We study the potential for the de novo evolution of genes from random nucleotide sequences using libraries of E. coli expressing random sequence peptides. We assess the effects of such peptides on cell growth by monitoring frequency changes in individual clones in a complex library through four serial passages. Using a new analysis pipeline that allows the tracing of peptides of all lengths, we find that over half of the peptides have consistent effects on cell growth. Across nine different experiments, around 16% of clones increase in frequency and 36% decrease, with some variation between individual experiments. Shorter peptides (8-20 residues) are more likely to increase in frequency, whereas longer ones are more likely to decrease. GC content, amino acid composition, intrinsic disorder, and aggregation propensity show slightly different patterns between peptide groups. Sequences that increase in frequency tend to be more disordered with lower aggregation propensity. This coincides with the observation that young genes with more disordered structures are better tolerated in genomes. Our data indicate that random sequences can be a source of evolutionary innovation, since a large fraction of them are well tolerated by the cells or can provide a growth advantage.
Affiliation(s)
- Diethard Tautz
  - Max Planck Institute for Evolutionary Biology, August-Thienemann Strasse 2, 24306 Plön, Germany
111
Lee J, Wacholder A, Carvunis AR. Evolutionary Characterization of the Short Protein SPAAR. Genes (Basel) 2021; 12:genes12121864. [PMID: 34946813 PMCID: PMC8702040 DOI: 10.3390/genes12121864] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/23/2021] [Revised: 11/22/2021] [Accepted: 11/22/2021] [Indexed: 02/07/2023] Open
Abstract
Microproteins (<100 amino acids) are receiving increasing recognition as important participants in numerous biological processes, but their evolutionary dynamics are poorly understood. SPAAR is a recently discovered microprotein that regulates muscle regeneration and angiogenesis through interactions with conserved signaling pathways. Interestingly, SPAAR does not belong to any known protein family and has known homologs exclusively among placental mammals. This lack of distant homology could be caused by challenges in homology detection of short sequences, or it could indicate a recent de novo emergence from a noncoding sequence. By integrating syntenic alignments and homology searches, we identify SPAAR orthologs in marsupials and monotremes, establishing that SPAAR has existed at least since the emergence of mammals. SPAAR shows substantial primary sequence divergence but retains a conserved protein structure. In primates, we infer two independent evolutionary events leading to the de novo origination of 5' elongated isoforms of SPAAR from a noncoding sequence and find evidence of adaptive evolution in this extended region. Thus, SPAAR may be of ancient origin, but it appears to be experiencing continual evolutionary innovation in mammals.
Affiliation(s)
- Jiwon Lee
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
  - Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
  - Joint CMU-Pitt Ph.D. Program in Computational Biology, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Aaron Wacholder
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
  - Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
- Anne-Ruxandra Carvunis
  - Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
  - Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
  - Correspondence: Tel.: +1-412-648-3335
112
Claverie JM, Santini S. Validation of predicted anonymous proteins simply using Fisher's exact test. Bioinformatics Advances 2021; 1:vbab034. [PMID: 36700095 PMCID: PMC9710694 DOI: 10.1093/bioadv/vbab034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2021] [Revised: 11/03/2021] [Accepted: 11/10/2021] [Indexed: 01/28/2023]
Abstract
Motivation: Genome sequencing has become the primary (and often the sole) experimental method to characterize newly discovered organisms, in particular from the microbial world (bacteria, archaea, viruses). This generates an ever-increasing number of predicted proteins the existence of which is unwarranted, in particular among those without homologs in model organisms. As a last resort, the computation of the selection pressure from pairwise alignments of the corresponding 'Open Reading Frames' (ORFs) can be used to validate their existence. However, this approach is error-prone, as it is not usually associated with a significance test.
Results: We introduce the use of the straightforward Fisher's exact test as a postprocessing of the results provided by the popular CODEML sequence comparison software. The respective rates of nucleotide changes at the nonsynonymous versus synonymous positions (as determined by CODEML) are turned into entries in a 2 × 2 contingency table, the probability of which is computed under the null hypothesis that they should not behave differently if the ORFs do not encode actual proteins. Using the genome sequences of two recently isolated giant viruses, we show that strong negative selection pressures do not always provide a solid argument in favor of the existence of proteins.
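The contingency-table step described in this abstract can be illustrated with a minimal sketch. It assumes a table whose rows are nonsynonymous and synonymous positions and whose columns are changed and unchanged sites. The abstract does not spell out the exact table construction, so this layout, the function name `fisher_exact_two_sided`, and the counts in the usage lines are hypothetical and are not taken from the paper or from CODEML output.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with all margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts (assumption, for illustration only):
# row 1 = nonsynonymous positions, row 2 = synonymous positions,
# column 1 = changed sites, column 2 = unchanged sites.
p_value = fisher_exact_two_sided(12, 288, 30, 70)
print(f"two-sided p = {p_value:.3g}")
```

A small p-value here would indicate that the two classes of positions accumulate changes at different rates, which is the situation expected when the ORF encodes a real protein under selection.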
Affiliation(s)
- Jean-Michel Claverie
  - Aix-Marseille University, CNRS, IGS (UMR7256), IMM (FR3479), Luminy, Marseille F-13288, France
  - To whom correspondence should be addressed.
- Sébastien Santini
  - Aix-Marseille University, CNRS, IGS (UMR7256), IMM (FR3479), Luminy, Marseille F-13288, France
113
Zhuang X, Cheng CHC. Propagation of a De Novo Gene under Natural Selection: Antifreeze Glycoprotein Genes and Their Evolutionary History in Codfishes. Genes (Basel) 2021; 12:genes12111777. [PMID: 34828383 PMCID: PMC8622921 DOI: 10.3390/genes12111777] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/23/2021] [Revised: 11/08/2021] [Accepted: 11/08/2021] [Indexed: 11/16/2022] Open
Abstract
The de novo birth of functional genes from non-coding DNA as an important contributor to new gene formation is increasingly supported by evidence from diverse eukaryotic lineages. However, many uncertainties remain, including how the incipient de novo genes would continue to evolve and the molecular mechanisms underlying their evolutionary trajectory. Here we address these questions by investigating the evolutionary history of the de novo antifreeze glycoprotein (AFGP) gene and gene family in gadid (codfish) lineages. We examined the AFGP phenotype on a phylogenetic framework encompassing a broad sampling of gadids from freezing and non-freezing habitats. In three select species representing different AFGP-bearing clades, we analyzed all AFGP gene family members and the broader-scale AFGP genomic regions in detail. Codon usage analyses suggest that motif duplication produced the intragenic AFGP tripeptide coding repeats, and that rapid sequence divergence post-duplication stabilized the recombination-prone long repetitive coding region. Genomic loci analyses support that AFGP originated once from a single ancestral genomic origin, and shed light on how the de novo gene proliferated into a gene family. Results also show that the processes of gene duplication and gene loss are distinctive in separate clades, and that both genotype and phenotype are commensurate with differential local selective pressures.
Affiliation(s)
- Xuan Zhuang
  - Department of Biological Sciences, University of Arkansas, Fayetteville, AR 72701, USA
  - Corresponding author
- C.-H. Christina Cheng
  - Department of Evolution, Ecology, and Behavior, University of Illinois, Urbana-Champaign, IL 61801, USA
  - Corresponding author
114
|
Ramos-González PL, Pons T, Chabi-Jesus C, Arena GD, Freitas-Astua J. Poorly Conserved P15 Proteins of Cileviruses Retain Elements of Common Ancestry and Putative Functionality: A Theoretical Assessment on the Evolution of Cilevirus Genomes. Front Plant Sci 2021; 12:771983. [PMID: 34804105 PMCID: PMC8602818 DOI: 10.3389/fpls.2021.771983] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/07/2021] [Accepted: 10/18/2021] [Indexed: 06/13/2023]
Abstract
The genus Cilevirus groups enveloped single-stranded (+) RNA virus members of the family Kitaviridae, order Martellivirales. Proteins P15, scarcely conserved polypeptides encoded by cileviruses, have no apparent homologs in public databases. Accordingly, the open reading frames (ORFs) p15, located at the 5'-end of the viral RNA2 molecules, are considered orphan genes (ORFans). In this study, we have delved into ORFs p15 and the relatively poorly understood biochemical properties of the proteins P15 to posit their importance for viruses across the genus and theorize on their origin. We detected that the ORFs p15 are under purifying selection and that, in some viral strains, the use of synonymous codons is biased, which might be a sign of adaptation to their plant hosts. Despite the high amino acid sequence divergence, proteins P15 show the conserved motif [FY]-L-x(3)-[FL]-H-x-x-[LIV]-S-C-x-C-x(2)-C-x-G-x-C, which occurs exclusively in members of this protein family. Proteins P15 also show a common predicted 3D structure that resembles the helical scaffold of the protein ORF49 encoded by radinoviruses and the phosphoprotein C-terminal domain of mononegavirids. Based on the 3D structural similarities of P15, we suggest elements of common ancestry, conserved functionality, and relevant amino acid residues. We conclude by postulating a plausible evolutionary trajectory of ORFans p15 and the 5'-end of the RNA2 of cileviruses considering both protein fold superpositions and comparative genomic analyses with the closest kitaviruses, negeviruses, nege/kita-like viruses, and unrelated viruses that share the ecological niches of cileviruses.
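The conserved motif quoted in this abstract is written in PROSITE-style pattern notation. As a minimal sketch of how such a pattern could be scanned for in a protein sequence, the Python snippet below translates it into a regular expression. The function names and the toy sequence in the usage note are illustrative assumptions, not taken from the article, and the translator handles only the syntax that appears in this one pattern.

```python
import re

# PROSITE-style pattern as quoted in the abstract above.
PROSITE = "[FY]-L-x(3)-[FL]-H-x-x-[LIV]-S-C-x-C-x(2)-C-x-G-x-C"


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex string.

    Supports only literal residues, [..] residue classes, 'x' wildcards,
    and fixed (n) repeat counts; anything else raises ValueError.
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"(\[[A-Z]+\]|[A-Zx])(?:\((\d+)\))?", element)
        if m is None:
            raise ValueError(f"unsupported element: {element!r}")
        token, count = m.groups()
        if token == "x":
            token = "."  # 'x' means any residue
        parts.append(token + (f"{{{count}}}" if count else ""))
    return "".join(parts)


P15_MOTIF = re.compile(prosite_to_regex(PROSITE))


def find_motif(seq: str):
    """Return (start, matched_text) for each non-overlapping motif hit."""
    return [(m.start(), m.group()) for m in P15_MOTIF.finditer(seq.upper())]
```

For example, `find_motif("MK" + "FLAAALHAALSCACAACAGAC" + "KK")` reports one hit starting at index 2. That input is a made-up string built to satisfy the pattern, not a real P15 sequence.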
Affiliation(s)
- Pedro L. Ramos-González
  - Laboratório de Biologia Molecular Aplicada, Instituto Biológico de São Paulo, São Paulo, Brazil
- Tirso Pons
  - National Centre for Biotechnology (CNB-CSIC), Madrid, Spain
- Camila Chabi-Jesus
  - Laboratório de Biologia Molecular Aplicada, Instituto Biológico de São Paulo, São Paulo, Brazil
  - Escola Superior de Agricultura Luiz de Queiroz (ESALQ), Universidade de São Paulo, Piracicaba, Brazil
- Gabriella Dias Arena
  - Laboratório de Biologia Molecular Aplicada, Instituto Biológico de São Paulo, São Paulo, Brazil
- Juliana Freitas-Astua
  - Laboratório de Biologia Molecular Aplicada, Instituto Biológico de São Paulo, São Paulo, Brazil
  - Embrapa Mandioca e Fruticultura, Cruz das Almas, Brazil

115
Stein WD, Hoshen MB. During evolution from the earliest tetrapoda, newly-recruited genes are increasingly paralogues of existing genes and distribute non-randomly among the chromosomes. BMC Genomics 2021; 22:794. [PMID: 34736418 PMCID: PMC8570013 DOI: 10.1186/s12864-021-08066-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/06/2020] [Accepted: 09/28/2021] [Indexed: 11/10/2022]
Abstract
BACKGROUND: The present availability of full genome sequences of a broad range of animal species across the whole range of evolutionary history enables one to ask questions as to the distribution of genes across the chromosomes. Do newly recruited genes, as new clades emerge, distribute at random or at non-random locations?
RESULTS: We extracted values for the ages of the human genes and for their current chromosome locations, from published sources. A quantitative analysis showed that the distribution of newly-added genes among and within the chromosomes appears to be increasingly non-random if one observes animals along the evolutionary series from the precursors of the tetrapoda through to the great apes, whereas the oldest genes are randomly distributed.
CONCLUSIONS: Randomization will result from chromosome evolution, but less and less time is available for this process as evolution proceeds. Much of the bunching of recently-added genes arises from new gene formation as paralogues in gene families, near the location of genes that were recruited in the preceding phylostratum. As examples we cite the KRTAP, ZNF, OR and some minor gene families. We show that bunching can also result from the evolution of the chromosomes themselves when, as for the KRTAP genes, blocks of genes that had previously been on disparate chromosomes become linked together.
Affiliation(s)
- Wilfred D Stein
  - Silberman Institute of Life Sciences, Hebrew University, 91904, Jerusalem, Israel
- Moshe B Hoshen
  - Bioinformatics Department, Jerusalem College of Technology, Tal Campus, Beit HaDfus 7, 95483, Jerusalem, Israel

116
Programmed DNA elimination: silencing genes and repetitive sequences in somatic cells. Biochem Soc Trans 2021; 49:1891-1903. [PMID: 34665225 DOI: 10.1042/bst20190951] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 08/26/2021] [Revised: 09/25/2021] [Accepted: 09/28/2021] [Indexed: 12/30/2022]
Abstract
In a multicellular organism, the genomes of all cells are in general the same. Programmed DNA elimination is a notable exception to this genome constancy rule. DNA elimination removes genes and repetitive elements in the germline genome to form a reduced somatic genome in various organisms. The process of DNA elimination within an organism is highly accurate and reproducible; it typically occurs during early embryogenesis, coincident with germline-soma differentiation. DNA elimination provides a mechanism to silence selected genes and repeats in somatic cells. Recent studies in nematodes suggest that DNA elimination removes all chromosome ends, resolves sex chromosome fusions, and may also promote the birth of novel genes. Programmed DNA elimination processes are diverse among species, suggesting DNA elimination likely has evolved multiple times in different taxa. The growing list of organisms that undergo DNA elimination indicates that DNA elimination may be more widespread than previously appreciated. These various organisms will serve as complementary and comparative models to study the function, mechanism, and evolution of programmed DNA elimination in metazoans.

117
Fesenko I, Shabalina SA, Mamaeva A, Knyazev A, Glushkevich A, Lyapina I, Ziganshin R, Kovalchuk S, Kharlampieva D, Lazarev V, Taliansky M, Koonin EV. A vast pool of lineage-specific microproteins encoded by long non-coding RNAs in plants. Nucleic Acids Res 2021; 49:10328-10346. [PMID: 34570232 DOI: 10.1093/nar/gkab816] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 06/03/2021] [Revised: 08/17/2021] [Accepted: 09/17/2021] [Indexed: 12/17/2022]
Abstract
Pervasive transcription of eukaryotic genomes results in expression of long non-coding RNAs (lncRNAs), most of which are poorly conserved in evolution and appear to be non-functional. However, some lncRNAs have been shown to perform specific functions, in particular, transcription regulation. Thousands of small open reading frames (smORFs, <100 codons) located on lncRNAs might potentially be translated into peptides or microproteins. We report a comprehensive analysis of the conservation and evolutionary trajectories of lncRNA-smORFs from the moss Physcomitrium patens across transcriptomes of 479 plant species. Although thousands of smORFs are subject to substantial purifying selection, the majority of the smORFs appear to be evolutionarily young and could represent a major pool for functional innovation. Using nanopore RNA sequencing, we show that, on average, the transcriptional level of conserved smORFs is higher than that of non-conserved smORFs. Proteomic analysis confirmed translation of 82 novel species-specific smORFs. Numerous conserved smORFs containing low complexity regions (LCRs) or transmembrane domains were identified, and the biological functions of a selected LCR-smORF were demonstrated experimentally. Thus, microproteins encoded by smORFs are a major, functionally diverse component of the plant proteome.
Affiliation(s)
- Igor Fesenko
  - Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russian Federation
- Svetlana A Shabalina
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Anna Mamaeva
  - Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russian Federation
- Andrey Knyazev
  - Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russian Federation
- Anna Glushkevich
  - Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russian Federation
- Irina Lyapina
  - Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russian Federation
- Rustam Ziganshin
  - Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russian Federation
- Sergey Kovalchuk
  - Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russian Federation
- Daria Kharlampieva
  - Department of Cell Biology, Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, Moscow 119435, Russian Federation
- Vassili Lazarev
  - Department of Cell Biology, Federal Research and Clinical Center of Physical-Chemical Medicine of Federal Medical Biological Agency, Moscow 119435, Russian Federation
  - Moscow Institute of Physics and Technology (National Research University), Dolgoprudny, Moscow region, 141701, Russian Federation
- Michael Taliansky
  - Shemyakin and Ovchinnikov Institute of Bioorganic Chemistry of the Russian Academy of Sciences, Moscow 117997, Russian Federation
  - The James Hutton Institute, Invergowrie, Dundee DD2 5DA, UK
- Eugene V Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA

118
Brzáčová Z, Peťková M, Veljačiková K, Zajičková T, Tomáška Ľ. Reconstruction of human genome evolution in yeast: an educational primer for use with "systematic humanization of the yeast cytoskeleton discerns functionally replaceable from divergent human genes". Genetics 2021; 219:6380399. [PMID: 34849890 DOI: 10.1093/genetics/iyab118] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/16/2021] [Accepted: 07/14/2021] [Indexed: 01/01/2023]
Abstract
The evolution of eukaryotic organisms starting with the last eukaryotic common ancestor was accompanied by lineage-specific expansion of gene families. A paper by Garge et al. provides an excellent opportunity to have students explore how expansion of gene families via gene duplication results in protein specialization, in this case in the context of eukaryotic cytoskeletal organization. The authors tested hypotheses about conserved protein function by systematic "humanization" of the yeast cytoskeletal components while employing a wide variety of methodological approaches. We outline several exercises to promote students' ability to explore genomic databases, perform bioinformatic analyses, design experiments for functional analysis of human genes in yeast, and critically interpret results to address both specific and general questions.
Affiliation(s)
- Zuzana Brzáčová
  - Department of Genetics, Faculty of Natural Sciences, Comenius University in Bratislava, Bratislava 842 15, Slovakia
- Mária Peťková
  - Department of Genetics, Faculty of Natural Sciences, Comenius University in Bratislava, Bratislava 842 15, Slovakia
- Katarína Veljačiková
  - Department of Genetics, Faculty of Natural Sciences, Comenius University in Bratislava, Bratislava 842 15, Slovakia
- Terézia Zajičková
  - Department of Genetics, Faculty of Natural Sciences, Comenius University in Bratislava, Bratislava 842 15, Slovakia
- Ľubomír Tomáška
  - Department of Genetics, Faculty of Natural Sciences, Comenius University in Bratislava, Bratislava 842 15, Slovakia

119
Jin G, Ma PF, Wu X, Gu L, Long M, Zhang C, Li DZ. New Genes Interacted with Recent Whole Genome Duplicates in the Fast Stem Growth of Bamboos. Mol Biol Evol 2021; 38:5752-5768. [PMID: 34581782 PMCID: PMC8662795 DOI: 10.1093/molbev/msab288] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Indexed: 12/12/2022]
Abstract
As drivers of evolutionary innovations, new genes allow organisms to explore new niches. However, clear examples of this process remain scarce. Bamboos, the unique grass lineage diversifying into the forest, have evolved a key innovation: fast growth of the woody stem, reaching up to 1 m/day. Here, we identify 1,622 bamboo-specific orphan genes that appeared in the last 46 million years, 19 of which evolved from noncoding ancestral sequences, with the entire de novo origination process reconstructed. The new genes evolved gradually in exon−intron structure, protein length, expression specificity, and evolutionary constraint. These new genes, whether or not of de novo origin, are dominantly expressed in the rapidly developing shoots and make the transcriptomes of shoots the youngest among various bamboo tissues, unlike in other plants, where reproductive tissue is the youngest. Additionally, the particularity of bamboo shoots has also been shaped by recent whole-genome duplicates (WGDs), which evolved expression patterns divergent from ancestral states. New genes and WGDs have been evolutionarily recruited into coexpression networks to underlie the fast-growing trait of the bamboo shoot. Our study highlights the importance of interactions between new genes and genome duplicates in generating morphological innovation.
Affiliation(s)
- Guihua Jin
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- Peng-Fei Ma
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- Xiaopei Wu
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- Lianfeng Gu
  - Basic Forestry and Proteomics Research Center, College of Forestry, Fujian Agriculture and Forestry University, Fuzhou, Fujian, 350002, China
- Manyuan Long
  - Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, 60637, USA
- Chengjun Zhang
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- De-Zhu Li
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China

120
Prabh N, Tautz D. Frequent lineage-specific substitution rate changes support an episodic model for protein evolution. G3 (Bethesda) 2021; 11:6372692. [PMID: 34542594 PMCID: PMC8664490 DOI: 10.1093/g3journal/jkab333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2021] [Accepted: 09/13/2021] [Indexed: 12/04/2022]
Abstract
Since the inception of the molecular clock model for sequence evolution, the investigation of protein divergence has revolved around the question of a more or less constant change of amino acid sequences, with specific overall rates for each family. Although anomalies in clock-like divergence are well known, the assumption of a constant decay rate for a given protein family is usually taken as the null model for protein evolution. However, systematic tests of this null model at a genome-wide scale have lagged behind, despite the databases’ enormous growth. We focus here on divergence rate comparisons between very closely related lineages since this allows clear orthology assignments by synteny and reliable alignments, which are crucial for determining substitution rate changes. We generated a high-confidence dataset of syntenic orthologs from four ape species, including humans. We find that despite the appearance of an overall clock-like substitution pattern, several hundred protein families show lineage-specific acceleration and deceleration in divergence rates, or combinations of both in different lineages. Hence, our analysis uncovers a rather dynamic history of substitution rate changes, even between these closely related lineages, implying that one should expect that a large fraction of proteins will have had a history of episodic rate changes in deeper phylogenies. Furthermore, each of the lineages has a separate set of particularly fast diverging proteins. The genes with the highest percentage of branch-specific substitutions are ADCYAP1 in the human lineage (9.7%), CALU in chimpanzees (7.1%), SLC39A14 in the internal branch leading to humans and chimpanzees (4.1%), RNF128 in gorillas (9%), and S100Z in gibbons (15.2%). The mutational pattern in ADCYAP1 suggests a biased mutation process, possibly through asymmetric gene conversion effects. 
We conclude that a null model of constant change can be problematic for predicting the evolutionary trajectories of individual proteins.
Affiliation(s)
- Neel Prabh
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, August-Thienemann-Str. 2, 24306 Plön, Germany
- Diethard Tautz
  - Department of Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, August-Thienemann-Str. 2, 24306 Plön, Germany

121
Matsuo T, Nakatani K, Setoguchi T, Matsuo K, Tamada T, Suenaga Y. Secondary Structure of Human De Novo Evolved Gene Product NCYM Analyzed by Vacuum-Ultraviolet Circular Dichroism. Front Oncol 2021; 11:688852. [PMID: 34497756 PMCID: PMC8420857 DOI: 10.3389/fonc.2021.688852] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/31/2021] [Accepted: 07/31/2021] [Indexed: 11/29/2022]
Abstract
NCYM, a cis-antisense gene of MYCN, encodes a Homininae-specific protein that promotes the aggressiveness of human tumors. Newly evolved genes from non-genic regions are known as de novo genes, and NCYM was the first de novo gene whose oncogenic functions were validated in vivo. Targeting NCYM using drugs is a potential strategy for cancer therapy; however, the NCYM structure must be determined before drug design. In this study, we employed vacuum-ultraviolet circular dichroism to evaluate the secondary structure of NCYM. The SUMO-tagged NCYM and the isolated SUMO tag, in both hydrogenated and perdeuterated forms, were synthesized and purified in a cell-free in vitro system, and vacuum-ultraviolet circular dichroism spectra were measured. Significant differences between the tagged NCYM and the isolated tag were evident in the wavelength range of 190–240 nm. The circular dichroism spectral data combined with a neural network system enabled prediction of the secondary structure of NCYM at the amino acid level. The 129-residue tag consists of α-helices (approximately 14%) and β-strands (approximately 29%), which corresponded to the values calculated from the atomic structure of the tag. The 238-residue tagged NCYM contained approximately 17% α-helices and 27% β-strands. The location of the secondary structure predicted using the neural network revealed that these secondary structures were enriched in the Homininae-specific region of NCYM. Deuteration of NCYM altered the secondary structure at D90 from an α-helix to a structure other than α-helix or β-strand, although this change was within the experimental error range. All four nonsynonymous single-nucleotide polymorphisms (SNPs) in human populations were in this region, and the amino acid alteration in SNP N52S enhanced Myc-nick production. The D90N mutation in NCYM promoted NCYM-mediated MYCN stabilization.
Our results reveal the secondary structure of NCYM and demonstrate that the Homininae-specific domain of NCYM is responsible for MYCN stabilization.
Affiliation(s)
- Tatsuhito Matsuo
  - Institute for Quantum Life Science, National Institutes for Quantum and Radiological Science and Technology, Ibaraki, Japan
- Kazuma Nakatani
  - Department of Molecular Carcinogenesis, Chiba Cancer Center Research Institute, Chiba, Japan
  - Graduate School of Medical and Pharmaceutical Sciences, Chiba University, Chiba, Japan
  - Innovative Medicine CHIBA Doctoral World-leading Innovative & Smart Education (WISE) Program, Chiba University, Chiba, Japan
- Taiki Setoguchi
  - Department of Molecular Carcinogenesis, Chiba Cancer Center Research Institute, Chiba, Japan
  - Department of Neurosurgery, Chiba Cancer Center, Chiba, Japan
- Koichi Matsuo
  - Hiroshima Synchrotron Radiation Center, Hiroshima University, Hiroshima, Japan
- Taro Tamada
  - Institute for Quantum Life Science, National Institutes for Quantum and Radiological Science and Technology, Ibaraki, Japan
- Yusuke Suenaga
  - Department of Molecular Carcinogenesis, Chiba Cancer Center Research Institute, Chiba, Japan

122
Rivard EL, Ludwig AG, Patel PH, Grandchamp A, Arnold SE, Berger A, Scott EM, Kelly BJ, Mascha GC, Bornberg-Bauer E, Findlay GD. A putative de novo evolved gene required for spermatid chromatin condensation in Drosophila melanogaster. PLoS Genet 2021; 17:e1009787. [PMID: 34478447 PMCID: PMC8445463 DOI: 10.1371/journal.pgen.1009787] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 06/29/2021] [Revised: 09/16/2021] [Accepted: 08/19/2021] [Indexed: 02/07/2023]
Abstract
Comparative genomics has enabled the identification of genes that potentially evolved de novo from non-coding sequences. Many such genes are expressed in male reproductive tissues, but their functions remain poorly understood. To address this, we conducted a functional genetic screen of over 40 putative de novo genes with testis-enriched expression in Drosophila melanogaster and identified one gene, atlas, required for male fertility. Detailed genetic and cytological analyses showed that atlas is required for proper chromatin condensation during the final stages of spermatogenesis. Atlas protein is expressed in spermatid nuclei and facilitates the transition from histone- to protamine-based chromatin packaging. Complementary evolutionary analyses revealed the complex evolutionary history of atlas. The protein-coding portion of the gene likely arose at the base of the Drosophila genus on the X chromosome but was unlikely to be essential, as it was then lost in several independent lineages. Within the last ~15 million years, however, the gene moved to an autosome, where it fused with a conserved non-coding RNA and evolved a non-redundant role in male fertility. Altogether, this study provides insight into the integration of novel genes into biological processes, the links between genomic innovation and functional evolution, and the genetic control of a fundamental developmental process, gametogenesis.
Affiliation(s)
- Emily L. Rivard
  - College of the Holy Cross, Worcester, Massachusetts, United States of America
- Andrew G. Ludwig
  - College of the Holy Cross, Worcester, Massachusetts, United States of America
- Prajal H. Patel
  - College of the Holy Cross, Worcester, Massachusetts, United States of America
- Sarah E. Arnold
  - College of the Holy Cross, Worcester, Massachusetts, United States of America
- Emilie M. Scott
  - College of the Holy Cross, Worcester, Massachusetts, United States of America
- Brendan J. Kelly
  - College of the Holy Cross, Worcester, Massachusetts, United States of America
- Grace C. Mascha
  - College of the Holy Cross, Worcester, Massachusetts, United States of America
- Erich Bornberg-Bauer
  - University of Münster, Münster, Germany
  - Max Planck Institute for Developmental Biology, Tübingen, Germany
- Geoffrey D. Findlay
  - College of the Holy Cross, Worcester, Massachusetts, United States of America

123
Li J, Singh U, Arendsee Z, Wurtele ES. Landscape of the Dark Transcriptome Revealed Through Re-mining Massive RNA-Seq Data. Front Genet 2021; 12:722981. [PMID: 34484307 PMCID: PMC8415361 DOI: 10.3389/fgene.2021.722981] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 06/09/2021] [Accepted: 07/26/2021] [Indexed: 12/13/2022]
Abstract
The "dark transcriptome" can be considered the multitude of sequences that are transcribed but not annotated as genes. We evaluated expression of 6,692 annotated genes and 29,354 unannotated open reading frames (ORFs) in the Saccharomyces cerevisiae genome across diverse environmental, genetic and developmental conditions (3,457 RNA-Seq samples). Over 30% of the highly transcribed ORFs have translation evidence. Phylostratigraphic analysis infers most of these transcribed ORFs would encode species-specific proteins ("orphan-ORFs"); hundreds have mean expression comparable to annotated genes. These data reveal unannotated ORFs most likely to be protein-coding genes. We partitioned a co-expression matrix by Markov Chain Clustering; the resultant clusters contain 2,468 orphan-ORFs. We provide the aggregated RNA-Seq yeast data with extensive metadata as a project in MetaOmGraph (MOG), a tool designed for interactive analysis and visualization. This approach enables reuse of public RNA-Seq data for exploratory discovery, providing a rich context for experimentalists to make novel, experimentally testable hypotheses about candidate genes.
Affiliation(s)
- Jing Li
  - Genetics and Genomics Graduate Program, Iowa State University, Ames, IA, United States
  - Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, United States
  - Center for Metabolic Biology, Iowa State University, Ames, IA, United States
- Urminder Singh
  - Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, United States
  - Center for Metabolic Biology, Iowa State University, Ames, IA, United States
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, United States
- Zebulun Arendsee
  - Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, United States
  - Center for Metabolic Biology, Iowa State University, Ames, IA, United States
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, United States
- Eve Syrkin Wurtele
  - Genetics and Genomics Graduate Program, Iowa State University, Ames, IA, United States
  - Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, United States
  - Center for Metabolic Biology, Iowa State University, Ames, IA, United States
  - Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, United States

124
Parisi G, Palopoli N, Tosatto SC, Fornasari MS, Tompa P. "Protein" no longer means what it used to. Curr Res Struct Biol 2021; 3:146-152. [PMID: 34308370 PMCID: PMC8283027 DOI: 10.1016/j.crstbi.2021.06.002] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/12/2021] [Revised: 06/18/2021] [Accepted: 06/22/2021] [Indexed: 01/02/2023]
Abstract
Every biologist knows that the word protein describes a group of macromolecules essential to sustain life on Earth. As biologists, we are invariably trained under a protein paradigm established in the early twentieth century. However, in recent years, the term protein has revealed itself as a euphemism for the overwhelming heterogeneity of these compounds. Most of our current studies target carefully selected subsets of proteins, but we tend to think and write about these as representative of the whole population. Here we discuss how seeking universal definitions and general rules in any arbitrarily segmented study can lead to misleading conclusions. Of course, it is not our purpose to discourage the use of the word protein. Instead, we suggest embracing the extended universe of proteins to reach a deeper understanding of their full potential, realizing that the term encompasses a group of molecules that is very heterogeneous in size, shape, chemistry and function; i.e., the term protein no longer means what it used to.
Affiliation(s)
- Gustavo Parisi
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
- Nicolas Palopoli
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
- María Silvina Fornasari
  - Departamento de Ciencia y Tecnología, Universidad Nacional de Quilmes, CONICET, Bernal, Buenos Aires, Argentina
- Peter Tompa
  - VIB-VUB Center for Structural Biology (CSB), Brussels, Belgium
  - Structural Biology Brussels (SBB), Vrije Universiteit Brussel (VUB), Brussels, Belgium
  - Institute of Enzymology, Research Centre for Natural Sciences, Budapest, Hungary

125
Herrera-Úbeda C, Garcia-Fernàndez J. New Genes Born-In or Invading Vertebrate Genomes. Front Cell Dev Biol 2021; 9:713918. [PMID: 34295903 PMCID: PMC8290160 DOI: 10.3389/fcell.2021.713918] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2021] [Accepted: 06/15/2021] [Indexed: 12/02/2022]
Abstract
The origin of genes is a fundamental question in biology, indeed a question older than the discovery of genes itself. For more than a century, it was unusual to consider origins other than duplication and divergence from a previous gene. In recent years, however, the intersection of genetics, embryonic development, and bioinformatics has brought to light that de novo generation from non-genic DNA, horizontal gene transfer and, notably, virus and transposon invasions have shaped current genomes by integrating those newcomers into old gene networks, helping to shape morphological and physiological innovations. Here we summarize some of the recent research in the field, mostly in the vertebrate lineage and with a focus on protein-coding novelties, showing that the placenta, the adaptive immune system, and the highly developed neocortex, among other innovations, are linked to de novo gene creation or domestication of viruses and transposons. We provocatively suggest that the high tolerance of bats to virus infections may also be related to previous virus and transposon invasions in the bat lineage.
Affiliation(s)
| | - Jordi Garcia-Fernàndez
- Department of Genetics, Microbiology and Statistics, Faculty of Biology, and Institute of Biomedicine (IBUB), University of Barcelona, Barcelona, Spain
|
126
|
Starko S, Bringloe TT, Soto Gomez M, Darby H, Graham SW, Martone PT. Genomic Rearrangements and Sequence Evolution across Brown Algal Organelles. Genome Biol Evol 2021; 13:evab124. [PMID: 34061182 PMCID: PMC8290108 DOI: 10.1093/gbe/evab124] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Accepted: 05/27/2021] [Indexed: 02/06/2023] Open
Abstract
Organellar genomes serve as useful models for genome evolution and contain some of the most widely used phylogenetic markers, but they are poorly characterized in many lineages. Here, we report 20 novel mitochondrial genomes and 16 novel plastid genomes from the brown algae. We focused our efforts on the orders Chordales and Laminariales but also provide the first plastid genomes (plastomes) from Desmarestiales and Sphacelariales, the first mitochondrial genome (mitome) from Ralfsiales and a nearly complete mitome from Sphacelariales. We then compared gene content, sequence evolution rates, shifts in genome structural arrangements, and intron distributions across lineages. We confirm that gene content is largely conserved in both organellar genomes across the brown algal tree of life, with few cases of gene gain or loss. We further show that substitution rates are generally lower in plastid than mitochondrial genes, but plastomes are more variable in gene arrangement, as mitomes tend to be colinear even among distantly related lineages (with exceptions). Patterns of intron distribution across organellar genomes are complex. In particular, the mitomes of several laminarialean species possess group II introns that have T7-like ORFs, found previously only in mitochondrial genomes of Pylaiella spp. (Ectocarpales). The distribution of these mitochondrial introns is inconsistent with vertical transmission and likely reflects invasion by horizontal gene transfer between lineages. In the most extreme case, the mitome of Hedophyllum nigripes is ∼40% larger than the mitomes of close relatives because of these introns. Our results provide substantial insight into organellar evolution across the brown algae.
Affiliation(s)
- Samuel Starko
- Department of Biology, University of Victoria, Victoria, Canada
- Department of Botany & Biodiversity Research Centre, University of British Columbia, Vancouver, Canada
| | - Trevor T Bringloe
- Department of BioSciences, University of Melbourne, Melbourne, Australia
| | - Marybel Soto Gomez
- Department of Botany & Biodiversity Research Centre, University of British Columbia, Vancouver, Canada
| | - Hayley Darby
- Department of Botany & Biodiversity Research Centre, University of British Columbia, Vancouver, Canada
| | - Sean W Graham
- Department of Botany & Biodiversity Research Centre, University of British Columbia, Vancouver, Canada
| | - Patrick T Martone
- Department of Botany & Biodiversity Research Centre, University of British Columbia, Vancouver, Canada
|
127
|
Hata T, Satoh S, Takada N, Matsuo M, Obokata J. Kozak Sequence Acts as a Negative Regulator for De Novo Transcription Initiation of Newborn Coding Sequences in the Plant Genome. Mol Biol Evol 2021; 38:2791-2803. [PMID: 33705557 PMCID: PMC8233501 DOI: 10.1093/molbev/msab069] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/23/2022] Open
Abstract
The manner in which newborn coding sequences and their transcriptional competency emerge during the process of gene evolution remains unclear. Here, we experimentally simulated eukaryotic gene origination processes by mimicking horizontal gene transfer events in the plant genome. We mapped the precise position of the transcription start sites (TSSs) of hundreds of newly introduced promoterless firefly luciferase (LUC) coding sequences in the genome of Arabidopsis thaliana cultured cells. The systematic characterization of the LUC-TSSs revealed that 80% of them occurred under the influence of endogenous promoters, while the remainder underwent de novo activation in the intergenic regions, starting from pyrimidine-purine dinucleotides. These de novo TSSs obeyed unexpected rules; they predominantly occurred ∼100 bp upstream of the LUC inserts and did not overlap with Kozak-containing putative open reading frames (ORFs). These features were the output of the immediate responses to the sequence insertions, rather than a bias in the screening of the LUC gene function. Regarding the wild-type genic TSSs, they appeared to have evolved to lack any ORFs in their vicinities. Therefore, the repulsion by the de novo TSSs of Kozak-containing ORFs described above might be the first selection gate for the occurrence and evolution of TSSs in the plant genome. Based on these results, we characterized the de novo type of TSS identified in the plant genome and discuss its significance in genome evolution.
Affiliation(s)
- Takayuki Hata
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Sakyo-ku, Kyoto, Kyoto, Japan
- Faculty of Agriculture, Setsunan University, Hirakata, Osaka, Japan
| | - Soichirou Satoh
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Sakyo-ku, Kyoto, Kyoto, Japan
| | - Naoto Takada
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Sakyo-ku, Kyoto, Kyoto, Japan
| | - Mitsuhiro Matsuo
- Faculty of Agriculture, Setsunan University, Hirakata, Osaka, Japan
| | - Junichi Obokata
- Faculty of Agriculture, Setsunan University, Hirakata, Osaka, Japan
|
128
|
Edgecombe J, Urban L, Todd EV, Gemmell NJ. Might Gene Duplication and Neofunctionalization Contribute to the Sexual Lability Observed in Fish? Sex Dev 2021; 15:122-133. [PMID: 34167118 DOI: 10.1159/000515425] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/01/2020] [Accepted: 02/24/2021] [Indexed: 11/19/2022] Open
Abstract
Sex determination and differentiation vary widely across vertebrates, but are most dramatically diverse in fishes. Among fishes, sex reversal and sex change are observed in 41 teleost families spanning 7 orders. These sex-changing fish perhaps highlight better than any other system that sex determination is not the narrow and fixed construct we once thought, but a plastic trait that is better viewed as a reaction norm. However, while this stunning transformation is increasingly understood, a fundamental question arises: why have some fish species retained this inherent plasticity in sexual fate, while others have not? Here, we explore our current understanding of sex change in fish, some of the factors that permit and constrain sex reversal, and posit that gene duplication and neofunctionalization contribute to the sexual lability observed in fish.
Affiliation(s)
- Jonika Edgecombe
- Department of Anatomy, University of Otago, Dunedin, New Zealand
| | - Lara Urban
- Department of Anatomy, University of Otago, Dunedin, New Zealand
| | - Erica V Todd
- School of Life and Environmental Sciences, Deakin University, Queenscliff, Victoria, Australia
| | - Neil J Gemmell
- Department of Anatomy, University of Otago, Dunedin, New Zealand
|
129
|
Hata T, Takada N, Hayakawa C, Kazama M, Uchikoba T, Tachikawa M, Matsuo M, Satoh S, Obokata J. De novo activated transcription of inserted foreign coding sequences is inheritable in the plant genome. PLoS One 2021; 16:e0252674. [PMID: 34111139 PMCID: PMC8191969 DOI: 10.1371/journal.pone.0252674] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/23/2020] [Accepted: 05/19/2021] [Indexed: 01/16/2023] Open
Abstract
The manner in which inserted foreign coding sequences become transcriptionally activated and fixed in the plant genome is poorly understood. To examine such processes of gene evolution, we performed an artificial evolutionary experiment in Arabidopsis thaliana. As a model of gene-birth events, we introduced a promoterless coding sequence of the firefly luciferase (LUC) gene and established 386 T2-generation transgenic lines. Among them, we determined the individual LUC insertion loci in 76 lines and found that one-third of them were transcribed de novo even in the intergenic or inherently unexpressed regions. In the transcribed lines, transcription-related chromatin marks were detected across the newly activated transcribed regions. These results agreed with our previous findings in A. thaliana cultured cells under a similar experimental scheme. A comparison of the results of the T2-plant and cultured cell experiments revealed that the de novo-activated transcription concomitant with local chromatin remodelling was inheritable. During one-generation inheritance, it seems likely that the transcription activities of the LUC inserts trapped by the endogenous genes/transcripts became stronger, while those of de novo transcription in the intergenic/untranscribed regions became weaker. These findings may offer a clue for the elucidation of the mechanism by which inserted foreign coding sequences become transcriptionally activated and fixed in the plant genome.
Affiliation(s)
- Takayuki Hata
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
- Faculty of Agriculture, Setsunan University, Hirakata-shi, Osaka, Japan
| | - Naoto Takada
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Chihiro Hayakawa
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Mei Kazama
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Tomohiro Uchikoba
- Faculty of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Makoto Tachikawa
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Mitsuhiro Matsuo
- Faculty of Agriculture, Setsunan University, Hirakata-shi, Osaka, Japan
| | - Soichirou Satoh
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
- Faculty of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Junichi Obokata
- Faculty of Agriculture, Setsunan University, Hirakata-shi, Osaka, Japan
|
130
|
Bhalla N. Meiosis: Is Spermatogenesis Stress an Opportunity for Evolutionary Innovation? Curr Biol 2021; 30:R1471-R1473. [PMID: 33352126 DOI: 10.1016/j.cub.2020.10.042] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/24/2022]
Abstract
During a brief increase in temperature, cells undergoing spermatogenesis, but not oogenesis, activate transposons. This sexual dimorphism suggests that temperature stress during spermatogenesis provides a unique opportunity for transposons to mobilize and modify genomes, driving evolutionary change without substantially affecting reproduction.
Affiliation(s)
- Needhi Bhalla
- Department of Molecular, Cell and Developmental Biology, University of California, Santa Cruz, Santa Cruz, CA, USA.
|
131
|
Kosinski LJ, Masel J. Readthrough Errors Purge Deleterious Cryptic Sequences, Facilitating the Birth of Coding Sequences. Mol Biol Evol 2021; 37:1761-1774. [PMID: 32101291 DOI: 10.1093/molbev/msaa046] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 02/06/2023] Open
Abstract
De novo protein-coding innovations sometimes emerge from ancestrally noncoding DNA, despite the expectation that translating random sequences is overwhelmingly likely to be deleterious. The "preadapting selection" hypothesis claims that emergence is facilitated by prior, low-level translation of noncoding sequences via molecular errors. It predicts that selection on polypeptides translated only in error is strong enough to matter and is strongest when erroneous expression is high. To test this hypothesis, we examined noncoding sequences located downstream of stop codons (i.e., those potentially translated by readthrough errors) in Saccharomyces cerevisiae genes. We identified a class of "fragile" proteins under strong selection to reduce readthrough, which are unlikely substrates for co-option. Among the remainder, sequences showing evidence of readthrough translation, as assessed by ribosome profiling, encoded C-terminal extensions with higher intrinsic structural disorder, supporting the preadapting selection hypothesis. The cryptic sequences beyond the stop codon, rather than spillover effects from the regular C-termini, are primarily responsible for the higher disorder. Results are robust to controlling for the fact that stronger selection also reduces the length of C-terminal extensions. These findings indicate that selection acts on 3' UTRs in Saccharomyces cerevisiae to purge potentially deleterious variants of cryptic polypeptides, acting more strongly in genes that experience more readthrough errors.
Affiliation(s)
- Luke J Kosinski
- Molecular and Cellular Biology, University of Arizona, Tucson, AZ
| | - Joanna Masel
- Ecology and Evolutionary Biology, University of Arizona, Tucson, AZ
|
132
|
Wang J. Genomics of the Parasitic Nematode Ascaris and Its Relatives. Genes (Basel) 2021; 12:493. [PMID: 33800545 PMCID: PMC8065839 DOI: 10.3390/genes12040493] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 02/24/2021] [Revised: 03/22/2021] [Accepted: 03/26/2021] [Indexed: 12/18/2022] Open
Abstract
Nematodes of the genus Ascaris are important parasites of humans and swine, and the phylogenetically related genera (Parascaris, Toxocara, and Baylisascaris) infect mammals of veterinary interest. Over the last decade, considerable genomic resources have been established for Ascaris, including complete germline and somatic genomes, comprehensive mRNA and small RNA transcriptomes, as well as genome-wide histone and chromatin data. These datasets provide a major resource for studies on the basic biology of these parasites and the host-parasite relationship. Ascaris and its relatives undergo programmed DNA elimination, a highly regulated process where chromosomes are fragmented and portions of the genome are lost in embryonic cells destined to adopt a somatic fate, whereas the genome remains intact in germ cells. Unlike many model organisms, Ascaris transcription drives early development beginning prior to pronuclear fusion. Studies on Ascaris demonstrated a complex small RNA network even in the absence of a piRNA pathway. Comparative genomics of these ascarids has provided perspectives on nematode sex chromosome evolution, programmed DNA elimination, and host-parasite coevolution. The genomic resources enable comparison of proteins across diverse species, revealing many new potential drug targets that could be used to control these parasitic nematodes.
Affiliation(s)
- Jianbin Wang
- Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
- UT-Oak Ridge National Laboratory Graduate School of Genome Science and Technology, University of Tennessee, Knoxville, TN 37996, USA
|
133
|
Majic P, Payne JL. Enhancers Facilitate the Birth of De Novo Genes and Gene Integration into Regulatory Networks. Mol Biol Evol 2021; 37:1165-1178. [PMID: 31845961 PMCID: PMC7086177 DOI: 10.1093/molbev/msz300] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 12/12/2022] Open
Abstract
Regulatory networks control the spatiotemporal gene expression patterns that give rise to and define the individual cell types of multicellular organisms. In eumetazoa, distal regulatory elements called enhancers play a key role in determining the structure of such networks, particularly the wiring diagram of “who regulates whom.” Mutations that affect enhancer activity can therefore rewire regulatory networks, potentially causing adaptive changes in gene expression. Here, we use whole-tissue and single-cell transcriptomic and chromatin accessibility data from mouse to show that enhancers play an additional role in the evolution of regulatory networks: They facilitate network growth by creating transcriptionally active regions of open chromatin that are conducive to de novo gene evolution. Specifically, our comparative transcriptomic analysis with three other mammalian species shows that young, mouse-specific intergenic open reading frames are preferentially located near enhancers, whereas older open reading frames are not. Mouse-specific intergenic open reading frames that are proximal to enhancers are more highly and stably transcribed than those that are not proximal to enhancers or promoters, and they are transcribed in a limited diversity of cellular contexts. Furthermore, we report several instances of mouse-specific intergenic open reading frames proximal to promoters showing evidence of being repurposed enhancers. We also show that open reading frames gradually acquire interactions with enhancers over macroevolutionary timescales, helping integrate genes—those that have arisen de novo or by other means—into existing regulatory networks. Taken together, our results highlight a dual role of enhancers in expanding and rewiring gene regulatory networks.
Affiliation(s)
- Paco Majic
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| | - Joshua L Payne
- Institute of Integrative Biology, ETH Zurich, Zurich, Switzerland
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
|
134
|
Lange A, Patel PH, Heames B, Damry AM, Saenger T, Jackson CJ, Findlay GD, Bornberg-Bauer E. Structural and functional characterization of a putative de novo gene in Drosophila. Nat Commun 2021; 12:1667. [PMID: 33712569 PMCID: PMC7954818 DOI: 10.1038/s41467-021-21667-6] [Citation(s) in RCA: 26] [Impact Index Per Article: 8.7] [Received: 01/09/2020] [Accepted: 02/03/2021] [Indexed: 11/26/2022] Open
Abstract
Comparative genomic studies have repeatedly shown that new protein-coding genes can emerge de novo from noncoding DNA. Still unknown is how and when the structures of encoded de novo proteins emerge and evolve. Combining biochemical, genetic and evolutionary analyses, we elucidate the function and structure of goddard, a gene which appears to have evolved de novo at least 50 million years ago within the Drosophila genus. Previous studies found that goddard is required for male fertility. Here, we show that Goddard protein localizes to elongating sperm axonemes and that in its absence, elongated spermatids fail to undergo individualization. Combining modelling, NMR and circular dichroism (CD) data, we show that Goddard protein contains a large central α-helix, but is otherwise partially disordered. We find similar results for Goddard's orthologs from divergent fly species and their reconstructed ancestral sequences. Accordingly, Goddard's structure appears to have been maintained with only minor changes over millions of years.
Affiliation(s)
- Andreas Lange
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
| | - Prajal H Patel
- Department of Biology, College of the Holy Cross, Worcester, MA, USA
| | - Brennen Heames
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
| | - Adam M Damry
- Research School of Chemistry, ANU College of Science, Canberra, Australia
| | - Thorsten Saenger
- Department of Pediatric Kidney, Liver and Metabolic Diseases, Hannover Medical School, Hannover, Germany
| | - Colin J Jackson
- Research School of Chemistry, ANU College of Science, Canberra, Australia
| | | | - Erich Bornberg-Bauer
- Institute for Evolution and Biodiversity, University of Münster, Münster, Germany.
|
135
|
Leurs N, Martinand-Mari C, Ventéo S, Haitina T, Debiais-Thibaud M. Evolution of Matrix Gla and Bone Gla Protein Genes in Jawed Vertebrates. Front Genet 2021; 12:620659. [PMID: 33790944 PMCID: PMC8006282 DOI: 10.3389/fgene.2021.620659] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 10/23/2020] [Accepted: 02/08/2021] [Indexed: 01/05/2023] Open
Abstract
Matrix Gla protein (Mgp) and bone Gla protein (Bgp) are vitamin-K dependent proteins that bind calcium in their γ-carboxylated versions in mammals. They are recognized as positive (Bgp) or negative (Mgp and Bgp) regulators of biomineralization in a number of tissues, including skeletal tissues of bony vertebrates. The Mgp/Bgp gene family is poorly known in cartilaginous fishes, which precludes the understanding of the evolution of the biomineralization toolkit at the emergence of jawed vertebrates. Here we took advantage of recently released genomic and transcriptomic data in cartilaginous fishes and described the genomic loci and gene expression patterns of the Mgp/Bgp gene family. We identified three genes, Mgp1, Mgp2, and Bgp, in cartilaginous fishes instead of the single previously reported Mgp gene. We describe their genomic loci, resulting in a dynamic evolutionary scenario for this gene family including several events of local (tandem) duplications, but also of translocation events, along jawed vertebrate evolution. We describe the expression patterns of Mgp1, Mgp2, and Bgp in embryonic stages covering organogenesis in the small-spotted catshark Scyliorhinus canicula and present a comparative analysis with Mgp/Bgp family members previously described in bony vertebrates, highlighting ancestral features such as early embryonic, soft tissues, and neuronal expressions, but also derived features of cartilaginous fishes such as expression in fin supporting fibers. Our results support an ancestral function of Mgp in skeletal mineralization and a later derived function of Bgp in skeletal development that may be related to the divergence of bony vertebrates.
Affiliation(s)
- Nicolas Leurs
- ISEM, CNRS, IRD, EPHE, Univ. Montpellier, Montpellier, France
| | | | - Stéphanie Ventéo
- Institute for Neurosciences of Montpellier, Saint Eloi Hospital, Inserm UMR 1051, Univ. Montpellier, Montpellier, France
| | - Tatjana Haitina
- Department of Organismal Biology, Uppsala University, Uppsala, Sweden
|
136
|
Silva AT, Gao B, Fisher KM, Mishler BD, Ekwealor JTB, Stark LR, Li X, Zhang D, Bowker MA, Brinda JC, Coe KK, Oliver MJ. To dry perchance to live: Insights from the genome of the desiccation-tolerant biocrust moss Syntrichia caninervis. Plant J 2021; 105:1339-1356. [PMID: 33277766 DOI: 10.1111/tpj.15116] [Citation(s) in RCA: 43] [Impact Index Per Article: 14.3] [Received: 07/08/2020] [Accepted: 11/30/2020] [Indexed: 05/24/2023]
Abstract
With global climate change, water scarcity threatens whole agro/ecosystems. The desert moss Syntrichia caninervis, an extremophile, offers novel insights into surviving desiccation and heat. The sequenced S. caninervis genome consists of 13 chromosomes containing 16 545 protein-coding genes and 2666 unplaced scaffolds. Syntenic relationships within the S. caninervis and Physcomitrella patens genomes indicate the S. caninervis genome has undergone a single whole genome duplication event (compared to two for P. patens) and evidence suggests chromosomal or segmental losses in the evolutionary history of S. caninervis. The genome contains a large sex chromosome composed primarily of repetitive sequences with a large number of Copia and Gypsy elements. Orthogroup analyses revealed an expansion of ELIP genes encoding proteins important in photoprotection. The transcriptomic response to desiccation identified four structural clusters of novel genes. The genomic resources established for this extremophile offer new perspectives for understanding the evolution of desiccation tolerance in plants.
Affiliation(s)
- Anderson T Silva
- Division of Plant Sciences and Interdisciplinary Plant Group, University of Missouri, Columbia, Missouri, 65211, USA
| | - Bei Gao
- State Key Laboratory of Desert and Oasis Ecology, Xinjiang Institute of Ecology and Geography, Chinese Academy of Science, Urumqi, 830011, China
| | - Kirsten M Fisher
- Department of Biological Sciences, California State University, Los Angeles, California, 90032, USA
| | - Brent D Mishler
- Department of Integrative Biology, University and Jepson Herbaria, University of California, Berkeley, California, 94720-2465, USA
| | - Jenna T B Ekwealor
- Department of Integrative Biology, University and Jepson Herbaria, University of California, Berkeley, California, 94720-2465, USA
| | - Lloyd R Stark
- School of Life Sciences, University of Nevada, Las Vegas, Nevada, 89154-4004, USA
| | - Xiaoshuang Li
- State Key Laboratory of Desert and Oasis Ecology, Xinjiang Institute of Ecology and Geography, Chinese Academy of Science, Urumqi, 830011, China
| | - Daoyuan Zhang
- State Key Laboratory of Desert and Oasis Ecology, Xinjiang Institute of Ecology and Geography, Chinese Academy of Science, Urumqi, 830011, China
| | - Matthew A Bowker
- School of Forestry, Northern Arizona University, Flagstaff, Arizona, 86011, USA
| | - John C Brinda
- Missouri Botanical Garden, St. Louis, Missouri, 63110-0299, USA
| | - Kirsten K Coe
- Department of Biology, Middlebury College, Middlebury, Vermont, 40506-0225, USA
| | - Melvin J Oliver
- Division of Plant Sciences and Interdisciplinary Plant Group, University of Missouri, Columbia, Missouri, 65211, USA
- USDA-ARS-MWA, Plant Genetics Research Unit, Columbia, Missouri, 65211, USA
|
137
|
Cosby RL, Judd J, Zhang R, Zhong A, Garry N, Pritham EJ, Feschotte C. Recurrent evolution of vertebrate transcription factors by transposase capture. Science 2021; 371:eabc6405. [PMID: 33602827 PMCID: PMC8186458 DOI: 10.1126/science.abc6405] [Citation(s) in RCA: 72] [Impact Index Per Article: 24.0] [Received: 05/08/2020] [Accepted: 12/18/2020] [Indexed: 12/13/2022]
Abstract
Genes with novel cellular functions may evolve through exon shuffling, which can assemble novel protein architectures. Here, we show that DNA transposons provide a recurrent supply of materials to assemble protein-coding genes through exon shuffling. We find that transposase domains have been captured, primarily via alternative splicing, to form fusion proteins at least 94 times independently over the course of ~350 million years of tetrapod evolution. We find an excess of transposase DNA binding domains fused to host regulatory domains, especially the Krüppel-associated box (KRAB) domain, and identify four independently evolved KRAB-transposase fusion proteins repressing gene expression in a sequence-specific fashion. The bat-specific KRABINER fusion protein binds its cognate transposons genome-wide and controls a network of genes and cis-regulatory elements. These results illustrate how a transcription factor and its binding sites can emerge.
Affiliation(s)
- Rachel L Cosby
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14850, USA
| | - Julius Judd
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14850, USA
| | - Ruiling Zhang
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14850, USA
| | - Alan Zhong
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14850, USA
| | - Nathaniel Garry
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14850, USA
| | - Ellen J Pritham
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112, USA
| | - Cédric Feschotte
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14850, USA.
|
138
|
Affiliation(s)
- Aaron Wacholder
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
| | - Anne-Ruxandra Carvunis
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
- Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA, USA
|
139
|
Structure and function of naturally evolved de novo proteins. Curr Opin Struct Biol 2021; 68:175-183. [PMID: 33567396 DOI: 10.1016/j.sbi.2020.11.010] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 10/01/2020] [Revised: 11/16/2020] [Accepted: 11/27/2020] [Indexed: 01/05/2023]
Abstract
Comparative evolutionary genomics has revealed that novel protein coding genes can emerge randomly from non-coding DNA. While most of the myriad of transcripts which continuously emerge vanish rapidly, some attain regulatory regions, become translated and survive. More surprisingly, sequence properties of de novo proteins are almost indistinguishable from randomly obtained sequences, yet de novo proteins may gain functions and integrate into eukaryotic cellular networks quite easily. We here discuss current knowledge on de novo proteins, their structures, functions and evolution. Since the existence of de novo proteins seems at odds with decade-long attempts to construct proteins with novel structures and functions from scratch, we suggest that a better understanding of de novo protein evolution may fuel new strategies for protein design.
|
140
|
James JE, Willis SM, Nelson PG, Weibel C, Kosinski LJ, Masel J. Universal and taxon-specific trends in protein sequences as a function of age. eLife 2021; 10:e57347. [PMID: 33416492 PMCID: PMC7819706 DOI: 10.7554/elife.57347] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 03/28/2020] [Accepted: 01/05/2021] [Indexed: 01/12/2023] Open
Abstract
Extant protein-coding sequences span a huge range of ages, from those that emerged only recently to those present in the last universal common ancestor. Because evolution has had less time to act on young sequences, there might be 'phylostratigraphy' trends in any properties that evolve slowly with age. A long-term reduction in hydrophobicity and hydrophobic clustering was found in previous, taxonomically restricted studies. Here we perform integrated phylostratigraphy across 435 fully sequenced species, using sensitive HMM methods to detect protein domain homology. We find that the reduction in hydrophobic clustering is universal across lineages. However, only young animal domains have a tendency to have higher structural disorder. Among ancient domains, trends in amino acid composition reflect the order of recruitment into the genetic code, suggesting that the composition of the contemporary descendants of ancient sequences reflects amino acid availability during the earliest stages of life, when these sequences first emerged.
Affiliation(s)
- Jennifer E James
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, United States
| | - Sara M Willis
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, United States
| | - Paul G Nelson
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, United States
| | - Catherine Weibel
- Department of Physics, University of Arizona, Tucson, United States
- Department of Mathematics, University of Arizona, Tucson, United States
| | - Luke J Kosinski
- Department of Molecular and Cellular Biology, University of Arizona, Tucson, United States
| | - Joanna Masel
- Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, United States
|
141
|
Rödelsperger C, Ebbing A, Sharma DR, Okumura M, Sommer RJ, Korswagen HC. Spatial Transcriptomics of Nematodes Identifies Sperm Cells as a Source of Genomic Novelty and Rapid Evolution. Mol Biol Evol 2021; 38:229-243. [PMID: 32785688 PMCID: PMC8480184 DOI: 10.1093/molbev/msaa207] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Indexed: 12/13/2022] Open
Abstract
Divergence of gene function and expression during development can give rise to phenotypic differences at the level of cells, tissues, organs, and ultimately whole organisms. To gain insights into the evolution of gene expression and novel genes at spatial resolution, we compared the spatially resolved transcriptomes of two distantly related nematodes, Caenorhabditis elegans and Pristionchus pacificus, that diverged 60-90 Ma. The spatial transcriptomes of adult worms show little evidence for strong conservation at the level of single genes. Instead, regional expression is largely driven by recent duplication and emergence of novel genes. Estimation of gene ages across anatomical structures revealed an enrichment of novel genes in sperm-related regions. This provides first evidence in nematodes for the "out of testis" hypothesis that has been previously postulated based on studies in Drosophila and mammals. "Out of testis" genes represent a mix of products of pervasive transcription as well as fast evolving members of ancient gene families. Strikingly, numerous novel genes have known functions during meiosis in Caenorhabditis elegans indicating that even universal processes such as meiosis may be targets of rapid evolution. Our study highlights the importance of novel genes in generating phenotypic diversity and explicitly characterizes gene origination in sperm-related regions. Furthermore, it proposes new functions for previously uncharacterized genes and establishes the spatial transcriptome of Pristionchus pacificus as a catalog for future studies on the evolution of gene expression and function.
Affiliation(s)
- Christian Rödelsperger
- Department for Integrative Evolutionary Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany
| | - Annabel Ebbing
- Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences and University Medical Center Utrecht, Utrecht, The Netherlands
| | - Devansh Raj Sharma
- Department for Integrative Evolutionary Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany
| | - Misako Okumura
- Program of Biomedical Science, Graduate School of Integrated Sciences for Life, Hiroshima University, Higashi-Hiroshima, Hiroshima, Japan
| | - Ralf J Sommer
- Department for Integrative Evolutionary Biology, Max Planck Institute for Developmental Biology, Tübingen, Germany
| | - Hendrik C Korswagen
- Hubrecht Institute, Royal Netherlands Academy of Arts and Sciences and University Medical Center Utrecht, Utrecht, The Netherlands
- Developmental Biology, Department of Biology, Institute of Biodynamics and Biocomplexity, Utrecht University, Utrecht, The Netherlands
|
142
|
Puntambekar S, Newhouse R, San-Miguel J, Chauhan R, Vernaz G, Willis T, Wayland MT, Umrania Y, Miska EA, Prabakaran S. Evolutionary divergence of novel open reading frames in cichlids speciation. Sci Rep 2020; 10:21570. [PMID: 33299045 PMCID: PMC7726158 DOI: 10.1038/s41598-020-78555-0] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 04/02/2020] [Accepted: 11/26/2020] [Indexed: 01/02/2023] Open
Abstract
Novel open reading frames (nORFs) with coding potential may arise from noncoding DNA. Not much is known about their emergence, functional role, fixation in a population or contribution to adaptive radiation. Cichlid fishes exhibit extensive phenotypic diversification and speciation. Encounters with new environments alone are not sufficient to explain this striking diversity of cichlid radiation because other taxa coexistent with the Cichlidae demonstrate lower species richness. Wagner et al. analyzed cichlid diversification in 46 African lakes and reported that both extrinsic environmental factors and intrinsic lineage-specific traits related to sexual selection have strongly influenced the cichlid radiation, which indicates the existence of unknown molecular mechanisms responsible for rapid phenotypic diversification, such as emergence of novel open reading frames (nORFs). In this study, we integrated transcriptomic and proteomic signatures from two tissues of two cichlid species, identified nORFs and performed evolutionary analysis on these nORF regions. Our results suggest that the time scale of speciation of the two species and evolutionary divergence of these nORF genomic regions are similar and indicate a potential role for these nORFs in speciation of the cichlid fishes.
Affiliation(s)
- Shraddha Puntambekar
- Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra, 411008, India
| | - Rachel Newhouse
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
| | - Jaime San-Miguel
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
| | - Ruchi Chauhan
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
| | - Grégoire Vernaz
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- The Wellcome Trust/CRUK Gurdon Institute, University of Cambridge, Cambridge, CB2 1QN, UK
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, CB10 1SA, UK
| | - Thomas Willis
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
| | - Matthew T Wayland
- Department of Zoology, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
| | - Yagnesh Umrania
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
| | - Eric A Miska
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- Wellcome Sanger Institute, Wellcome Genome Campus, Cambridge, CB10 1SA, UK
- Cambridge Centre for Proteomics, Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge, CB2 1QR, UK
| | - Sudhakaran Prabakaran
- Department of Biology, Indian Institute of Science Education and Research, Pune, Maharashtra, 411008, India
- Department of Genetics, University of Cambridge, Downing Site, Cambridge, CB2 3EH, UK
- St. Edmund's College, University of Cambridge, Cambridge, CB3 0BN, UK
|
143
|
Wright BW, Ruan J, Molloy MP, Jaschke PR. Genome Modularization Reveals Overlapped Gene Topology Is Necessary for Efficient Viral Reproduction. ACS Synth Biol 2020; 9:3079-3090. [PMID: 33044064 DOI: 10.1021/acssynbio.0c00323] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 02/08/2023]
Abstract
Sequence overlap between two genes is common across all genomes, with viruses having high proportions of these gene overlaps. Genome modularization and refactoring is the process of disrupting natural gene overlaps to separate coding sequences to enable their individual manipulation. The biological function and fitness effects of gene overlaps are not fully understood, and their effects on gene cluster and genome-level refactoring are unknown. The bacteriophage φX174 genome has ∼26% of nucleotides involved in encoding more than one gene. In this study we use an engineered φX174 phage containing a genome with all gene overlaps removed to show that gene overlap is critical to maintaining optimal viral fecundity. Through detailed phenotypic measurements we reveal that genome modularization in φX174 causes virion replication, stability, and attachment deficiencies. Quantitation of the complete phage proteome across an infection cycle reveals 30% of proteins display abnormal expression patterns. Taken together, we have for the first time comprehensively demonstrated that gene modularization severely perturbs the coordinated functioning of a bacteriophage replication cycle. This work highlights the biological importance of gene overlap in natural genomes and that reducing gene overlap disruption should be an integral part of future genome engineering projects.
Affiliation(s)
- Bradley W. Wright
- Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia
| | - Juanfang Ruan
- Electron Microscope Unit, Mark Wainwright Analytical Centre, The University of New South Wales, Sydney, NSW 2052, Australia
- School of Biotechnology and Biomolecular Sciences, The University of New South Wales, Sydney, NSW 2052, Australia
| | - Mark P. Molloy
- Kolling Institute, Northern Clinical School, The University of Sydney, Sydney, NSW 2006, Australia
| | - Paul R. Jaschke
- Department of Molecular Sciences, Macquarie University, Sydney, NSW 2109, Australia
|
144
|
Dowling D, Schmitz JF, Bornberg-Bauer E. Stochastic Gain and Loss of Novel Transcribed Open Reading Frames in the Human Lineage. Genome Biol Evol 2020; 12:2183-2195. [PMID: 33210146 PMCID: PMC7674706 DOI: 10.1093/gbe/evaa194] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Accepted: 09/12/2020] [Indexed: 12/12/2022] Open
Abstract
In addition to known genes, much of the human genome is transcribed into RNA. Chance formation of novel open reading frames (ORFs) can lead to the translation of myriad new proteins. Some of these ORFs may yield advantageous adaptive de novo proteins. However, widespread translation of noncoding DNA can also produce hazardous protein molecules, which can misfold and/or form toxic aggregates. The dynamics of how de novo proteins emerge from potentially toxic raw materials and what influences their long-term survival are unknown. Here, using transcriptomic data from human and five other primates, we generate a set of transcribed human ORFs at six conservation levels to investigate which properties influence the early emergence and long-term retention of these expressed ORFs. As these taxa diverged from each other relatively recently, we present a fine scale view of the evolution of novel sequences over recent evolutionary time. We find that novel human-restricted ORFs are preferentially located on GC-rich gene-dense chromosomes, suggesting their retention is linked to pre-existing genes. Sequence properties such as intrinsic structural disorder and aggregation propensity, which have been proposed to play a role in survival of de novo genes, remain unchanged over time. Even very young sequences code for proteins with low aggregation propensities, suggesting that genomic regions with many novel transcribed ORFs are concomitantly less likely to produce ORFs which code for harmful toxic proteins. Our data indicate that the survival of these novel ORFs is largely stochastic rather than shaped by selection.
Affiliation(s)
- Daniel Dowling
- Institute for Evolution and Biodiversity, University of Münster, Germany
| | - Jonathan F Schmitz
- Institute for Evolution and Biodiversity, University of Münster, Germany
|
145
|
Evolutionary analysis of the Moringa oleifera genome reveals a recent burst of plastid to nucleus gene duplications. Sci Rep 2020; 10:17646. [PMID: 33077763 PMCID: PMC7573628 DOI: 10.1038/s41598-020-73937-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 07/15/2020] [Accepted: 09/21/2020] [Indexed: 12/22/2022] Open
Abstract
It is necessary to identify suitable alternative crops to ensure the nutritional demands of a growing global population. The genome of Moringa oleifera, a fast-growing drought-tolerant orphan crop with highly valuable agronomical, nutritional and pharmaceutical properties, has recently been reported. We model here gene family evolution in Moringa as compared with ten other flowering plant species. Despite the reduced number of genes in the compact Moringa genome, 101 gene families, grouping 957 genes, were found as significantly expanded. Expanded families were highly enriched for chloroplastidic and photosynthetic functions. Indeed, almost half of the genes belonging to Moringa expanded families grouped with their Arabidopsis thaliana plastid encoded orthologs. Microsynteny analysis, together with modeling the distribution of synonymous substitution rates, supported that most plastid duplicated genes originated recently through a burst of simultaneous insertions of large regions of plastid DNA into the nuclear genome. These, together with abundant short insertions of plastid DNA, contributed to the occurrence of massive amounts of plastid DNA in the Moringa nuclear genome, representing 4.71%, the largest reported so far. Our study provides key genetic resources for future breeding programs and highlights the potential of plastid DNA to impact the structure and function of nuclear genes and genomes.
|
146
|
High gene space divergence contrasts with frozen vegetative architecture in the moss family Funariaceae. Mol Phylogenet Evol 2020; 154:106965. [PMID: 32956800 DOI: 10.1016/j.ympev.2020.106965] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 01/10/2020] [Revised: 09/13/2020] [Accepted: 09/14/2020] [Indexed: 11/22/2022]
Abstract
A new paradigm has slowly emerged regarding the diversification of bryophytes, with inferences from molecular data highlighting a dynamic evolution of their genome. However, comparative studies of expressed genes among closely related taxa are so far missing. Here we contrast the dimensions of the vegetative transcriptome of Funaria hygrometrica and Physcomitrium pyriforme against the genome of their relative, Physcomitrium (Physcomitrella) patens. These three species of Funariaceae share highly conserved vegetative bodies, and are partially sympatric, growing on mineral soil in mostly temperate regions. We analyzed the vegetative gametophytic transcriptome of F. hygrometrica and P. pyriforme and mapped short reads, transcripts, and proteins to the genome and gene space of P. patens. Only about half of the transcripts of F. hygrometrica map to their ortholog in P. patens, whereas at least 90% of those of P. pyriforme align to loci in P. patens. Such divergence is unexpected given the high morphological similarity of the gametophyte but reflects the estimated times of divergence of F. hygrometrica and P. pyriforme from P. patens, namely 55 and 20 mya, respectively. The newly sampled transcriptomes bear signatures of at least one, rather ancient, whole genome duplication (WGD), which may be shared with one reported for P. patens. The transcriptomes of F. hygrometrica and P. pyriforme reveal significant contractions or expansions of different gene families. While transcriptomes offer only an incomplete estimate of the gene space, the high number of transcripts obtained suggests a significant divergence in gene sequences and gene number among the three species, indicative of a rather strong, dynamic genome evolution, shaped in part by whole, partial or localized genome duplication. The gene ontology of their specific and rapidly-evolving protein families suggests that the evolution of the Funariaceae may have been driven by the diversification of metabolic genes that may optimize the adaptations to environmental conditions, a hypothesis well in line with ecological patterns in the genetic diversity and structure in seed plants.
|
147
|
Suenaga Y, Nakatani K, Nakagawara A. De novo evolved gene product NCYM in the pathogenesis and clinical outcome of human neuroblastomas and other cancers. Jpn J Clin Oncol 2020; 50:839-846. [PMID: 32577751 DOI: 10.1093/jjco/hyaa097] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 03/21/2020] [Accepted: 06/04/2020] [Indexed: 12/30/2022] Open
Abstract
NCYM is an antisense transcript of MYCN oncogene and promotes tumor progression. NCYM encodes a de novo protein whose open reading frame evolved from noncoding genomic regions in the ancestor of Homininae. Because of its topology, NCYM is always co-amplified with MYCN oncogene, and the mutual regulations between NCYM and MYCN maintain their expressions at high levels in MYCN-amplified tumors. NCYM stabilizes MYCN by inhibiting GSK3β, whereas MYCN stimulates transcription of both NCYM and MYCN. NCYM mRNA and its noncoding transcript variants MYCNOS have been shown to stimulate MYCN expression via direct binding to MYCN promoter, indicating that both coding and noncoding transcripts of NCYM induce MYCN expression. In contrast to the noncoding functions of NCYM, NCYM protein also promotes calpain-mediated cleavage of c-MYC. The cleaved product called Myc-nick inhibits cell death and promotes cancer cell migration. Furthermore, NCYM-mediated inhibition of GSK3β results in the stabilization of β-catenin, which promotes aggressiveness of bladder cancers. These MYCN-independent functions of NCYM showed their clinical significance in MYCN-non-amplified tumors, including adult tumors. This year is the 30th anniversary of the identification of NCYM/MYCNOS gene. On this special occasion, we summarize the current understanding of molecular functions and the clinical significance of NCYM and discuss future directions to achieve therapeutic strategies targeting NCYM.
|
148
|
Wang J, Veronezi GMB, Kang Y, Zagoskin M, O'Toole ET, Davis RE. Comprehensive Chromosome End Remodeling during Programmed DNA Elimination. Curr Biol 2020; 30:3397-3413.e4. [PMID: 32679104 PMCID: PMC7484210 DOI: 10.1016/j.cub.2020.06.058] [Citation(s) in RCA: 34] [Impact Index Per Article: 8.5] [Received: 04/30/2020] [Revised: 06/09/2020] [Accepted: 06/16/2020] [Indexed: 01/14/2023]
Abstract
Germline and somatic genomes are in general the same in a multicellular organism. However, programmed DNA elimination leads to a reduced somatic genome compared to germline cells. Previous work on the parasitic nematode Ascaris demonstrated that programmed DNA elimination encompasses high-fidelity chromosomal breaks and loss of specific genome sequences including a major tandem repeat of 120 bp and ~1,000 germline-expressed genes. However, the precise chromosomal locations of these repeats, break regions, and eliminated genes remained unknown. We used PacBio long-read sequencing and chromosome conformation capture (Hi-C) to obtain fully assembled chromosomes of Ascaris germline and somatic genomes, enabling a complete chromosomal view of DNA elimination. We found that all 24 germline chromosomes undergo comprehensive chromosome end remodeling with DNA breaks in their subtelomeric regions and loss of distal sequences including the telomeres at both chromosome ends. All new Ascaris somatic chromosome ends are recapped by de novo telomere healing. We provide an ultrastructural analysis of Ascaris DNA elimination and show that eliminated DNA is incorporated into double membrane-bound structures, similar to micronuclei, during telophase of a DNA elimination mitosis. These micronuclei undergo dynamic changes including loss of active histone marks and localize to the cytoplasm following daughter nuclei formation and cytokinesis where they form autophagosomes. Comparative analysis of nematode chromosomes suggests that chromosome fusions occurred, forming Ascaris sex chromosomes that become independent chromosomes following DNA elimination breaks in somatic cells. These studies provide the first chromosomal view and define novel features and functions of metazoan programmed DNA elimination.
Affiliation(s)
- Jianbin Wang
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO 80045, USA; RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, CO 80045, USA; Department of Biochemistry and Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA.
| | - Giovana M B Veronezi
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO 80045, USA
| | - Yuanyuan Kang
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO 80045, USA
| | - Maxim Zagoskin
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO 80045, USA
| | - Eileen T O'Toole
- Molecular, Cellular and Developmental Biology, University of Colorado at Boulder, Boulder, CO 80309, USA
| | - Richard E Davis
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO 80045, USA; RNA Bioscience Initiative, University of Colorado School of Medicine, Aurora, CO 80045, USA.
|
149
|
Delihas N. Genesis of Non-Coding RNA Genes in Human Chromosome 22-A Sequence Connection with Protein Genes Separated by Evolutionary Time. Noncoding RNA 2020; 6:E36. [PMID: 32899105 PMCID: PMC7549372 DOI: 10.3390/ncrna6030036] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 07/16/2020] [Revised: 08/17/2020] [Accepted: 09/01/2020] [Indexed: 12/11/2022] Open
Abstract
A small phylogenetically conserved sequence of 11,231 bp, termed FAM247, is repeated in human chromosome 22 by segmental duplications. This sequence forms part of diverse genes that span evolutionary time, the protein genes being the earliest as they are present in zebrafish and/or mice genomes, and the long noncoding RNA genes and pseudogenes the most recent as they appear to be present only in the human genome. We propose that the conserved sequence provides a nucleation site for new gene development at evolutionarily conserved chromosomal loci where the FAM247 sequences reside. The FAM247 sequence also carries information in its open reading frames that provides protein exon amino acid sequences; one exon plays an integral role in immune system regulation, specifically, the function of ubiquitin-specific protease (USP18) in the regulation of interferon. An analysis of this multifaceted sequence and the genesis of genes that contain it is presented.
Affiliation(s)
- Nicholas Delihas
- Department of Microbiology and Immunology, Renaissance School of Medicine, Stony Brook University, Stony Brook, New York, NY 11794-5222, USA
| |
Collapse
|
150
|
Douglas AE. Housing microbial symbionts: evolutionary origins and diversification of symbiotic organs in animals. Philos Trans R Soc Lond B Biol Sci 2020; 375:20190603. [PMID: 32772661 DOI: 10.1098/rstb.2019.0603] [Citation(s) in RCA: 20] [Impact Index Per Article: 5.0] [Indexed: 12/18/2022] Open
Abstract
In many animal hosts, microbial symbionts are housed within specialized structures known as symbiotic organs, but the evolutionary origins of these structures have rarely been investigated. Here, I adopt an evolutionary developmental (evo-devo) approach, specifically to apply knowledge of the development of symbiotic organs to gain insights into their evolutionary origins and diversification. In particular, host genetic changes associated with evolution of symbiotic organs can be inferred from studies to identify the host genes that orchestrate the development of symbiotic organs, recognizing that microbial products may also play a key role in triggering the developmental programme in some associations. These studies may also reveal whether higher animal taxonomic groups (order, class, phylum, etc.) possess a common genetic regulatory network for symbiosis that is latent in taxa lacking symbiotic organs, and activated at the origination of symbiosis in different host lineages. In this way, apparent instances of convergent evolution of symbiotic organs may be homologous in terms of a common genetic blueprint for symbiosis. Advances in genetic technologies, including reverse genetic tools and genome editing, will facilitate the application of evo-devo approaches to investigate the evolution of symbiotic organs in animals. This article is part of the theme issue 'The role of the microbiome in host evolution'.
Affiliation(s)
- Angela E Douglas
- Department of Entomology and Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853, USA
|