51
|
González D, Morales-Olavarria M, Vidal-Veuthey B, Cárdenas JP. Insights into early evolutionary adaptations of the Akkermansia genus to the vertebrate gut. Front Microbiol 2023; 14:1238580. [PMID: 37779688 PMCID: PMC10540074 DOI: 10.3389/fmicb.2023.1238580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Accepted: 08/21/2023] [Indexed: 10/03/2023] Open
Abstract
Akkermansia, a relevant mucin degrader from the vertebrate gut microbiota, is a member of the deeply branched Verrucomicrobiota, as well as the only known member of this phylum to be described as an inhabitant of the gut. Only a few Akkermansia species have been officially described so far, although there is genomic evidence for the existence of more species-level variants in this genus. This niche specialization makes Akkermansia an interesting model for studying how microorganisms adapt to the gastrointestinal tract environment, including which functions were gained when the Akkermansia genus originated and how evolutionary pressure acts on those genes. To gain more insight into Akkermansia adaptations to the gastrointestinal tract niche, we performed a phylogenomic analysis of 367 high-quality Akkermansia isolates and metagenome-assembled genomes, in addition to other members of Verrucomicrobiota. This work focused on three aspects: the definition of Akkermansia genomic species clusters and the calculation and functional characterization of the pangenome for the most represented species; the evolutionary relationship between Akkermansia and their closest relatives from Verrucomicrobiota, defining the gene families that were gained or lost during the emergence of the last Akkermansia common ancestor (LAkkCA); and the evaluation of evolutionary pressure metrics for each relevant gene family of the main Akkermansia species. This analysis found 25 Akkermansia genomic species clusters distributed in two main clades, divergent from their non-Akkermansia relatives. Pangenome analyses suggest that Akkermansia species have open pangenomes, and the gene gain/loss model indicates that genes associated with mucin degradation (both glycoside hydrolases and peptidases), (micro)aerobic metabolism, surface interaction, and adhesion were part of LAkkCA. Specifically, mucin degradation is a very ancestral innovation involved in the origin of Akkermansia. Horizontal gene transfer detection suggests that Akkermansia could receive genes mostly from unknown sources or from other Gram-negative gut bacteria. Evolutionary metrics suggest that Akkermansia species evolved differently, and even some conserved genes experienced different evolutionary pressures among clades. These results suggest a complex evolutionary landscape of the genus and indicate that mucin degradation could be an essential feature in Akkermansia evolution as a symbiotic species.
Affiliation(s)
- Dámariz González
- Centro de Genómica y Bioinformática, Facultad de Ciencias, Ingeniería y Tecnología, Universidad Mayor, Santiago, Chile
| | - Mauricio Morales-Olavarria
- Centro de Genómica y Bioinformática, Facultad de Ciencias, Ingeniería y Tecnología, Universidad Mayor, Santiago, Chile
| | - Boris Vidal-Veuthey
- Centro de Genómica y Bioinformática, Facultad de Ciencias, Ingeniería y Tecnología, Universidad Mayor, Santiago, Chile
| | - Juan P. Cárdenas
- Centro de Genómica y Bioinformática, Facultad de Ciencias, Ingeniería y Tecnología, Universidad Mayor, Santiago, Chile
- Escuela de Biotecnología, Facultad de Ciencias, Ingeniería y Tecnología, Universidad Mayor, Santiago, Chile
|
52
|
Fuselli S, Greco S, Biello R, Palmitessa S, Lago M, Meneghetti C, McDougall C, Trucchi E, Rota Stabelli O, Biscotti AM, Schmidt DJ, Roberts DT, Espinoza T, Hughes JM, Ometto L, Gerdol M, Bertorelle G. Relaxation of Natural Selection in the Evolution of the Giant Lungfish Genomes. Mol Biol Evol 2023; 40:msad193. [PMID: 37671664 PMCID: PMC10503785 DOI: 10.1093/molbev/msad193] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 04/01/2023] [Revised: 07/16/2023] [Accepted: 09/04/2023] [Indexed: 09/07/2023] Open
Abstract
Nonadaptive hypotheses on the evolution of eukaryotic genome size predict an expansion when the process of purifying selection becomes weak. Accordingly, species with huge genomes, such as lungfish, are expected to show a genome-wide relaxation signature of selection compared with other organisms. However, few studies have empirically tested this prediction using genomic data in a comparative framework. Here, we show that 1) the newly assembled transcriptome of the Australian lungfish, Neoceratodus forsteri, is characterized by an excess of pervasive transcription, or transcriptional leakage, possibly due to suboptimal transcriptional control, and 2) a significant relaxation signature in coding genes in lungfish species compared with other vertebrates. Based on these observations, we propose that the largest known animal genomes evolved in a nearly neutral scenario where genome expansion is less efficiently constrained.
Affiliation(s)
- Silvia Fuselli
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
| | - Samuele Greco
- Department of Life Sciences, University of Trieste, Trieste, Italy
| | - Roberto Biello
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
| | | | - Marta Lago
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
| | - Corrado Meneghetti
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
| | - Carmel McDougall
- Australian Rivers Institute, Griffith University, Brisbane, Queensland, Australia
| | - Emiliano Trucchi
- Department of Life and Environmental Sciences, Marche Polytechnic University, Ancona, Italy
| | - Omar Rota Stabelli
- Research and Innovation Centre, Fondazione Edmund Mach, 38010 San Michele all’Adige, Italy
- Center Agriculture Food Environment, University of Trento, 38010 San Michele all'Adige, Italy
| | - Assunta Maria Biscotti
- Department of Life and Environmental Sciences, Marche Polytechnic University, Ancona, Italy
| | - Daniel J Schmidt
- Australian Rivers Institute, Griffith University, Brisbane, Queensland, Australia
| | | | | | - Jane Margaret Hughes
- Australian Rivers Institute, Griffith University, Brisbane, Queensland, Australia
| | - Lino Ometto
- Department of Biology and Biotechnology, University of Pavia, Pavia, Italy
| | - Marco Gerdol
- Department of Life Sciences, University of Trieste, Trieste, Italy
| | - Giorgio Bertorelle
- Department of Life Sciences and Biotechnology, University of Ferrara, Ferrara, Italy
|
53
|
Menger FM, Rizvi SAA. Preassembly Theory Invoking Prehistoric DNA Alterations. World Futures 2023; 79:635-646. [DOI: 10.1080/02604027.2023.2226594] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/16/2023]
|
54
|
Meier-Credo J, Heiniger B, Schori C, Rupprecht F, Michel H, Ahrens CH, Langer JD. Detection of Known and Novel Small Proteins in Pseudomonas stutzeri Using a Combination of Bottom-Up and Digest-Free Proteomics and Proteogenomics. Anal Chem 2023; 95:11892-11900. [PMID: 37535005 PMCID: PMC10433244 DOI: 10.1021/acs.analchem.3c00676] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/14/2023] [Accepted: 07/24/2023] [Indexed: 08/04/2023]
Abstract
Small proteins of around 50 aa in length have been largely overlooked in genetic and biochemical assays due to the inherent challenges with detecting and characterizing them. Recent discoveries of their critical roles in many biological processes have led to an increased recognition of the importance of small proteins for basic research and as potential new drug targets. One example is CcoM, a 36 aa subunit of the cbb3-type oxidase that plays an essential role in adaptation to oxygen-limited conditions in Pseudomonas stutzeri (P. stutzeri), a model for the clinically relevant, opportunistic pathogen Pseudomonas aeruginosa. However, as no comprehensive data were available in P. stutzeri, we devised an integrated, generic approach to study small proteins more systematically. Using the first complete genome as basis, we conducted bottom-up proteomics analyses and established a digest-free, direct-sequencing proteomics approach to study cells grown under aerobic and oxygen-limiting conditions. Finally, we also applied a proteogenomics pipeline to identify missed protein-coding genes. Overall, we identified 2921 known and 29 novel proteins, many of which were differentially regulated. Among 176 small proteins 16 were novel. Direct sequencing, featuring a specialized precursor acquisition scheme, exhibited advantages in the detection of small proteins with higher (up to 100%) sequence coverage and more spectral counts, including sequences with high proline content. Three novel small proteins, uniquely identified by direct sequencing and not conserved beyond P. stutzeri, were predicted to form an operon with a conserved protein and may represent de novo genes. These data demonstrate the power of this combined approach to study small proteins in P. stutzeri and show its potential for other prokaryotes.
Affiliation(s)
- Jakob Meier-Credo
- Proteomics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany
| | - Benjamin Heiniger
- Molecular Ecology, Agroscope & SIB Swiss Institute of Bioinformatics, 8046 Zürich, Switzerland
| | - Christian Schori
- Molecular Ecology, Agroscope & SIB Swiss Institute of Bioinformatics, 8046 Zürich, Switzerland
| | - Fiona Rupprecht
- Proteomics, Max Planck Institute for Brain Research, 60438 Frankfurt am Main, Germany
| | - Hartmut Michel
- Department of Molecular Membrane Biology, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany
| | - Christian H. Ahrens
- Molecular Ecology, Agroscope & SIB Swiss Institute of Bioinformatics, 8046 Zürich, Switzerland
| | - Julian D. Langer
- Proteomics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany
- Proteomics, Max Planck Institute for Brain Research, 60438 Frankfurt am Main, Germany
|
55
|
Lombardo KD, Sheehy HK, Cridland JM, Begun DJ. Identifying candidate de novo genes expressed in the somatic female reproductive tract of Drosophila melanogaster. G3 (Bethesda) 2023; 13:jkad122. [PMID: 37259569 PMCID: PMC10411569 DOI: 10.1093/g3journal/jkad122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/28/2023] [Revised: 05/18/2023] [Accepted: 05/22/2023] [Indexed: 06/02/2023]
Abstract
Most eukaryotic genes have been vertically transmitted to the present from distant ancestors. However, variable gene number across species indicates that gene gain and loss also occur. While new genes typically originate as products of duplications and rearrangements of preexisting genes, putative de novo genes (genes born out of ancestrally nongenic sequence) have been identified. Previous studies of de novo genes in Drosophila have provided evidence that expression in male reproductive tissues is common. However, no studies have focused on female reproductive tissues. Here we begin addressing this gap in the literature by analyzing the transcriptomes of 3 female reproductive tract organs (spermatheca, seminal receptacle, and parovaria) in 3 species: our focal species, Drosophila melanogaster, and 2 closely related species, Drosophila simulans and Drosophila yakuba, with the goal of identifying putative D. melanogaster-specific de novo genes expressed in these tissues. We discovered several candidate genes, located in sequence annotated as intergenic. Consistent with the literature, these genes tend to be short, single exon, and lowly expressed. We also find evidence that some of these genes are expressed in other D. melanogaster tissues and both sexes. The relatively small number of intergenic candidate genes discovered here is similar to that observed in the accessory gland, but substantially fewer than that observed in the testis.
Affiliation(s)
- Kaelina D Lombardo
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
| | - Hayley K Sheehy
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
| | - Julie M Cridland
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
| | - David J Begun
- Department of Evolution and Ecology, University of California Davis, Davis, CA 95616, USA
|
56
|
Yocca AE, Platts A, Alger E, Teresi S, Mengist MF, Benevenuto J, Ferrão LFV, Jacobs M, Babinski M, Magallanes-Lundback M, Bayer P, Golicz A, Humann JL, Main D, Espley RV, Chagné D, Albert NW, Montanari S, Vorsa N, Polashock J, Díaz-Garcia L, Zalapa J, Bassil NV, Munoz PR, Iorizzo M, Edger PP. Blueberry and cranberry pangenomes as a resource for future genetic studies and breeding efforts. bioRxiv 2023:2023.07.31.551392. [PMID: 37577683 PMCID: PMC10418200 DOI: 10.1101/2023.07.31.551392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
Domestication of cranberry and blueberry began in the United States in the early 1800s and 1900s, respectively, and, owing in part to their flavors and health-promoting benefits, both are now cultivated and consumed worldwide. The industry continues to face a wide variety of production challenges (e.g. disease pressures) as well as a demand for higher-yielding cultivars with improved fruit quality characteristics. Unfortunately, molecular tools to help guide breeding efforts for these species have been relatively limited compared with those for other high-value crops. Here, we describe the construction and analysis of the first pangenome for both blueberry and cranberry. Our analysis of these pangenomes revealed that both crops exhibit great genetic diversity, including presence-absence variation in 48.4% of genes in highbush blueberry and 47.0% of genes in cranberry. Auxiliary genes, those not shared by all cultivars, are significantly enriched with molecular functions associated with disease resistance and the biosynthesis of specialized metabolites, including compounds previously associated with improving fruit quality traits. The discovery of thousands of genes not present in the previous reference genomes for blueberry and cranberry will serve as the basis of future research and as potential targets for future breeding efforts. The pangenome, as a multiple-sequence alignment, as well as individual annotated genomes, are publicly available for analysis on the Genome Database for Vaccinium, a curated and integrated web-based relational database. Lastly, the core-gene predictions from the pangenomes will be useful for developing a community genotyping platform to guide future molecular breeding efforts across the family.
Affiliation(s)
- Alan E. Yocca
- Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
| | - Adrian Platts
- Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA
- Department of Plant Biology, Michigan State University, East Lansing, MI, 48824, USA
| | - Elizabeth Alger
- Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA
| | - Scott Teresi
- Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA
- Genetics and Genome Sciences, Michigan State University, East Lansing, MI, 48824, USA
| | - Molla F. Mengist
- Plants for Human Health Institute, North Carolina State University, Kannapolis, NC USA
| | - Juliana Benevenuto
- Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
| | - Luis Felipe V. Ferrão
- Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
| | - MacKenzie Jacobs
- Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, 48824, USA
| | - Michal Babinski
- Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA
| | | | - Philipp Bayer
- University of Western Australia, Perth 6009 Australia
| | | | - Jodi L Humann
- Department of Horticulture, Washington State University, Pullman, WA, 99163, USA
| | - Dorrie Main
- Department of Horticulture, Washington State University, Pullman, WA, 99163, USA
| | - Richard V. Espley
- The New Zealand Institute for Plant and Food Research Limited (PFR), Auckland, New Zealand
| | - David Chagné
- The New Zealand Institute for Plant and Food Research Limited (PFR), Palmerston, New Zealand
| | - Nick W. Albert
- The New Zealand Institute for Plant and Food Research Limited (PFR), Palmerston, New Zealand
| | - Sara Montanari
- The New Zealand Institute for Plant and Food Research Limited (PFR), Motueka, New Zealand
| | - Nicholi Vorsa
- SEBS, Plant Biology, Rutgers University, New Brunswick NJ 01019 USA
| | - James Polashock
- SEBS, Plant Biology, Rutgers University, New Brunswick NJ 01019 USA
| | - Luis Díaz-Garcia
- USDA-ARS, VCRU, Department of Horticulture, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Juan Zalapa
- USDA-ARS, VCRU, Department of Horticulture, University of Wisconsin-Madison, Madison, WI 53706, USA
| | - Nahla V. Bassil
- USDA-ARS, National Clonal Germplasm Repository, Corvallis, OR 97333, USA
| | - Patricio R. Munoz
- Horticultural Sciences Department, University of Florida, Gainesville, FL 32611, USA
| | - Massimo Iorizzo
- Plants for Human Health Institute, North Carolina State University, Kannapolis, NC USA
- Department of Horticulture, North Carolina State University, Kannapolis, NC USA
| | - Patrick P. Edger
- Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA
- Genetics and Genome Sciences, Michigan State University, East Lansing, MI, 48824, USA
- MSU AgBioResearch, Michigan State University, East Lansing, MI, 48824, USA
|
57
|
Athanasouli M, Akduman N, Röseler W, Theam P, Rödelsperger C. Thousands of Pristionchus pacificus orphan genes were integrated into developmental networks that respond to diverse environmental microbiota. PLoS Genet 2023; 19:e1010832. [PMID: 37399201 DOI: 10.1371/journal.pgen.1010832] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/02/2023] [Accepted: 06/15/2023] [Indexed: 07/05/2023] Open
Abstract
Adaptation of organisms to environmental change may be facilitated by the creation of new genes. New genes without homologs in other lineages are known as taxonomically-restricted orphan genes and may result from divergence or de novo formation. Previously, we have extensively characterized the evolution and origin of such orphan genes in the nematode model organism Pristionchus pacificus. Here, we employ large-scale transcriptomics to establish potential functional associations and to measure the degree of transcriptional plasticity among orphan genes. Specifically, we analyzed 24 RNA-seq samples from adult P. pacificus worms raised on 24 different monoxenic bacterial cultures. Based on coexpression analysis, we identified 28 large modules that harbor 3,727 diplogastrid-specific orphan genes and that respond dynamically to different bacteria. These coexpression modules have distinct regulatory architecture and also exhibit differential expression patterns across development, suggesting a link between bacterial response networks and development. Phylostratigraphy revealed a considerably high number of family- and even species-specific orphan genes in certain coexpression modules. This suggests that new genes are not attached randomly to existing cellular networks and that integration can happen very fast. Integrative analysis of protein domains, gene expression and ortholog data facilitated the assignment of biological labels for 22 coexpression modules, with one of the largest fast-evolving modules being associated with spermatogenesis. In summary, this work presents the first functional annotation for thousands of P. pacificus orphan genes and reveals insights into their integration into environmentally responsive gene networks.
Affiliation(s)
- Marina Athanasouli
- Department for Integrative Evolutionary Biology, Max Planck Institute for Biology, Tübingen, Germany
| | - Nermin Akduman
- Department for Integrative Evolutionary Biology, Max Planck Institute for Biology, Tübingen, Germany
| | - Waltraud Röseler
- Department for Integrative Evolutionary Biology, Max Planck Institute for Biology, Tübingen, Germany
| | - Penghieng Theam
- Department for Integrative Evolutionary Biology, Max Planck Institute for Biology, Tübingen, Germany
| | - Christian Rödelsperger
- Department for Integrative Evolutionary Biology, Max Planck Institute for Biology, Tübingen, Germany
|
58
|
Peng J, Zhao L. The origin and structural evolution of de novo genes in Drosophila. bioRxiv 2023:2023.03.13.532420. [PMID: 37425675 PMCID: PMC10326970 DOI: 10.1101/2023.03.13.532420] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 07/11/2023]
Abstract
Although previously thought to be unlikely, recent studies have shown that de novo gene origination from previously non-genic sequences is a relatively common mechanism for gene innovation in many species and taxa. These young genes provide a unique set of candidates to study the structural and functional origination of proteins. However, our understanding of their protein structures and how these structures originate and evolve is still limited, due to a lack of systematic studies. Here, we combined high-quality base-level whole genome alignments, bioinformatic analysis, and computational structure modeling to study the origination, evolution, and protein structure of lineage-specific de novo genes. We identified 555 de novo gene candidates in D. melanogaster that originated within the Drosophilinae lineage. We found a gradual shift in sequence composition, evolutionary rates, and expression patterns with their gene ages, which indicates possible gradual shifts or adaptations of their functions. Surprisingly, we found little overall protein structural change for de novo genes in the Drosophilinae lineage. Using Alphafold2, ESMFold, and molecular dynamics, we identified a number of de novo gene candidates with protein products that are potentially well-folded, many of which are more likely to contain transmembrane and signal proteins compared to other annotated protein-coding genes. Using ancestral sequence reconstruction, we found that most potentially well-folded proteins are often born folded. Interestingly, we observed one case where disordered ancestral proteins become ordered within a relatively short evolutionary time. Single-cell RNA-seq analysis in testis showed that although most de novo genes are enriched in spermatocytes, several young de novo genes are biased in the early spermatogenesis stage, indicating potentially important but less emphasized roles of early germline cells in the de novo gene origination in testis. This study provides a systematic overview of the origin, evolution, and structural changes of Drosophilinae-specific de novo genes.
Affiliation(s)
- Junhui Peng
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
| | - Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
|
59
|
Ardern Z, Uz-Zaman MH. Between noise and function: Toward a taxonomy of the non-canonical translatome. Cell Syst 2023; 14:343-345. [PMID: 37201506 DOI: 10.1016/j.cels.2023.04.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/17/2023] [Accepted: 04/17/2023] [Indexed: 05/20/2023]
Abstract
Eukaryotic genomes are pervasively translated, but the properties of translated sequences outside of canonical genes are poorly understood. A new study in Cell Systems reveals a large translatome that is not under significant evolutionary constraint but is still an active part of diverse cellular systems.
Affiliation(s)
- Zachary Ardern
- Parasites and Microbes Programme, Wellcome Sanger Institute, Hinxton, Cambridgeshire, UK.
| | - Md Hassan Uz-Zaman
- Department of Molecular Biosciences, University of Texas at Austin, Austin, TX, USA.
|
60
|
Wacholder A, Parikh SB, Coelho NC, Acar O, Houghton C, Chou L, Carvunis AR. A vast evolutionarily transient translatome contributes to phenotype and fitness. Cell Syst 2023; 14:363-381.e8. [PMID: 37164009 PMCID: PMC10348077 DOI: 10.1016/j.cels.2023.04.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 08/18/2022] [Revised: 01/30/2023] [Accepted: 04/06/2023] [Indexed: 05/12/2023]
Abstract
Translation is the process by which ribosomes synthesize proteins. Ribosome profiling recently revealed that many short sequences previously thought to be noncoding are pervasively translated. To identify protein-coding genes in this noncanonical translatome, we combine an integrative framework for extremely sensitive ribosome profiling analysis, iRibo, with high-powered selection inferences tailored for short sequences. We construct a reference translatome for Saccharomyces cerevisiae comprising 5,400 canonical and almost 19,000 noncanonical translated elements. Only 14 noncanonical elements were evolving under detectable purifying selection. A representative subset of translated elements lacking signatures of selection demonstrated involvement in processes including DNA repair, stress response, and post-transcriptional regulation. Our results suggest that most translated elements are not conserved protein-coding genes and contribute to genotype-phenotype relationships through fast-evolving molecular mechanisms.
Affiliation(s)
- Aaron Wacholder
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
| | - Saurin Bipin Parikh
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Integrative Systems Biology Program, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
| | - Nelson Castilho Coelho
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
| | - Omer Acar
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Joint CMU-Pitt PhD Program in Computational Biology, University of Pittsburgh, Pittsburgh, PA 15213, USA
| | - Carly Houghton
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Joint CMU-Pitt PhD Program in Computational Biology, University of Pittsburgh, Pittsburgh, PA 15213, USA
| | - Lin Chou
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Integrative Systems Biology Program, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA
| | - Anne-Ruxandra Carvunis
- Department of Computational and Systems Biology, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA; Pittsburgh Center for Evolutionary Biology and Medicine, School of Medicine, University of Pittsburgh, Pittsburgh, PA 15213, USA.
|
61
|
Lombardo KD, Sheehy HK, Cridland JM, Begun DJ. Identifying candidate de novo genes expressed in the somatic female reproductive tract of Drosophila melanogaster. bioRxiv 2023:2023.05.03.539262. [PMID: 37205537 PMCID: PMC10187257 DOI: 10.1101/2023.05.03.539262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/21/2023]
Abstract
Most eukaryotic genes have been vertically transmitted to the present from distant ancestors. However, variable gene number across species indicates that gene gain and loss also occur. While new genes typically originate as products of duplications and rearrangements of pre-existing genes, putative de novo genes (genes born out of previously non-genic sequence) have been identified. Previous studies of de novo genes in Drosophila have provided evidence that expression in male reproductive tissues is common. However, no studies have focused on female reproductive tissues. Here we begin addressing this gap in the literature by analyzing the transcriptomes of three female reproductive tract organs (spermatheca, seminal receptacle, and parovaria) in three species: our focal species, D. melanogaster, and two closely related species, D. simulans and D. yakuba, with the goal of identifying putative D. melanogaster-specific de novo genes expressed in these tissues. We discovered several candidate genes, which, consistent with the literature, tend to be short, simple, and lowly expressed. We also find evidence that some of these genes are expressed in other D. melanogaster tissues and both sexes. The relatively small number of candidate genes discovered here is similar to that observed in the accessory gland, but substantially fewer than that observed in the testis.
Affiliation(s)
- Kaelina D Lombardo
- Department of Evolution and Ecology, University of California, Davis CA 95616
| | - Hayley K Sheehy
- Department of Evolution and Ecology, University of California, Davis CA 95616
| | - Julie M Cridland
- Department of Evolution and Ecology, University of California, Davis CA 95616
| | - David J Begun
- Department of Evolution and Ecology, University of California, Davis CA 95616
| |
Collapse
|
62
|
Saeki N, Yamamoto C, Eguchi Y, Sekito T, Shigenobu S, Yoshimura M, Yashiroda Y, Boone C, Moriya H. Overexpression profiling reveals cellular requirements in the context of genetic backgrounds and environments. PLoS Genet 2023; 19:e1010732. [PMID: 37115757 PMCID: PMC10171610 DOI: 10.1371/journal.pgen.1010732] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2022] [Revised: 05/10/2023] [Accepted: 04/04/2023] [Indexed: 04/29/2023] Open
Abstract
Overexpression can help life adapt to stressful environments, making an examination of overexpressed genes valuable for understanding stress tolerance mechanisms. However, a systematic study of genes whose overexpression is functionally adaptive (GOFAs) under stress has yet to be conducted. We developed a new overexpression profiling method and systematically identified GOFAs in Saccharomyces cerevisiae under stress (heat, salt, and oxidative). Our results show that adaptive overexpression compensates for deficiencies and increases fitness under stress, like calcium under salt stress. We also investigated the impact of different genetic backgrounds on GOFAs, which varied among three S. cerevisiae strains reflecting differing calcium and potassium requirements for salt stress tolerance. Our study of a knockout collection also suggested that calcium prevents mitochondrial outbursts under salt stress. Mitochondria-enhancing GOFAs were only adaptive when adequate calcium was available and non-adaptive when calcium was deficient, supporting this idea. Our findings indicate that adaptive overexpression meets the cell's needs for maximizing the organism's adaptive capacity in the given environment and genetic context.
Affiliation(s)
- Nozomu Saeki
  - Graduate School of Environmental and Life Science, Okayama University, Okayama, Japan
- Chie Yamamoto
  - Graduate School of Environmental and Life Science, Okayama University, Okayama, Japan
- Yuichi Eguchi
  - Biomedical Business Center, RICOH Futures BU, Kanagawa, Japan
- Takayuki Sekito
  - Graduate School of Agriculture, Ehime University, Matsuyama, Japan
- Mami Yoshimura
  - RIKEN Center for Sustainable Resource Science, Wako, Japan
- Yoko Yashiroda
  - RIKEN Center for Sustainable Resource Science, Wako, Japan
- Charles Boone
  - RIKEN Center for Sustainable Resource Science, Wako, Japan
  - Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, Canada
- Hisao Moriya
  - Faculty of Environmental, Life, Natural Science and Technology, Okayama University, Okayama, Japan
|
63
|
Crespo-Bellido A, Duffy S. The how of counter-defense: viral evolution to combat host immunity. Curr Opin Microbiol 2023; 74:102320. [PMID: 37075547 DOI: 10.1016/j.mib.2023.102320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2023] [Revised: 03/10/2023] [Accepted: 03/23/2023] [Indexed: 04/21/2023]
Abstract
Viruses are locked in an evolutionary arms race with their hosts. What ultimately determines viral evolvability, or capacity for adaptive evolution, is their ability to efficiently explore and expand sequence space while under the selective regime imposed by their ecology, which includes innate and adaptive host defenses. Viral genomes have significantly higher evolutionary rates than their host counterparts and should have advantages relative to their slower-evolving hosts. However, functional constraints on virus evolutionary landscapes along with the modularity and mutational tolerance of host defense proteins may help offset the advantage conferred to viruses by high evolutionary rates. Additionally, cellular life forms from all domains of life possess many highly complex defense mechanisms that act as hurdles to viral replication. Consequently, viruses constantly probe sequence space through mutation and genetic exchange and are under pressure to optimize diverse counter-defense strategies.
Affiliation(s)
- Alvin Crespo-Bellido
  - Department of Ecology, Evolution and Natural Resources, School of Environmental and Biological Sciences, Rutgers, the State University of New Jersey, New Brunswick, NJ, USA
- Siobain Duffy
  - Department of Ecology, Evolution and Natural Resources, School of Environmental and Biological Sciences, Rutgers, the State University of New Jersey, New Brunswick, NJ, USA
|
64
|
Iyengar BR, Bornberg-Bauer E. Neutral Models of De Novo Gene Emergence Suggest that Gene Evolution has a Preferred Trajectory. Mol Biol Evol 2023; 40:msad079. [PMID: 37011142 PMCID: PMC10118301 DOI: 10.1093/molbev/msad079] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/07/2023] [Revised: 03/01/2023] [Accepted: 03/28/2023] [Indexed: 04/05/2023] Open
Abstract
New protein coding genes can emerge from genomic regions that previously did not contain any genes, via a process called de novo gene emergence. To synthesize a protein, DNA must be transcribed as well as translated. Both processes need certain DNA sequence features. Stable transcription requires promoters and a polyadenylation signal, while translation requires at least an open reading frame. We develop mathematical models based on mutation probabilities, and the assumption of neutral evolution, to find out how quickly genes emerge and are lost. We also investigate the effect of the order by which DNA features evolve, and if sequence composition is biased by mutation rate. We rationalize how genes are lost much more rapidly than they emerge, and how they preferentially arise in regions that are already transcribed. Our study not only answers some fundamental questions on the topic of de novo emergence but also provides a modeling framework for future studies.
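The asymmetry between gain and loss described in this abstract can be illustrated with a toy calculation. The sketch below is not the authors' model: it assumes a uniform per-site substitution rate `mu` (each base changes to each of the three alternatives at rate `mu / 3`) and considers only a single sequence feature, the start codon ATG. The function names are hypothetical and chosen for illustration only.

```python
from itertools import product

BASES = "ACGT"
START = "ATG"  # the only methionine codon in the standard genetic code


def single_mutants(codon):
    """Return the nine codons exactly one substitution away from `codon`."""
    return [
        codon[:i] + alt + codon[i + 1:]
        for i, base in enumerate(codon)
        for alt in BASES
        if alt != base
    ]


def loss_rate(mu):
    """Rate at which an existing ATG is destroyed (assumed uniform model).

    Every single substitution at any of the three sites removes the start
    codon, so all nine one-step mutants contribute mu / 3 each.
    """
    return sum(mu / 3 for _ in single_mutants(START))


def gain_rate(codon, mu):
    """Rate at which `codon` becomes ATG through one substitution."""
    return sum(mu / 3 for mutant in single_mutants(codon) if mutant == START)


def mean_gain_rate(mu):
    """Average one-step gain rate over all 63 non-ATG codons."""
    others = ["".join(c) for c in product(BASES, repeat=3) if "".join(c) != START]
    return sum(gain_rate(c, mu) for c in others) / len(others)


if __name__ == "__main__":
    mu = 1e-8  # hypothetical per-site substitution rate, for illustration
    print("loss rate of ATG:", loss_rate(mu))
    print("gain rate from ATA:", gain_rate("ATA", mu))
    print("mean gain rate from a random codon:", mean_gain_rate(mu))
    print("loss / mean gain:", loss_rate(mu) / mean_gain_rate(mu))
```

Under these assumptions an existing start codon is lost at rate `3 * mu`, while a random non-ATG codon gains one at an average rate of `mu / 21`, so loss is about 63 times faster than gain for this one feature alone. A fuller treatment would also have to account for promoters, polyadenylation signals, and stop codons, which the abstract names as required features.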
Affiliation(s)
- Bharat Ravi Iyengar
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
- Erich Bornberg-Bauer
  - Institute for Evolution and Biodiversity, University of Münster, Münster, Germany
  - Department of Protein Evolution, Max Planck Institute for Biology Tübingen, Tübingen, Germany
|
65
|
Papadopoulos C, Albà MM. Newly evolved genes in the human lineage are functional. Trends Genet 2023; 39:235-236. [PMID: 36774242 DOI: 10.1016/j.tig.2023.02.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2023] [Accepted: 02/02/2023] [Indexed: 02/12/2023]
Abstract
Genes restricted to a given species or lineage are mysterious. Many emerged de novo from ancestral noncoding genomic regions rather than from pre-existing genes. A new study by Vakirlis and colleagues shows that, in humans, many of these are associated with phenotypic effects, accelerating our understanding of their functional importance.
Affiliation(s)
- Chris Papadopoulos
  - Evolutionary Genomics Group, Hospital del Mar Medical Research Institute (IMIM), Barcelona 08003, Spain
- M Mar Albà
  - Evolutionary Genomics Group, Hospital del Mar Medical Research Institute (IMIM), Barcelona 08003, Spain
|
66
|
Aubel M, Eicholt L, Bornberg-Bauer E. Assessing structure and disorder prediction tools for de novo emerged proteins in the age of machine learning. F1000Res 2023; 12:347. [PMID: 37113259 PMCID: PMC10126731 DOI: 10.12688/f1000research.130443.1] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Accepted: 03/17/2023] [Indexed: 03/31/2023] Open
Abstract
Background: De novo protein coding genes emerge from scratch in the non-coding regions of the genome and have, by definition, no homology to other genes. Therefore, their encoded de novo proteins belong to the so-called "dark protein space". So far, only four de novo protein structures have been experimentally approximated. Low homology, presumed high disorder and limited structures result in low confidence structural predictions for de novo proteins in most cases. Here, we look at the most widely used structure and disorder predictors and assess their applicability for de novo emerged proteins. Since AlphaFold2 is based on the generation of multiple sequence alignments and was trained on solved structures of largely conserved and globular proteins, its performance on de novo proteins remains unknown. More recently, natural language models of proteins have been used for alignment-free structure predictions, potentially making them more suitable for de novo proteins than AlphaFold2.
Methods: We applied different disorder predictors (IUPred3 short/long, flDPnn) and structure predictors, AlphaFold2 on the one hand and language-based models (Omegafold, ESMfold, RGN2) on the other hand, to four de novo proteins with experimental evidence on structure. We compared the resulting predictions between the different predictors as well as to the existing experimental evidence.
Results: Results from IUPred, the most widely used disorder predictor, depend heavily on the choice of parameters and differ significantly from flDPnn, which was recently found to outperform most other predictors in a comparative assessment study. Similarly, different structure predictors yielded varying results and confidence scores for de novo proteins.
Conclusions: We suggest that, while in some cases protein language model based approaches might be more accurate than AlphaFold2, the structure prediction of de novo emerged proteins remains a difficult task for any predictor, be it disorder or structure.
Affiliation(s)
- Margaux Aubel
  - Institute for Evolution and Biodiversity, University of Muenster, Muenster, 48149, Germany
- Lars Eicholt
  - Institute for Evolution and Biodiversity, University of Muenster, Muenster, 48149, Germany
- Erich Bornberg-Bauer
  - Institute for Evolution and Biodiversity, University of Muenster, Muenster, 48149, Germany
  - Department Protein Evolution, Max Planck-Institute for Biology, Tuebingen, 72076, Germany
|
67
|
Sandmann CL, Schulz JF, Ruiz-Orera J, Kirchner M, Ziehm M, Adami E, Marczenke M, Christ A, Liebe N, Greiner J, Schoenenberger A, Muecke MB, Liang N, Moritz RL, Sun Z, Deutsch EW, Gotthardt M, Mudge JM, Prensner JR, Willnow TE, Mertins P, van Heesch S, Hubner N. Evolutionary origins and interactomes of human, young microproteins and small peptides translated from short open reading frames. Mol Cell 2023; 83:994-1011.e18. [PMID: 36806354 PMCID: PMC10032668 DOI: 10.1016/j.molcel.2023.01.023] [Citation(s) in RCA: 35] [Impact Index Per Article: 35.0] [Received: 09/08/2022] [Revised: 12/12/2022] [Accepted: 01/25/2023] [Indexed: 02/19/2023]
Abstract
All species continuously evolve short open reading frames (sORFs) that can be templated for protein synthesis and may provide raw materials for evolutionary adaptation. We analyzed the evolutionary origins of 7,264 recently cataloged human sORFs and found that most were evolutionarily young and had emerged de novo. We additionally identified 221 previously missed sORFs potentially translated into peptides of up to 15 amino acids, all of which are smaller than the smallest human microprotein annotated to date. To investigate the bioactivity of sORF-encoded small peptides and young microproteins, we subjected 266 candidates to a mass-spectrometry-based interactome screen with motif resolution. Based on these interactomes and additional cellular assays, we can associate several candidates with mRNA splicing, translational regulation, and endocytosis. Our work provides insights into the evolutionary origins and interaction potential of young and small proteins, thereby helping to elucidate this underexplored territory of the human proteome.
Affiliation(s)
- Clara-L Sandmann
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, 13347 Berlin, Germany
- Jana F Schulz
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, 13347 Berlin, Germany
- Jorge Ruiz-Orera
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Marieluise Kirchner
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Core Facility Proteomics, 10117 Berlin, Germany
- Matthias Ziehm
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Core Facility Proteomics, 10117 Berlin, Germany
- Eleonora Adami
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Maike Marczenke
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Annabel Christ
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Nina Liebe
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Johannes Greiner
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Aaron Schoenenberger
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Michael B Muecke
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, 13347 Berlin, Germany; Charité-Universitätsmedizin, 10117 Berlin, Germany
- Ning Liang
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany
- Zhi Sun
  - Institute for Systems Biology, Seattle, WA 98109, USA
- Michael Gotthardt
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, 13347 Berlin, Germany; Charité-Universitätsmedizin, 10117 Berlin, Germany
- Jonathan M Mudge
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- John R Prensner
  - Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA; Department of Pediatric Oncology, Dana-Farber Cancer Institute, Boston, MA 02215, USA; Division of Pediatric Hematology/Oncology, Boston Children's Hospital, Boston, MA 02115, USA
- Thomas E Willnow
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; Department of Biomedicine, Aarhus University, 8000 Aarhus, Denmark
- Philipp Mertins
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; Berlin Institute of Health at Charité - Universitätsmedizin Berlin, Core Facility Proteomics, 10117 Berlin, Germany
- Norbert Hubner
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), 13125 Berlin, Germany; DZHK (German Centre for Cardiovascular Research), Partner Site Berlin, 13347 Berlin, Germany; Charité-Universitätsmedizin, 10117 Berlin, Germany
|
68
|
Evolution and implications of de novo genes in humans. Nat Ecol Evol 2023:10.1038/s41559-023-02014-y. [PMID: 36928843 DOI: 10.1038/s41559-023-02014-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 18.0] [Received: 07/24/2022] [Accepted: 02/06/2023] [Indexed: 03/18/2023]
Abstract
Genes and translated open reading frames (ORFs) that emerged de novo from previously non-coding sequences provide species with opportunities for adaptation. When aberrantly activated, some human-specific de novo genes and ORFs have disease-promoting properties, for instance driving tumour growth. Thousands of putative de novo coding sequences have been described in humans, but we still do not know what fraction of those ORFs has readily acquired a function. Here, we discuss the challenges and controversies surrounding the detection, mechanisms of origin, annotation, validation and characterization of de novo genes and ORFs. Through manual curation of literature and databases, we provide a thorough table with most de novo genes reported for humans to date. We re-evaluate each locus by tracing the enabling mutations and list proposed disease associations, protein characteristics and supporting evidence for translation and protein detection. This work will support future explorations of de novo genes and ORFs in humans.
|
69
|
Luria V, Ma S, Shibata M, Pattabiraman K, Sestan N. Molecular and cellular mechanisms of human cortical connectivity. Curr Opin Neurobiol 2023; 80:102699. [PMID: 36921362 DOI: 10.1016/j.conb.2023.102699] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Accepted: 02/05/2023] [Indexed: 03/18/2023]
Abstract
Comparative studies of the cerebral cortex have identified various human and primate-specific changes in both local and long-range connectivity, which are thought to underlie our advanced cognitive capabilities. These changes are likely mediated by the divergence of spatiotemporal regulation of gene expression, which is particularly prominent in the prenatal and early postnatal human and non-human primate cerebral cortex. In this review, we describe recent advances in characterizing human and primate genetic and cellular innovations including identification of novel species-specific, especially human-specific, genes, gene expression patterns, and cell types. Finally, we highlight three recent studies linking these molecular changes to reorganization of cortical connectivity.
Affiliation(s)
- Victor Luria
  - Department of Neuroscience, Yale School of Medicine, New Haven, CT, 06510, USA
- Shaojie Ma
  - Department of Neuroscience, Yale School of Medicine, New Haven, CT, 06510, USA
- Mikihito Shibata
  - Department of Neuroscience, Yale School of Medicine, New Haven, CT, 06510, USA
- Kartik Pattabiraman
  - Yale Child Study Center, Yale School of Medicine, New Haven, CT, 06510, USA
- Nenad Sestan
  - Department of Neuroscience, Yale School of Medicine, New Haven, CT, 06510, USA; Yale Child Study Center, Yale School of Medicine, New Haven, CT, 06510, USA; Departments of Psychiatry, Genetics and Comparative Medicine, Program in Cellular Neuroscience, Neurodegeneration and Repair, and Kavli Institute for Neuroscience, Yale School of Medicine, New Haven, CT, 06510, USA
|
70
|
Qi J, Mo F, An NA, Mi T, Wang J, Qi J, Li X, Zhang B, Xia L, Lu Y, Sun G, Wang X, Li C, Hu B. A Human-Specific De Novo Gene Promotes Cortical Expansion and Folding. Advanced Science (Weinheim, Baden-Wurttemberg, Germany) 2023; 10:e2204140. [PMID: 36638273 PMCID: PMC9982566 DOI: 10.1002/advs.202204140] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 07/19/2022] [Revised: 12/20/2022] [Indexed: 06/17/2023]
Abstract
Newly originated de novo genes have been linked to the formation and function of the human brain. However, how a specific gene originates from ancestral noncoding DNAs and becomes involved in the preexisting network for functional outcomes remains elusive. Here, a human-specific de novo gene, SP0535, is identified that is preferentially expressed in the ventricular zone of the human fetal brain and plays an important role in cortical development and function. In human embryonic stem cell-derived cortical organoids, knockout of SP0535 compromises their growth and neurogenesis. In SP0535 transgenic (TG) mice, expression of SP0535 induces fetal cortex expansion and sulci and gyri-like structure formation. The progenitors and neurons in the SP0535 TG mouse cortex tend to proliferate and differentiate in ways that are unique to humans. SP0535 TG adult mice also exhibit improved cognitive ability and working memory. Mechanistically, SP0535 interacts with the membrane protein Na+/K+ ATPase subunit alpha-1 (ATP1A1) and releases Src from the ATP1A1-Src complex, allowing an increased level of Src phosphorylation that promotes cell proliferation. Thus, SP0535 is the first proven human-specific de novo gene that promotes cortical expansion and folding, and can function through incorporating into an existing conserved molecular network.
Affiliation(s)
- Jianhuan Qi
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
  - Savaid Medical School, University of Chinese Academy of Sciences, Beijing 100049, China
- Fan Mo
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
  - Savaid Medical School, University of Chinese Academy of Sciences, Beijing 100049, China
- Ni A. An
  - Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing 100871, China
- Tingwei Mi
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Jiaxin Wang
  - Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing 100871, China
- Jun-Tian Qi
  - Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing 100871, China
- Xiangshang Li
  - Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing 100871, China
- Boya Zhang
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- Longkuo Xia
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
  - Savaid Medical School, University of Chinese Academy of Sciences, Beijing 100049, China
- Yingfei Lu
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
  - Savaid Medical School, University of Chinese Academy of Sciences, Beijing 100049, China
- Gaoying Sun
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
  - Savaid Medical School, University of Chinese Academy of Sciences, Beijing 100049, China
- Xinyue Wang
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
  - Savaid Medical School, University of Chinese Academy of Sciences, Beijing 100049, China
- Chuan-Yun Li
  - Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing 100871, China
- Baoyang Hu
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
  - Savaid Medical School, University of Chinese Academy of Sciences, Beijing 100049, China
  - Institute for Stem Cell and Regeneration, Chinese Academy of Sciences, Beijing 100101, China
  - Beijing Institute for Stem Cell and Regenerative Medicine, Beijing 100101, China
|
71
|
Poretti M, Praz CR, Sotiropoulos AG, Wicker T. A survey of lineage-specific genes in Triticeae reveals de novo gene evolution from genomic raw material. Plant Direct 2023; 7:e484. [PMID: 36937792 PMCID: PMC10020141 DOI: 10.1002/pld3.484] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 06/27/2022] [Revised: 01/26/2023] [Accepted: 01/27/2023] [Indexed: 06/18/2023]
Abstract
Diploid plant genomes typically contain ~35,000 genes, almost all belonging to highly conserved gene families. Only a small fraction are lineage-specific, that is, found in only one or a few closely related species. Little is known about how genes arise de novo in plant genomes and how often this occurs; however, they are believed to be important for plant diversification and adaptation. We developed a pipeline to identify lineage-specific genes in Triticeae, using newly available genome assemblies of wheat, barley, and rye. Applying a set of stringent criteria, we identified 5942 candidate Triticeae-specific genes (TSGs), of which 2337 were validated as protein-coding genes in wheat. Differential gene expression analyses revealed that stress-induced wheat TSGs are strongly enriched in putative secreted proteins. Some were previously described to be involved in Triticeae non-host resistance and cold response. Additionally, we show that 1079 TSGs have sequence homology to transposable elements (TEs), ~68% of them deriving from regulatory non-coding regions of Gypsy retrotransposons. Most importantly, we demonstrate that these TSGs are enriched in transmembrane domains and are among the most highly expressed wheat genes overall. To summarize, we conclude that de novo gene formation is relatively rare and that Triticeae probably possess ~779 lineage-specific genes per haploid genome. TSGs, which respond to pathogen and environmental stresses, may be interesting candidates for future targeted resistance breeding in Triticeae. Finally, we propose that non-coding regions of TEs might provide important genetic raw material for the functional innovation of TM domains and the evolution of novel secreted proteins.
Affiliation(s)
- Manuel Poretti
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
  - Department of Biology, University of Fribourg, Fribourg, Switzerland
- Coraline R. Praz
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
  - Centro de Biotecnología y Genómica de Plantas, Universidad Politécnica de Madrid (UPM)–Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA), Madrid, Spain
- Thomas Wicker
  - Department of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland
|
72
|
Late Embryogenesis Abundant Proteins Contribute to the Resistance of Toxoplasma gondii Oocysts against Environmental Stresses. mBio 2023; 14:e0286822. [PMID: 36809045 PMCID: PMC10128015 DOI: 10.1128/mbio.02868-22] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 02/23/2023] Open
Abstract
Toxoplasma gondii oocysts, which are shed in large quantities in the feces from infected felines, are very stable in the environment, resistant to most inactivation procedures, and highly infectious. The oocyst wall provides an important physical barrier for sporozoites contained inside oocysts, protecting them from many chemical and physical stressors, including most inactivation procedures. Furthermore, sporozoites can withstand large temperature changes, even freeze-thawing, as well as desiccation, high salinity, and other environmental insults; however, the genetic basis for this environmental resistance is unknown. Here, we show that a cluster of four genes encoding Late Embryogenesis Abundant (LEA)-related proteins are required to provide Toxoplasma sporozoites resistance to environmental stresses. Toxoplasma LEA-like genes (TgLEAs) exhibit the characteristic features of intrinsically disordered proteins, explaining some of their properties. Our in vitro biochemical experiments using recombinant TgLEA proteins show that they have cryoprotective effects on the oocyst-resident lactate dehydrogenase enzyme and that induced expression in E. coli of two of them leads to better survival after cold stress. Oocysts from a strain in which the four LEA genes were knocked out en bloc were significantly more susceptible to high salinity, freezing, and desiccation compared to wild-type oocysts. We discuss the evolutionary acquisition of LEA-like genes in Toxoplasma and other oocyst-producing apicomplexan parasites of the Sarcocystidae family and discuss how this has likely contributed to the ability of sporozoites within oocysts to survive outside the host for extended periods. Collectively, our data provide a first detailed molecular view of a mechanism that contributes to the remarkable resilience of oocysts against environmental stresses.
IMPORTANCE: Toxoplasma gondii oocysts are highly infectious and may survive in the environment for years. Their resistance against disinfectants and irradiation has been attributed to the oocyst and sporocyst walls, which act as physical and permeability barriers. However, the genetic basis for their resistance against stressors like changes in temperature, salinity, or humidity is unknown. We show that a cluster of four genes encoding Toxoplasma Late Embryogenesis Abundant (TgLEA)-related proteins are important for this resistance to environmental stresses. TgLEAs have features of intrinsically disordered proteins, explaining some of their properties. Recombinant TgLEA proteins show cryoprotective effects on the parasite's lactate dehydrogenase, an abundant enzyme in oocysts, and expression in E. coli of two TgLEAs has a beneficial effect on growth after cold stress. Moreover, oocysts from a strain lacking all four TgLEA genes were more susceptible to high salinity, freezing, and desiccation compared to wild-type oocysts, highlighting the importance of the four TgLEAs for oocyst resilience.
|
73
|
Yu J, Jiang W, Zhu SB, Liao Z, Dou X, Liu J, Guo FB, Dong C. Prediction of protein-coding small ORFs in multi-species using integrated sequence-derived features and the random forest model. Methods 2023; 210:10-19. [PMID: 36621557 DOI: 10.1016/j.ymeth.2022.12.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2022] [Revised: 12/27/2022] [Accepted: 12/30/2022] [Indexed: 01/07/2023] Open
Abstract
Proteins encoded by small open reading frames (sORFs) can serve as functional elements playing important roles in vivo. Such sORFs also constitute a potential pool for de novo gene birth, driving evolutionary innovation and species diversity. Therefore, their theoretical and experimental identification has become a critical issue. Herein, we propose a prediction method for protein-coding sORFs based solely on integrative sequence-derived features. Its prediction performance is better than or comparable to that of nine other prevalent methods, which shows that our method can provide a relatively reliable research tool for the prediction of protein-coding sORFs. Our method allows users to estimate the potential expression of a queried sORF, as demonstrated by a correlation analysis between our probability estimate and the codon adaptation index (CAI). Based on the features that we used, we demonstrate that the sequence features of protein-coding sORFs differ significantly between the two domains, implying that cross-domain prediction might be a relatively hard task; hence, domain-specific models were developed, which allow users to predict protein-coding sORFs in both eukaryotes and prokaryotes. Finally, a web server was developed to facilitate studies in the related field; it is freely available at http://guolab.whu.edu.cn/codingCapacity/index.html.
Affiliation(s)
- Jiafeng Yu
  - Shandong Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Wenwen Jiang
  - Department of Bioinformatics, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
- Sen-Bin Zhu
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Zhen Liao
  - School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu 611731, China
- Xianghua Dou
  - Shandong Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Jian Liu
  - Shandong Key Laboratory of Biophysics, Institute of Biophysics, Dezhou University, Dezhou 253023, China
- Feng-Biao Guo
  - School of Pharmaceutical Sciences, Wuhan University, Wuhan 430071, China.
- Chuan Dong
  - School of Pharmaceutical Sciences, Wuhan University, Wuhan 430071, China.
74
Affiliation(s)
- April Rich
  - Department of Computational and Systems Biology, Pittsburgh Center for Evolutionary Biology and Medicine, University of Pittsburgh Medical School, Pittsburgh, PA, USA
- Anne-Ruxandra Carvunis
  - Department of Computational and Systems Biology, Pittsburgh Center for Evolutionary Biology and Medicine, University of Pittsburgh Medical School, Pittsburgh, PA, USA.
75
Vakirlis N, Vance Z, Duggan KM, McLysaght A. De novo birth of functional microproteins in the human lineage. Cell Rep 2022; 41:111808. [PMID: 36543139 PMCID: PMC10073203 DOI: 10.1016/j.celrep.2022.111808] [Citation(s) in RCA: 27] [Impact Index Per Article: 13.5] [Received: 10/18/2021] [Revised: 06/21/2022] [Accepted: 11/18/2022] [Indexed: 12/24/2022]
Abstract
Small open reading frames (sORFs) can encode functional "microproteins" that perform crucial biological tasks. However, their size makes them less amenable to genomic analysis, and their origins and conservation are poorly understood. Given their short length, it is plausible that some of these functional microproteins have recently originated entirely de novo from noncoding sequences. Here we sought to identify such cases in the human lineage by reconstructing the evolutionary origins of human microproteins previously found to have measurable, statistically significant fitness effects. By tracing the formation of each ORF and its transcriptional activation, we show that novel microproteins with significant phenotypic effects have emerged de novo throughout animal evolution, including two after the human-chimpanzee split. Notably, traditional methods for assessing coding potential would miss most of these cases. This evidence demonstrates that the functional potential intrinsic to sORFs can be relatively rapidly and frequently realized through de novo gene emergence.
Affiliation(s)
- Nikolaos Vakirlis
  - Institute for Fundamental Biomedical Research, Biomedical Sciences Research Center "Alexander Fleming", Vari, Greece.
- Zoe Vance
  - Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin, Ireland
- Kate M Duggan
  - Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin, Ireland
- Aoife McLysaght
  - Smurfit Institute of Genetics, Trinity College Dublin, University of Dublin, Dublin, Ireland.
76
The Theory of Carcino-Evo-Devo and Its Non-Trivial Predictions. Genes (Basel) 2022; 13:genes13122347. [PMID: 36553613 PMCID: PMC9777766 DOI: 10.3390/genes13122347] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2022] [Revised: 12/04/2022] [Accepted: 12/08/2022] [Indexed: 12/15/2022]
Abstract
To explain the sources of additional cell masses in the evolution of multicellular organisms, the theory of carcino-evo-devo, or evolution by tumor neofunctionalization, has been developed. The important demand for a new theory in experimental science is the capability to formulate non-trivial predictions which can be experimentally confirmed. Several non-trivial predictions were formulated using carcino-evo-devo theory, four of which are discussed in the present paper: (1) The number of cellular oncogenes should correspond to the number of cell types in the organism. The evolution of oncogenes, tumor suppressor and differentiation gene classes should proceed concurrently. (2) Evolutionarily new and evolving genes should be specifically expressed in tumors (TSEEN genes). (3) Human orthologs of fish TSEEN genes should acquire progressive functions connected with new cell types, tissues and organs. (4) Selection of tumors for new functions in the organism is possible. Evolutionarily novel organs should recapitulate tumor features in their development. As shown in this paper, these predictions have been confirmed by the laboratory of the author. Thus, we have shown that carcino-evo-devo theory has predictive power, fulfilling a fundamental requirement for a new theory.
77
Petrzilek J, Pasulka J, Malik R, Horvat F, Kataruka S, Fulka H, Svoboda P. De novo emergence, existence, and demise of a protein-coding gene in murids. BMC Biol 2022; 20:272. [PMID: 36482406 PMCID: PMC9733328 DOI: 10.1186/s12915-022-01470-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/30/2022] [Accepted: 11/15/2022] [Indexed: 12/13/2022]
Abstract
BACKGROUND: Genes, the principal units of genetic information, vary in complexity and evolutionary history. Less complex genes (e.g., genes expressing long non-coding RNA (lncRNA)) readily emerge de novo from non-genic sequences and have high evolutionary turnover. Genesis of a gene may be facilitated by adoption of functional genic sequences from retrotransposon insertions. However, protein-coding sequences in extant genomes rarely lack any connection to an ancestral protein-coding sequence. RESULTS: We describe the remarkable evolution of the murine gene D6Ertd527e and its orthologs in the rodent superfamily Muroidea. D6Ertd527e emerged in a common ancestor of mice and hamsters, most likely as a lncRNA-expressing gene. A major contributing factor was a long terminal repeat (LTR) retrotransposon insertion carrying an oocyte-specific promoter and a 5' terminal exon of the gene. The gene survived as an oocyte-specific lncRNA in several extant rodents, while in others the gene or its expression was lost. In the ancestral lineage of Mus musculus, the gene acquired protein-coding capacity, with the bulk of the coding sequence formed through CAG (AGC) trinucleotide repeat expansion and duplications. These events generated a cytoplasmic serine-rich maternal protein. Knock-out of D6Ertd527e in mice has a small but detectable effect on fertility and the maternal transcriptome. CONCLUSIONS: While this evolving gene does not show a clear function in laboratory mice, its documented evolutionary history in Muroidea during the last ~40 million years provides a textbook example of how several common mutation events can support de novo gene formation, the evolution of protein-coding capacity, and a gene's demise.
Affiliation(s)
- Jan Petrzilek
  - Institute of Molecular Genetics of the Czech Academy of Sciences, Videnska 1083, 142 20, Prague 4, Czech Republic
  - Present address: Vienna BioCenter PhD Program, Doctoral School of the University of Vienna and Medical University of Vienna, Vienna, Austria
- Josef Pasulka
  - Institute of Molecular Genetics of the Czech Academy of Sciences, Videnska 1083, 142 20, Prague 4, Czech Republic
- Radek Malik
  - Institute of Molecular Genetics of the Czech Academy of Sciences, Videnska 1083, 142 20, Prague 4, Czech Republic
- Filip Horvat
  - Institute of Molecular Genetics of the Czech Academy of Sciences, Videnska 1083, 142 20, Prague 4, Czech Republic
  - Bioinformatics Group, Division of Biology, Faculty of Science, University of Zagreb, Horvatovac 102a, 10000, Zagreb, Croatia
- Shubhangini Kataruka
  - Institute of Molecular Genetics of the Czech Academy of Sciences, Videnska 1083, 142 20, Prague 4, Czech Republic
  - Present address: Department of Genetics, Yale School of Medicine, New Haven, CT, 06510, USA
- Helena Fulka
  - Institute of Molecular Genetics of the Czech Academy of Sciences, Videnska 1083, 142 20, Prague 4, Czech Republic
  - Current address: Institute of Experimental Medicine of the Czech Academy of Sciences, Videnska 1083, 142 20, Prague 4, Czech Republic
- Petr Svoboda
  - Institute of Molecular Genetics of the Czech Academy of Sciences, Videnska 1083, 142 20, Prague 4, Czech Republic.
78
Intrinsically Disordered Proteins: An Overview. Int J Mol Sci 2022; 23:ijms232214050. [PMID: 36430530 PMCID: PMC9693201 DOI: 10.3390/ijms232214050] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Received: 09/04/2022] [Revised: 11/07/2022] [Accepted: 11/08/2022] [Indexed: 11/16/2022]
Abstract
Many proteins and protein segments cannot attain a single stable three-dimensional structure under physiological conditions; instead, they adopt multiple interconverting conformational states. Such intrinsically disordered proteins or protein segments are highly abundant across proteomes, and are involved in various effector functions. This review focuses on different aspects of disordered proteins and disordered protein regions, which form the basis of the so-called "Disorder-function paradigm" of proteins. Additionally, various experimental approaches and computational tools used for characterizing disordered regions in proteins are discussed. Finally, the role of disordered proteins in diseases and their utility as potential drug targets are explored.
79
Laloum D, Robinson-Rechavi M. Rhythmicity is linked to expression cost at the protein level but to expression precision at the mRNA level. PLoS Comput Biol 2022; 18:e1010399. [PMID: 36095022 PMCID: PMC9518874 DOI: 10.1371/journal.pcbi.1010399] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2021] [Revised: 09/28/2022] [Accepted: 07/17/2022] [Indexed: 11/18/2022]
Abstract
Many genes have nycthemeral rhythms of expression, i.e. a 24-hour periodic variation, at the mRNA level, the protein level, or both, and most rhythmic genes are tissue-specific. Here, we investigate and discuss the evolutionary origins of rhythms in gene expression. Our results suggest that rhythmicity of protein expression could have been favored by selection to minimize costs. Trends are consistent in bacteria, plants and animals, and are also supported by tissue-specific patterns in mouse. Unlike at the protein level, cost cannot explain rhythms at the RNA level. We suggest that rhythmicity instead allows expression noise to be reduced periodically. Noise control had the strongest support in mouse, with limited evidence in other species. We have also found that genes under stronger purifying selection are rhythmically expressed at the mRNA level, and we propose that this is because they are noise-sensitive genes. Finally, the adaptive role of rhythmic expression is supported by rhythmic genes being highly expressed yet tissue-specific. This provides a good evolutionary explanation for the observation that nycthemeral rhythms are often tissue-specific.
For many genes, expression, i.e. the production of RNA and proteins, is rhythmic with a 24-hour period. Here, we study and discuss the evolutionary origins of these rhythms. Our analyses of data from different species suggest that the rhythmicity of protein levels may have been favored by selection for cost minimization. Furthermore, we show that cost cannot explain the rhythmic variations in RNA levels. Instead, we suggest that rhythmicity periodically reduces the stochasticity of gene expression. We also found that genes under stronger purifying selection are rhythmically expressed at the mRNA level, and propose that this is because they are noise-sensitive genes. Finally, rhythmic expression involves genes that are often highly expressed and tissue-specific. This provides a good evolutionary explanation for the tissue-specificity of these rhythms.
Affiliation(s)
- David Laloum
  - Department of Ecology and Evolution, Batiment Biophore, Quartier UNIL-Sorge, Université de Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Batiment Génopode, Quartier UNIL-Sorge, Université de Lausanne, Lausanne, Switzerland
- Marc Robinson-Rechavi
  - Department of Ecology and Evolution, Batiment Biophore, Quartier UNIL-Sorge, Université de Lausanne, Lausanne, Switzerland
  - Swiss Institute of Bioinformatics, Batiment Génopode, Quartier UNIL-Sorge, Université de Lausanne, Lausanne, Switzerland
80
Parikh SB, Houghton C, Van Oss SB, Wacholder A, Carvunis A. Origins, evolution, and physiological implications of de novo genes in yeast. Yeast 2022; 39:471-481. [PMID: 35959631 PMCID: PMC9544372 DOI: 10.1002/yea.3810] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/20/2022] [Revised: 08/08/2022] [Accepted: 08/09/2022] [Indexed: 12/03/2022]
Abstract
De novo gene birth is the process by which new genes emerge in sequences that were previously noncoding. Over the past decade, researchers have taken advantage of the power of yeast as a model and a tool to study the evolutionary mechanisms and physiological implications of de novo gene birth. We summarize the mechanisms that have been proposed to explicate how noncoding sequences can become protein-coding genes, highlighting the discovery of pervasive translation of the yeast transcriptome and its presumed impact on evolutionary innovation. We summarize current best practices for the identification and characterization of de novo genes. Crucially, we explain that the field is still in its nascency, with the physiological roles of most young yeast de novo genes identified thus far still utterly unknown. We hope this review inspires researchers to investigate the true contribution of de novo gene birth to cellular physiology and phenotypic diversity across yeast strains and species.
Affiliation(s)
- Saurin B. Parikh
  - Department of Computational and Systems Biology, School of Medicine, Pittsburgh Center for Evolutionary Biology and Evolution, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Carly Houghton
  - Department of Computational and Systems Biology, School of Medicine, Pittsburgh Center for Evolutionary Biology and Evolution, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- S. Branden Van Oss
  - Department of Computational and Systems Biology, School of Medicine, Pittsburgh Center for Evolutionary Biology and Evolution, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Aaron Wacholder
  - Department of Computational and Systems Biology, School of Medicine, Pittsburgh Center for Evolutionary Biology and Evolution, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
- Anne‐Ruxandra Carvunis
  - Department of Computational and Systems Biology, School of Medicine, Pittsburgh Center for Evolutionary Biology and Evolution, University of Pittsburgh, Pittsburgh, Pennsylvania, USA
81
Pajic P, Shen S, Qu J, May AJ, Knox S, Ruhl S, Gokcumen O. A mechanism of gene evolution generating mucin function. Science Advances 2022; 8:eabm8757. [PMID: 36026444 PMCID: PMC9417175 DOI: 10.1126/sciadv.abm8757] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 10/18/2021] [Accepted: 07/12/2022] [Indexed: 05/12/2023]
Abstract
How novel gene functions evolve is a fundamental question in biology. Mucin proteins, a functionally but not evolutionarily defined group of proteins, allow the study of convergent evolution of gene function. By analyzing the genomic variation of mucins across a wide range of mammalian genomes, we propose that exonic repeats and their copy number variation contribute substantially to the de novo evolution of new gene functions. By integrating bioinformatic, phylogenetic, proteomic, and immunohistochemical approaches, we identified 15 undescribed instances of evolutionary convergence, where novel mucins originated by gaining densely O-glycosylated exonic repeat domains. Our results suggest that secreted proteins rich in proline are natural precursors for acquiring mucin function. Our findings have broad implications for understanding the role of exonic repeats in the parallel evolution of new gene functions, especially those involving protein glycosylation.
Affiliation(s)
- Petar Pajic
  - Department of Biological Sciences, University at Buffalo, The State University of New York, Buffalo, NY 14260, USA
  - Department of Oral Biology, School of Dental Medicine, University at Buffalo, The State University of New York, Buffalo, NY 14214, USA
- Shichen Shen
  - Department of Pharmaceutical Sciences, University at Buffalo, The State University of New York, Buffalo, NY 14214, USA
  - Center of Excellence in Bioinformatics and Life Science, Buffalo, NY 14203, USA
- Jun Qu
  - Department of Pharmaceutical Sciences, University at Buffalo, The State University of New York, Buffalo, NY 14214, USA
  - Center of Excellence in Bioinformatics and Life Science, Buffalo, NY 14203, USA
- Alison J. May
  - Program in Craniofacial Biology, Department of Cell and Tissue Biology, School of Dentistry, University of California, San Francisco, San Francisco, CA 94143, USA
- Sarah Knox
  - Program in Craniofacial Biology, Department of Cell and Tissue Biology, School of Dentistry, University of California, San Francisco, San Francisco, CA 94143, USA
- Stefan Ruhl
  - Department of Oral Biology, School of Dental Medicine, University at Buffalo, The State University of New York, Buffalo, NY 14214, USA
- Omer Gokcumen
  - Department of Biological Sciences, University at Buffalo, The State University of New York, Buffalo, NY 14260, USA
82
Sangster AG, Zarin T, Moses AM. Evolution of short linear motifs and disordered proteins Topic: yeast as model system to study evolution. Curr Opin Genet Dev 2022; 76:101964. [PMID: 35939968 DOI: 10.1016/j.gde.2022.101964] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/04/2022] [Revised: 06/29/2022] [Accepted: 07/08/2022] [Indexed: 11/26/2022]
Abstract
Evolutionary preservation of protein structure had a major influence on the field of molecular evolution: changes in individual amino acids that did not disrupt protein folding would either have no effect or subtly change the 'lock' so that it could fit a new 'key'. Homology of individual amino acids could be confidently assigned through sequence alignments, and models of evolution could be tested. This view of molecular evolution excluded large regions of proteins that could not be confidently aligned, such as intrinsically disordered regions (IDRs) that do not fold into stable structures. In the last decade, major progress has been made in understanding the evolution of IDRs, much of it facilitated by new experimental and computational approaches in yeast. Here, we review this progress as well as several still outstanding questions.
Affiliation(s)
- Ami G Sangster
  - Cell & Systems Biology, University of Toronto, 25 Harbord St., Toronto, ON M5S 3G5, Canada
- Taraneh Zarin
  - Cell & Systems Biology, University of Toronto, 25 Harbord St., Toronto, ON M5S 3G5, Canada. https://twitter.com/@taraneh_z
- Alan M Moses
  - Cell & Systems Biology, University of Toronto, 25 Harbord St., Toronto, ON M5S 3G5, Canada.
83
Abstract
"De novo" genes evolve from previously non-genic DNA. This strikes many of us as remarkable, because it seems extraordinarily unlikely that random sequence would produce a functional gene. How is this possible? In this two-part review, I first summarize what is known about the origins and molecular functions of the small number of de novo genes for which such information is available. I then speculate on what these examples may tell us about how de novo genes manage to emerge despite what seem like enormous opposing odds.
Affiliation(s)
- Caroline M Weisman
  - Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, NJ, USA.
84
Song H, Guo Z, Zhang X, Sui J. De novo genes in Arachis hypogaea cv. Tifrunner: systematic identification, molecular evolution, and potential contributions to cultivated peanut. The Plant Journal: for Cell and Molecular Biology 2022; 111:1081-1095. [PMID: 35748398 DOI: 10.1111/tpj.15875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2021] [Revised: 06/15/2022] [Accepted: 06/21/2022] [Indexed: 06/15/2023]
Abstract
De novo genes are derived from non-coding sequences, and they can play essential roles in organisms. Cultivated peanut (Arachis hypogaea) is a major oil and protein crop derived from a cross between Arachis duranensis and Arachis ipaensis. However, few de novo genes have been documented in Arachis. Here, we identified 381 de novo genes in A. hypogaea cv. Tifrunner based on comparison with five closely related Arachis species. There are distinct differences in gene expression patterns and gene structures between conserved and de novo genes. The identified de novo genes originated from ancestral sequence regions associated with metabolic and biosynthetic processes, and they were subsequently integrated into existing regulatory networks. De novo paralogs and homoeologs were identified in A. hypogaea cv. Tifrunner. De novo paralogs and homoeologs with conserved expression have mismatching cis-acting elements under normal growth conditions. De novo genes potentially have pluripotent functions in responses to biotic stresses as well as in growth and development based on quantitative trait locus data. This work provides a foundation for future research examining gene birth processes and gene function in Arachis and related taxa.
Affiliation(s)
- Hui Song
  - Grassland Agri-husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
- Zhonglong Guo
  - State Key Laboratory of Protein and Plant Gene Research, Peking-Tsinghua Center for Life Sciences, School of Life Sciences and School of Advanced Agricultural Sciences, Peking University, Beijing, China
- Xiaojun Zhang
  - College of Agronomy, Qingdao Agricultural University, Qingdao, China
- Jiongming Sui
  - College of Agronomy, Qingdao Agricultural University, Qingdao, China
85
Eicholt LA, Aubel M, Berk K, Bornberg‐Bauer E, Lange A. Heterologous expression of naturally evolved putative de novo proteins with chaperones. Protein Sci 2022; 31:e4371. [PMID: 35900020 PMCID: PMC9278007 DOI: 10.1002/pro.4371] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 02/27/2022] [Revised: 05/03/2022] [Accepted: 05/14/2022] [Indexed: 11/23/2022]
Abstract
Over the past decade, evidence has accumulated that new protein-coding genes can emerge de novo from previously non-coding DNA. Most studies have focused on large scale computational predictions of de novo protein-coding genes across a wide range of organisms. In contrast, experimental data concerning the folding and function of de novo proteins are scarce. This might be due to difficulties in handling de novo proteins in vitro, as most are short and predicted to be disordered. Here, we propose a guideline for the effective expression of eukaryotic de novo proteins in Escherichia coli. We used 11 sequences from Drosophila melanogaster and 10 from Homo sapiens, that are predicted de novo proteins from former studies, for heterologous expression. The candidate de novo proteins have varying secondary structure and disorder content. Using multiple combinations of purification tags, E. coli expression strains, and chaperone systems, we were able to increase the number of solubly expressed putative de novo proteins from 30% to 62%. Our findings indicate that the best combination for expressing putative de novo proteins in E. coli is a GST-tag with T7 Express cells and co-expressed chaperones. We found that, overall, proteins with higher predicted disorder were easier to express. STATEMENT: Today, we know that proteins do not only evolve by duplication and divergence of existing proteins but also arise from previously non-coding DNA. These proteins are called de novo proteins. Their properties are still poorly understood and their experimental analysis faces major obstacles. Here, we aim to present a starting point for soluble expression of de novo proteins with the help of chaperones and thereby enable further characterization.
Affiliation(s)
- Lars A. Eicholt
  - Institute for Evolution and Biodiversity, University of Muenster, Münster, Germany
- Margaux Aubel
  - Institute for Evolution and Biodiversity, University of Muenster, Münster, Germany
- Katrin Berk
  - Institute for Evolution and Biodiversity, University of Muenster, Münster, Germany
- Erich Bornberg‐Bauer
  - Institute for Evolution and Biodiversity, University of Muenster, Münster, Germany
  - Max Planck‐Institute for Biology Tuebingen, Tübingen, Germany
- Andreas Lange
  - Institute for Evolution and Biodiversity, University of Muenster, Münster, Germany
86
Database of Potential Promoter Sequences in the Capsicum annuum Genome. Biology 2022; 11:biology11081117. [PMID: 35892972 PMCID: PMC9332048 DOI: 10.3390/biology11081117] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Revised: 07/19/2022] [Accepted: 07/23/2022] [Indexed: 11/16/2022]
Abstract
In this study, we used a mathematical method for the multiple alignment of highly divergent sequences (MAHDS) to create a database of potential promoter sequences (PPSs) in the Capsicum annuum genome. To search for PPSs, 20 statistically significant classes of sequences located in the range from −499 to +100 nucleotides near the annotated genes were calculated. For each class, a position–weight matrix (PWM) was computed and then used to identify PPSs in the C. annuum genome. In total, 825,136 PPSs were detected, with a false positive rate of 0.13%. The PPSs obtained with the MAHDS method were tested using TSSFinder, which detects transcription start sites. The databank of the found PPSs provides their coordinates in chromosomes, the alignment of each PPS with the PWM, and the level of statistical significance as a normal distribution argument, and can be used in genetic engineering and biotechnology.
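As a rough illustration of how a position-weight matrix can be used to scan a sequence for candidate sites, the Python sketch below builds a log-odds PWM from a few aligned toy sequences and reports the best-scoring window in a query. It is not the MAHDS method described in the abstract above; the function names (`build_pwm`, `score`, `best_hit`), the pseudocount, the uniform background frequency of 0.25, and the toy sequences are all assumptions made for this example.

```python
import math

BASES = "ACGT"


def build_pwm(sequences, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from equal-length sequences over A, C, G, T."""
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in BASES}
        for seq in sequences:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        # Log-odds of the observed frequency against a uniform background.
        pwm.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return pwm


def score(pwm, window):
    """Sum the per-position log-odds scores for one window."""
    return sum(column[base] for column, base in zip(pwm, window))


def best_hit(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    hits = [
        (score(pwm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    ]
    return max(hits)


# Toy, hypothetical aligned sites (not taken from the C. annuum data).
toy_sites = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
toy_pwm = build_pwm(toy_sites)
print(best_hit(toy_pwm, "GGCGTATAATGCGC")[1])  # 4
```

A genome-scale search would additionally need a score threshold calibrated against a null model to control the false positive rate; the sketch omits that step and does not handle characters other than A, C, G and T.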
87
Brito-Estrada O, Hassel KR, Makarewich CA. An Integrated Approach for Microprotein Identification and Sequence Analysis. J Vis Exp 2022:10.3791/63841. [PMID: 35913170 PMCID: PMC9521633 DOI: 10.3791/63841] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 09/06/2024]
Abstract
Next-generation sequencing (NGS) has propelled the field of genomics forward and produced whole genome sequences for numerous animal species and model organisms. However, despite this wealth of sequence information, comprehensive gene annotation efforts have proven challenging, especially for small proteins. Notably, conventional protein annotation methods were designed to intentionally exclude putative proteins encoded by short open reading frames (sORFs) less than 300 nucleotides in length to filter out the exponentially higher number of spurious noncoding sORFs throughout the genome. As a result, hundreds of functional small proteins called microproteins (<100 amino acids in length) have been incorrectly classified as noncoding RNAs or overlooked entirely. Here we provide a detailed protocol to leverage free, publicly available bioinformatic tools to query genomic regions for microprotein-coding potential based on evolutionary conservation. Specifically, we provide step-by-step instructions on how to examine sequence conservation and coding potential using Phylogenetic Codon Substitution Frequencies (PhyloCSF) on the user-friendly University of California Santa Cruz (UCSC) Genome Browser. Additionally, we detail steps to efficiently generate multiple species alignments of identified microprotein sequences to visualize amino acid sequence conservation and recommend resources to analyze microprotein characteristics, including predicted domain structures. These powerful tools can be used to help identify putative microprotein-coding sequences in noncanonical genomic regions or to rule out the presence of a conserved coding sequence with translational potential in a noncoding transcript of interest.
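To make the 300-nucleotide length cutoff mentioned above concrete, the following Python sketch lists forward-strand open reading frames shorter than that threshold. It is only an illustration and not the PhyloCSF and UCSC Genome Browser protocol the abstract describes: it assumes ATG start codons and the standard stop codons, ignores the reverse strand, and says nothing about conservation or coding potential. The function name `find_sorfs` and the `min_nt` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_sorfs(dna, max_nt=300, min_nt=30):
    """Return (start, end, length) for forward-strand ORFs with
    min_nt <= length < max_nt, where length includes the stop codon."""
    dna = dna.upper()
    found = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                # Remember the first in-frame start codon.
                start = i
            elif start is not None and codon in STOP_CODONS:
                length = i + 3 - start
                if min_nt <= length < max_nt:
                    found.append((start, i + 3, length))
                # Resume the search for the next start codon in this frame.
                start = None
    return found


# Toy sequence: a 36-nt ORF (ATG + 10 codons + TAA) with short flanks.
toy = "CC" + "ATG" + "GCT" * 10 + "TAA" + "GG"
print(find_sorfs(toy))  # [(2, 38, 36)]
```

An ORF of 300 nucleotides or more, such as `"ATG" + "GCT" * 100 + "TAA"`, is skipped by this filter, which mirrors the kind of length-based exclusion the abstract attributes to conventional annotation (applied there in the opposite direction, to discard short ORFs).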
Affiliation(s)
- Omar Brito-Estrada
  - The Heart Institute, Division of Molecular Cardiovascular Biology, Cincinnati Children's Hospital Medical Center
- Keira R Hassel
  - The Heart Institute, Division of Molecular Cardiovascular Biology, Cincinnati Children's Hospital Medical Center
- Catherine A Makarewich
  - The Heart Institute, Division of Molecular Cardiovascular Biology, Cincinnati Children's Hospital Medical Center
  - Department of Pediatrics, University of Cincinnati College of Medicine
88
Jayaraman V, Toledo‐Patiño S, Noda‐García L, Laurino P. Mechanisms of protein evolution. Protein Sci 2022; 31:e4362. [PMID: 35762715 PMCID: PMC9214755 DOI: 10.1002/pro.4362] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/28/2022] [Revised: 05/11/2022] [Accepted: 05/14/2022] [Indexed: 11/06/2022]
Abstract
How do proteins evolve? How do changes in sequence mediate changes in protein structure, and in turn in function? This question has multiple angles, ranging from biochemistry and biophysics to evolutionary biology. This review provides a brief integrated view of some key mechanistic aspects of protein evolution. First, we explain how protein evolution is primarily driven by randomly acquired genetic mutations and selection for function, and how these mutations can even give rise to completely new folds. Then, we also comment on how phenotypic protein variability, including promiscuity, transcriptional and translational errors, may also accelerate this process, possibly via "plasticity-first" mechanisms. Finally, we highlight open questions in the field of protein evolution, with respect to the emergence of more sophisticated protein systems such as protein complexes, pathways, and the emergence of pre-LUCA enzymes.
Affiliation(s)
- Vijay Jayaraman
  - Department of Molecular Cell Biology, Weizmann Institute of Science, Rehovot, Israel
- Saacnicteh Toledo‐Patiño
  - Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
- Lianet Noda‐García
  - Department of Plant Pathology and Microbiology, Institute of Environmental Sciences, Robert H. Smith Faculty of Agriculture, Food and Environment, Hebrew University of Jerusalem, Rehovot, Israel
- Paola Laurino
  - Protein Engineering and Evolution Unit, Okinawa Institute of Science and Technology Graduate University, Okinawa, Japan
|
89
|
Chenevert M, Miller B, Karkoutli A, Rusnak A, Lott SE, Atallah J. The early embryonic transcriptome of a Hawaiian Drosophila picture-wing fly shows evidence of altered gene expression and novel gene evolution. J Exp Zool B Mol Dev Evol 2022; 338:277-291. [PMID: 35322942 DOI: 10.1002/jez.b.23129] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2021] [Revised: 01/14/2022] [Accepted: 02/13/2022] [Indexed: 06/14/2023]
Abstract
A massive adaptive radiation on the Hawaiian archipelago has produced approximately one-quarter of the fly species in the family Drosophilidae. The Hawaiian Drosophila clade has long been recognized as a model system for the study of both the ecology of island endemics and the evolution of developmental mechanisms, but relatively few genomic and transcriptomic datasets are available for this group. We present here a differential expression analysis of the transcriptional profiles of two highly conserved embryonic stages in the Hawaiian picture-wing fly Drosophila grimshawi. When we compared our results to previously published datasets across the family Drosophilidae, we identified cases of both gains and losses of gene representation in D. grimshawi, including an apparent delay in Hox gene activation. We also found a high expression of unannotated genes. Most transcripts of unannotated genes with open reading frames do not have identified homologs in non-Hawaiian Drosophila species, although the vast majority have sequence matches in genomes of other Hawaiian picture-wing flies. Some of these unannotated genes may have arisen from noncoding sequence in the ancestor of Hawaiian flies or during the evolution of the clade. Our results suggest that both the modified use of ancestral genes and the evolution of new ones may occur in rapid radiations.
Affiliation(s)
- Madeline Chenevert
  - Department of Biological Sciences, University of New Orleans, New Orleans, Louisiana, USA
  - Hayward Genetics Center, Tulane University School of Medicine, New Orleans, Louisiana, USA
- Bronwyn Miller
  - Department of Biological Sciences, University of New Orleans, New Orleans, Louisiana, USA
- Ahmad Karkoutli
  - Department of Biological Sciences, University of New Orleans, New Orleans, Louisiana, USA
  - LSUHSC School of Medicine, New Orleans, Louisiana, USA
- Anna Rusnak
  - Department of Biological Sciences, University of New Orleans, New Orleans, Louisiana, USA
  - Center for Biomedical Engineering, Brown University, Box A-2, Arnold Lab, Providence, Rhode Island, USA
- Susan E Lott
  - Department of Evolution & Ecology, University of California-Davis, Davis, California, USA
- Joel Atallah
  - Department of Biological Sciences, University of New Orleans, New Orleans, Louisiana, USA
|
90
|
Prabh N, Rödelsperger C. Multiple Pristionchus pacificus genomes reveal distinct evolutionary dynamics between de novo candidates and duplicated genes. Genome Res 2022; 32:1315-1327. [PMID: 35618417 PMCID: PMC9341508 DOI: 10.1101/gr.276431.121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/24/2021] [Accepted: 05/20/2022] [Indexed: 01/03/2023]
Abstract
The birth of new genes is a major molecular innovation driving phenotypic diversity across all domains of life. Although repurposing of existing protein-coding material by duplication is considered the main process of new gene formation, recent studies have discovered thousands of transcriptionally active sequences as a rich source of new genes. However, differential loss rates have to be assumed to reconcile the high birth rates of these incipient de novo genes with the dominance of ancient gene families in individual genomes. Here, we test this rapid turnover hypothesis in the context of the nematode model organism Pristionchus pacificus. We extended the existing species-level phylogenomic framework by sequencing the genomes of six divergent P. pacificus strains. We used these data to study the evolutionary dynamics of different age classes and categories of origin at a population level. Contrasting de novo candidates with new families that arose by duplication and divergence from known genes, we find that de novo candidates are typically shorter, show less expression, and are overrepresented on the sex chromosome. Although the contribution of de novo candidates increases toward young age classes, multiple comparisons within the same age class showed significantly higher attrition in de novo candidates than in known genes. Similarly, young genes remain under weak evolutionary constraints with de novo candidates representing the fastest evolving subcategory. Altogether, this study provides empirical evidence for the rapid turnover hypothesis and highlights the importance of the evolutionary timescale when quantifying the contribution of different mechanisms toward new gene formation.
Affiliation(s)
- Neel Prabh
  - Department for Integrative Evolutionary Biology, Max Planck Institute for Biology, 72076 Tübingen, Germany
- Christian Rödelsperger
  - Department for Integrative Evolutionary Biology, Max Planck Institute for Biology, 72076 Tübingen, Germany
|
91
|
Cardoso-Silva CB, Aono AH, Mancini MC, Sforça DA, da Silva CC, Pinto LR, Adams KL, de Souza AP. Taxonomically Restricted Genes Are Associated With Responses to Biotic and Abiotic Stresses in Sugarcane (Saccharum spp.). Front Plant Sci 2022; 13:923069. [PMID: 35845637 PMCID: PMC9280035 DOI: 10.3389/fpls.2022.923069] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/18/2022] [Accepted: 06/13/2022] [Indexed: 06/15/2023]
Abstract
Orphan genes (OGs) are protein-coding genes that are restricted to particular clades or species and lack homology with genes from other organisms, making their biological functions difficult to predict. OGs can rapidly originate and become functional; consequently, they may support rapid adaptation to environmental changes. Extensive spread of mobile elements and whole-genome duplication occurred in the Saccharum group, which may have contributed to the origin and diversification of OGs in the sugarcane genome. Here, we identified and characterized OGs in sugarcane, examined their expression profiles across tissues and genotypes, and investigated their regulation under varying conditions. We identified 319 OGs in the Saccharum spontaneum genome without detected homology to protein-coding genes in green plants, except those belonging to Saccharinae. Transcriptomic analysis revealed 288 sugarcane OGs with detectable expression levels in at least one tissue or genotype. We observed similar expression patterns of OGs in sugarcane genotypes originating from the closest geographical locations. We also observed tissue-specific expression of some OGs, possibly indicating a complex regulatory process for maintaining diverse functional activity of these genes across sugarcane tissues and genotypes. Sixty-six OGs were differentially expressed under stress conditions, especially cold and osmotic stresses. Gene co-expression network and functional enrichment analyses suggested that sugarcane OGs are involved in several biological mechanisms, including stimulus response and defence mechanisms. These findings provide a valuable genomic resource for sugarcane researchers, especially those interested in selecting stress-responsive genes.
Affiliation(s)
- Cláudio Benício Cardoso-Silva
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
  - Department of Botany, University of British Columbia, Vancouver, BC, Canada
- Alexandre Hild Aono
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Melina Cristina Mancini
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Danilo Augusto Sforça
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
- Carla Cristina da Silva
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
  - Agronomy Department, Federal University of Viçosa (UFV), Viçosa, Brazil
- Luciana Rossini Pinto
  - Sugarcane Research Advanced Centre, Agronomic Institute of Campinas (IAC/APTA), Ribeirão Preto, Brazil
- Keith L. Adams
  - Department of Botany, University of British Columbia, Vancouver, BC, Canada
- Anete Pereira de Souza
  - Center of Molecular Biology and Genetic Engineering (CBMEG), University of Campinas (UNICAMP), Campinas, Brazil
  - Institute of Biology, University of Campinas (UNICAMP), Campinas, Brazil
|
92
|
Weisman CM, Murray AW, Eddy SR. Mixing genome annotation methods in a comparative analysis inflates the apparent number of lineage-specific genes. Curr Biol 2022; 32:2632-2639.e2. [PMID: 35588743 PMCID: PMC9346927 DOI: 10.1016/j.cub.2022.04.085] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 01/04/2022] [Revised: 03/17/2022] [Accepted: 04/21/2022] [Indexed: 12/16/2022]
Abstract
Comparisons of genomes of different species are used to identify lineage-specific genes, those genes that appear unique to one species or clade. Lineage-specific genes are often thought to represent genetic novelty that underlies unique adaptations. Identification of these genes depends not only on genome sequences, but also on inferred gene annotations. Comparative analyses typically use available genomes that have been annotated using different methods, increasing the risk that orthologous DNA sequences may be erroneously annotated as a gene in one species but not another, appearing lineage specific as a result. To evaluate the impact of such "annotation heterogeneity," we identified four clades of species with sequenced genomes with more than one publicly available gene annotation, allowing us to compare the number of lineage-specific genes inferred when differing annotation methods are used to those resulting when annotation method is uniform across the clade. In these case studies, annotation heterogeneity increases the apparent number of lineage-specific genes by up to 15-fold, suggesting that annotation heterogeneity is a substantial source of potential artifact.
Affiliation(s)
- Caroline M Weisman
  - Lewis-Sigler Institute for Integrative Genomics, Carl Icahn Laboratory, Princeton University, South Drive, Princeton, NJ 08540, USA
- Andrew W Murray
  - Department of Molecular & Cellular Biology, Harvard University, Divinity Avenue, Cambridge, MA 02138, USA
- Sean R Eddy
  - Department of Molecular & Cellular Biology, Harvard University, Divinity Avenue, Cambridge, MA 02138, USA; Howard Hughes Medical Institute, Jones Bridge Road, Chevy Chase, MD 20815, USA; John A. Paulson School of Engineering and Applied Sciences, Harvard University, Oxford Street, Cambridge, MA 02138, USA
|
93
|
Raxwal VK, Singh S, Agarwal M, Riha K. Transcriptional and post-transcriptional regulation of young genes in plants. BMC Biol 2022; 20:134. [PMID: 35676681 PMCID: PMC9178820 DOI: 10.1186/s12915-022-01339-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/29/2022] [Accepted: 05/30/2022] [Indexed: 12/03/2022] Open
Abstract
Background: New genes continuously emerge from non-coding DNA or by diverging from existing genes, but most of them are rapidly lost and only a few become fixed within the population. We hypothesized that young genes are subject to transcriptional and post-transcriptional regulation to limit their expression and minimize their exposure to purifying selection.
Results: We performed a protein-based homology search across the tree of life to determine the evolutionary age of protein-coding genes present in the rice genome. We found that young genes in rice have relatively low expression levels, which can be attributed to distal enhancers, and closed chromatin conformation at their transcription start sites (TSS). The chromatin in TSS regions can be re-modeled in response to abiotic stress, indicating conditional expression of young genes. Furthermore, transcripts of young genes in Arabidopsis tend to be targeted by nonsense-mediated RNA decay, presenting another layer of regulation limiting their expression.
Conclusions: These data suggest that transcriptional and post-transcriptional mechanisms contribute to the conditional expression of young genes, which may alleviate purging selection while providing an opportunity for phenotypic exposure and functionalization.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12915-022-01339-7.
Affiliation(s)
- Vivek Kumar Raxwal
  - Department of Botany, University of Delhi, Delhi, 110007, India; Central European Institute of Technology (CEITEC), Masaryk University, Brno, Czech Republic
- Somya Singh
  - Department of Botany, University of Delhi, Delhi, 110007, India
- Manu Agarwal
  - Department of Botany, University of Delhi, Delhi, 110007, India
- Karel Riha
  - Central European Institute of Technology (CEITEC), Masaryk University, Brno, Czech Republic
|
94
|
Suenaga Y, Kato M, Nagai M, Nakatani K, Kogashi H, Kobatake M, Makino T. Open reading frame dominance indicates protein‐coding potential of RNAs. EMBO Rep 2022; 23:e54321. [PMID: 35438231 PMCID: PMC9171421 DOI: 10.15252/embr.202154321] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 11/12/2021] [Revised: 03/24/2022] [Accepted: 03/25/2022] [Indexed: 11/13/2022] Open
Abstract
Recent studies have identified numerous RNAs with both coding and noncoding functions. However, the sequence characteristics that determine this bifunctionality remain largely unknown. In the present study, we develop and test the open reading frame (ORF) dominance score, which we define as the fraction of the longest ORF in the sum of all putative ORF lengths. This score correlates with translation efficiency in coding transcripts and with translation of noncoding RNAs. In bacteria and archaea, coding and noncoding transcripts have narrow distributions of high and low ORF dominance, respectively, whereas those of eukaryotes show relatively broader ORF dominance distributions, with considerable overlap between coding and noncoding transcripts. The extent of overlap positively and negatively correlates with the mutation rate of genomes and the effective population size of species, respectively. Tissue‐specific transcripts show higher ORF dominance than ubiquitously expressed transcripts, and the majority of tissue‐specific transcripts are expressed in mature testes. These data suggest that the decrease in population size and the emergence of testes in eukaryotic organisms allowed for the evolution of potentially bifunctional RNAs.
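The ORF dominance score defined in this abstract (the length of the longest ORF divided by the summed length of all putative ORFs) can be illustrated with a minimal Python sketch. The abstract does not say how putative ORFs are called, so the convention here is an assumption: ORFs run from the first ATG to the next in-frame stop codon, only the three forward frames of the transcript are scanned, and the stop codon counts toward the length. The function names `find_orf_lengths` and `orf_dominance` are hypothetical and not taken from the paper.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orf_lengths(seq):
    """Return lengths (in nt, stop codon included) of ATG-to-stop ORFs
    found in the three forward reading frames of a transcript.

    Assumption: this ORF-calling convention is illustrative only.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA input
    lengths = []
    for frame in range(3):
        start = None  # position of the currently open start codon, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                lengths.append(i + 3 - start)
                start = None  # close the ORF and look for the next start
    return lengths


def orf_dominance(seq):
    """Longest ORF length divided by the sum of all ORF lengths.

    Returns None when the sequence contains no complete ORF.
    """
    lengths = find_orf_lengths(seq)
    if not lengths:
        return None
    return max(lengths) / sum(lengths)


if __name__ == "__main__":
    # Toy transcript with two ORFs in frame 0 (9 nt and 6 nt).
    print(orf_dominance("ATGAAATAAATGTAA"))
```

Under this convention, a transcript with a single ORF scores 1.0, and the score falls as additional ORFs of comparable length appear.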
Affiliation(s)
- Yusuke Suenaga
  - Department of Molecular Carcinogenesis, Chiba Cancer Centre Research Institute, Chiba, Japan
- Mamoru Kato
  - Division of Bioinformatics, National Cancer Centre Research Institute, Tokyo, Japan
- Momoko Nagai
  - Division of Bioinformatics, National Cancer Centre Research Institute, Tokyo, Japan
- Kazuma Nakatani
  - Department of Molecular Carcinogenesis, Chiba Cancer Centre Research Institute, Chiba, Japan
  - Department of Molecular Biology and Oncology, Chiba University School of Medicine, Chiba, Japan
  - Innovative Medicine CHIBA Doctoral WISE Program, Chiba University School of Medicine, Chiba, Japan
- Hiroyuki Kogashi
  - Department of Molecular Carcinogenesis, Chiba Cancer Centre Research Institute, Chiba, Japan
  - Department of Molecular Biology and Oncology, Chiba University School of Medicine, Chiba, Japan
- Miho Kobatake
  - Department of Molecular Carcinogenesis, Chiba Cancer Centre Research Institute, Chiba, Japan
- Takashi Makino
  - Laboratory of Evolutionary Genomics, Graduate School of Life Sciences, Tohoku University, Sendai, Japan
|
95
|
Kosinski LJ, Aviles NR, Gomez K, Masel J. Random peptides rich in small and disorder-promoting amino acids are less likely to be harmful. Genome Biol Evol 2022; 14:evac085. [PMID: 35668555 PMCID: PMC9210321 DOI: 10.1093/gbe/evac085] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2021] [Revised: 04/01/2022] [Accepted: 05/27/2022] [Indexed: 11/15/2022] Open
Abstract
Proteins are the workhorses of the cell, yet they carry great potential for harm via misfolding and aggregation. Despite the dangers, proteins are sometimes born de novo from non-coding DNA. Proteins are more likely to be born from non-coding regions that produce peptides that do little to no harm when translated than from regions that produce harmful peptides. To investigate which newborn proteins are most likely to "first, do no harm", we estimate fitnesses from an experiment that competed Escherichia coli lineages that each expressed a unique random peptide. A variety of peptide metrics significantly predict lineage fitness, but this predictive power stems from simple amino acid frequencies rather than the ordering of amino acids. Amino acids that are smaller and that promote intrinsic structural disorder have more benign fitness effects. We validate that the amino acids that indicate benign effects in random peptides expressed in E. coli also do so in an independent dataset of random N-terminal tags in which it is possible to control for expression level. The same amino acids are also enriched in young animal proteins.
Affiliation(s)
- Luke J Kosinski
  - Department of Molecular and Cellular Biology, University of Arizona, Tucson, USA
- Nathan R Aviles
  - Graduate Interdisciplinary Program in Statistics, University of Arizona, Tucson, USA
- Kevin Gomez
  - Graduate Interdisciplinary Program in Applied Math, University of Arizona, Tucson, USA
- Joanna Masel
  - Department of Ecology and Evolutionary Biology, University of Arizona, Tucson, USA
|
96
|
Lee BY, Kim J, Lee J. Intraspecific de novo gene birth revealed by presence-absence variant genes in Caenorhabditis elegans. NAR Genom Bioinform 2022; 4:lqac031. [PMID: 35464238 PMCID: PMC9022459 DOI: 10.1093/nargab/lqac031] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 09/29/2021] [Revised: 03/30/2022] [Accepted: 04/13/2022] [Indexed: 12/24/2022] Open
Abstract
Genes embed their evolutionary history in the form of various alleles. Presence-absence variants (PAVs) are extreme cases of such alleles, where a gene present in one haplotype does not exist in another. Because PAVs may result from either birth or death of a gene, PAV genes and their alternative alleles, if available, can represent a basis for rapid intraspecific gene evolution. Using long-read sequencing technologies, this study traced the possible evolution of PAV genes in the PD1074 and CB4856 C. elegans strains as well as their alternative alleles in 14 other wild strains. We updated the CB4856 genome by filling 18 gaps and identified 46 genes and 7,460 isoforms from both strains not annotated previously. We verified 328 PAV genes, out of which 46 were C. elegans-specific. Among these possible newly born genes, 12 had alternative alleles in other wild strains; in particular, the alternative alleles of three genes showed signatures of active transposons. Alternative alleles of three other genes showed another type of signature reflected in accumulation of small insertions or deletions. Research on gene evolution using both species-specific PAV genes and their alternative alleles may provide new insights into the process of gene evolution.
Affiliation(s)
- Bo Yun Lee
  - Research Institute of Basic Sciences, Seoul National University, Seoul 08826, Korea
  - Institute of Molecular Biology and Genetics, Seoul National University, Seoul 08826, Korea
- Jun Kim
  - Research Institute of Basic Sciences, Seoul National University, Seoul 08826, Korea
  - Department of Biological Sciences, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul 08826, Korea
- Junho Lee
  - Research Institute of Basic Sciences, Seoul National University, Seoul 08826, Korea
  - Institute of Molecular Biology and Genetics, Seoul National University, Seoul 08826, Korea
  - Department of Biological Sciences, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul 08826, Korea
|
97
|
Li Q, Lindtke D, Rodríguez-Ramírez C, Kakioka R, Takahashi H, Toyoda A, Kitano J, Ehrlich RL, Chang Mell J, Yeaman S. Local Adaptation and the Evolution of Genome Architecture in Threespine Stickleback. Genome Biol Evol 2022; 14:6589818. [PMID: 35594844 PMCID: PMC9178229 DOI: 10.1093/gbe/evac075] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Accepted: 05/16/2022] [Indexed: 12/11/2022] Open
Abstract
Theory predicts that local adaptation should favor the evolution of a concentrated genetic architecture, where the alleles driving adaptive divergence are tightly clustered on chromosomes. Adaptation to marine versus freshwater environments in threespine stickleback has resulted in an architecture that seems consistent with this prediction: divergence among populations is mainly driven by a few genomic regions harboring multiple quantitative trait loci for environmentally adapted traits, as well as candidate genes with well-established phenotypic effects. One theory for the evolution of these "genomic islands" is that rearrangements remodel the genome to bring causal loci into tight proximity, but this has not been studied explicitly. We tested this theory using synteny analysis to identify micro- and macro-rearrangements in the stickleback genome and assess their potential involvement in the evolution of genomic islands. To identify rearrangements, we conducted a de novo assembly of the closely related tubesnout (Aulorhynchus flavidus) genome and compared this to the genomes of threespine stickleback and two other closely related species. We found that small rearrangements, within-chromosome duplications, and lineage-specific genes (LSGs) were enriched around genomic islands, and that all three chromosomes harboring large genomic islands have experienced macro-rearrangements. We also found that duplicates and micro-rearrangements are 9.9× and 2.9× more likely to involve genes differentially expressed between marine and freshwater genotypes. While not conclusive, these results are consistent with the explanation that strong divergent selection on candidate genes drove the recruitment of rearrangements to yield clusters of locally adaptive loci.
Affiliation(s)
- Qiushi Li
  - Department of Biological Sciences, University of Calgary, 2500 University Drive NW, Calgary, Canada T2N 1N4
- Dorothea Lindtke
  - Department of Biological Sciences, University of Calgary, 2500 University Drive NW, Calgary, Canada T2N 1N4
- Carlos Rodríguez-Ramírez
  - Division of Evolutionary Ecology, Institute of Ecology and Evolution, University of Bern, Bern, Switzerland
- Ryo Kakioka
  - Tropical Biosphere Research Center, University of the Ryukyus, Nishihara, Nakagami-gun, Okinawa 903-0213, Japan
- Hiroshi Takahashi
  - National Fisheries University, 2-7-1 Nagata-honmachi, Shimonoseki, Yamaguchi 759-6595, Japan
- Atsushi Toyoda
  - Comparative Genomics Laboratory, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
- Jun Kitano
  - Ecological Genetics Laboratory, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
- Rachel L Ehrlich
  - Department of Microbiology & Immunology, Drexel University College of Medicine, Philadelphia 19102, PA, USA
- Joshua Chang Mell
  - Department of Microbiology & Immunology, Drexel University College of Medicine, Philadelphia 19102, PA, USA
- Sam Yeaman
  - Department of Biological Sciences, University of Calgary, 2500 University Drive NW, Calgary, Canada T2N 1N4
|
98
|
Maltseva AL, Lobov AA, Pavlova PA, Panova M, Gafarova ER, Marques JP, Danilov LG, Granovitch AI. Orphan gene in Littorina: An unexpected role of symbionts in the host evolution. Gene 2022; 824:146389. [PMID: 35257790 DOI: 10.1016/j.gene.2022.146389] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/07/2021] [Revised: 01/29/2022] [Accepted: 02/28/2022] [Indexed: 11/16/2022]
Abstract
Mechanisms of reproductive isolation between closely related sympatric species are of high evolutionary significance as they may function as initial drivers of speciation and protect species integrity afterwards. Proteins involved in the establishment of reproductive barriers often evolve fast and may be key players in cessation of gene flow between the incipient species. The five Atlantic Littorina (Neritrema) species represent a notable example of recent radiation. The geographic ranges of these young species largely overlap and the mechanisms of reproductive isolation are poorly understood. In this study, we performed a detailed analysis of the reproductive protein LOSP, previously identified in Littorina. We showed that this protein is evolutionarily young and taxonomically restricted to the genus Littorina. It has high sequence variation both within and between Littorina species, which is compatible with its presumed role in reproductive isolation. The strongest differences in the LOSP structure were detected between Littorina subgenera, with distinctive repetitive motifs present exclusively in the Neritrema species, but not in L. littorea. Moreover, the sequence of these repetitive structural elements demonstrates a high homology with genetic elements of bacteria, identified as components of Littorina-associated microbiomes. We suggest that these elements were acquired from a symbiotic bacterial donor via horizontal genetic transfer (HGT), which is indirectly confirmed by the presence of multiple transposable elements in the LOSP flanking and intronic regions. Furthermore, we hypothesize that this HGT-driven evolutionary innovation promoted LOSP function in reproductive isolation, which might be one of the factors determining the intensive cladogenesis in the Littorina (Neritrema) lineage in contrast to the anagenesis in the L. littorea clade.
Affiliation(s)
- A L Maltseva
  - Department of Invertebrate Zoology, St Petersburg State University, St Petersburg, Russia
- A A Lobov
  - Department of Invertebrate Zoology, St Petersburg State University, St Petersburg, Russia; Laboratory of Regenerative Biomedicine, Institute of Cytology Russian Academy of Sciences, St Petersburg, Russia
- P A Pavlova
  - Department of Invertebrate Zoology, St Petersburg State University, St Petersburg, Russia
- M Panova
  - Department of Invertebrate Zoology, St Petersburg State University, St Petersburg, Russia; Department of Marine Sciences - Tjärnö, University of Gothenburg, Sweden
- E R Gafarova
  - Department of Invertebrate Zoology, St Petersburg State University, St Petersburg, Russia
- J P Marques
  - CIBIO, Centro de Investigação em Biodiversidade e Recursos Genéticos, InBIO Laboratório Associado, Universidade do Porto, Vairão, Portugal; Departamento de Biologia, Faculdade de Ciências do Porto, 4169-007 Porto, Portugal; ISEM, Univ Montpellier, CNRS, EPHE, IRD, 34095 Montpellier, France
- L G Danilov
  - Department of Genetics and Biotechnology, St. Petersburg State University, St. Petersburg, Russia
- A I Granovitch
  - Department of Invertebrate Zoology, St Petersburg State University, St Petersburg, Russia
|
99
|
Delihas N. An ancestral genomic sequence that serves as a nucleation site for de novo gene birth. PLoS One 2022; 17:e0267864. [PMID: 35552551 PMCID: PMC9097989 DOI: 10.1371/journal.pone.0267864] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/24/2022] [Accepted: 04/17/2022] [Indexed: 11/24/2022] Open
Abstract
The process of gene birth is of major interest with current excitement concerning de novo gene formation. We report a new and different mechanism of de novo gene birth based on the finding and the characteristics of a short non-coding sequence situated between two protein genes, termed a spacer sequence. This non-coding sequence is present in genomes of Mus musculus, the house mouse and Philippine tarsier, a primitive ancestral primate. The ancestral sequence is highly conserved during primate evolution with certain base pairs totally invariant from mouse to humans. By following the birth of the sequence of human lincRNA BCRP3 (BCR activator of RhoGEF and GTPase 3 pseudogene) during primate evolution, we find diverse genes, long non-coding RNA and protein genes (and sequences that do not appear to encode a gene) that all stem from the 3’ end of the spacer, and all begin with a similar sequence. During primate evolution, part of the BCRP3 sequence initially formed in the Old World Monkeys and developed into different primate genes before evolving into the BCRP3 gene in humans. The gene developmental process consists of the initiation of DNA synthesis at spacer 3’ ends, addition of a complex of tandem transposable elements and the addition of a segment of another gene. The findings support the concept of the spacer sequence as a starting site for DNA synthesis that leads to formation of different genes with the addition of other sequences. These data suggest a new process of de novo gene birth.
Affiliation(s)
- Nicholas Delihas
  - Department of Microbiology and Immunology, Renaissance School of Medicine, Stony Brook University, Stony Brook, New York, United States of America
|
100
|
A novel regulatory gene promotes novel cell fate by suppressing ancestral fate in the sea anemone Nematostella vectensis. Proc Natl Acad Sci U S A 2022; 119:e2113701119. [PMID: 35500123 PMCID: PMC9172639 DOI: 10.1073/pnas.2113701119] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Indexed: 12/20/2022] Open
Abstract
In this study, we demonstrate how a new cell type can arise through duplication of an ancestral cell type followed by functional divergence of the new daughter cell. Specifically, we show that stinging cells in a cnidarian (namely, a sea anemone) emerged by duplication of an ancestral neuron followed by inhibition of the RFamide neuropeptide it once secreted. This finding is evidence that stinging cells evolved from a specific subtype of neurons and suggests other neuronal subtypes may have been coopted for other novel secretory functions. Cnidocytes (i.e., stinging cells) are an unequivocally novel cell type used by cnidarians (i.e., corals, jellyfish, and their kin) to immobilize prey. Although they are known to share a common evolutionary origin with neurons, the developmental program that promoted the emergence of cnidocyte fate is not known. Using functional genomics in the sea anemone, Nematostella vectensis, we show that cnidocytes develop by suppression of neural fate in a subset of neurons expressing RFamide. We further show that a single regulatory gene, a C2H2-type zinc finger transcription factor (ZNF845), coordinates both the gain of novel (cnidocyte-specific) traits and the inhibition of ancestral (neural) traits during cnidocyte development and that this gene arose by domain shuffling in the stem cnidarian. Thus, we report a mechanism by which a truly novel regulatory gene (ZNF845) promotes the development of a truly novel cell type (cnidocyte) through duplication of an ancestral cell lineage (neuron) and inhibition of its ancestral identity (RFamide).
|