1
|
Lee U, Mozeika SM, Zhao L. A Synergistic, Cultivator Model of De Novo Gene Origination. Genome Biol Evol 2024; 16:evae103. [PMID: 38748819 PMCID: PMC11152449 DOI: 10.1093/gbe/evae103] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 05/12/2024] [Indexed: 06/07/2024] Open
Abstract
The origin and fixation of evolutionarily young genes is a fundamental question in evolutionary biology. However, understanding the origins of newly evolved genes arising de novo from noncoding genomic sequences is challenging. This is partly due to the low likelihood that several neutral or nearly neutral mutations fix prior to the appearance of an important novel molecular function. This issue is particularly exacerbated in large effective population sizes where the effect of drift is small. To address this problem, we propose a regulation-focused, cultivator model for de novo gene evolution. This cultivator-focused model posits that each step in a novel variant's evolutionary trajectory is driven by well-defined, selectively advantageous functions for the cultivator genes, rather than solely by the de novo genes, emphasizing the critical role of genome organization in the evolution of new genes.
Affiliation(s)
- UnJin Lee
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
- Shawn M Mozeika
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
- Li Zhao
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
|
2
|
James C, Trevisan-Herraz M, Juan D, Rico D. Evolutionary analysis of gene ages across TADs associates chromatin topology with whole-genome duplications. Cell Rep 2024; 43:113895. [PMID: 38517894 DOI: 10.1016/j.celrep.2024.113895] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/20/2022] [Revised: 11/03/2023] [Accepted: 02/16/2024] [Indexed: 03/24/2024] Open
Abstract
Topologically associated domains (TADs) are interaction subnetworks of chromosomal regions in 3D genomes. TAD boundaries frequently coincide with genome breaks while boundary deletion is under negative selection, suggesting that TADs may facilitate genome rearrangements and evolution. We show that genes co-localize by evolutionary age in humans and mice, resulting in TADs having different proportions of younger and older genes. We observe a major transition in the age co-localization patterns between the genes born during vertebrate whole-genome duplications (WGDs) or before and those born afterward. We also find that genes recently duplicated in primates and rodents are more frequently essential when they are located in old-enriched TADs and interact with genes that last duplicated during the WGD. Therefore, the evolutionary relevance of recent genes may increase when located in TADs with established regulatory networks. Our data suggest that TADs could play a role in organizing ancestral functions and evolutionary novelty.
Affiliation(s)
- Caelinn James
  - Biosciences Institute, Newcastle University, Newcastle Upon Tyne, UK; Scotland's Rural College (SRUC), The Roslin Institute Building, Easter Bush, Midlothian, UK
- Marco Trevisan-Herraz
  - Biosciences Institute, Newcastle University, Newcastle Upon Tyne, UK; Translational and Clinical Research Institute, Newcastle University, Newcastle Upon Tyne, UK
- David Juan
  - Institut de Biologia Evolutiva, Consejo Superior de Investigaciones Científicas-Universitat Pompeu Fabra, Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain; Systems Biology Department, Spanish National Centre for Biotechnology (CNB-CSIC), Madrid, Spain
- Daniel Rico
  - Biosciences Institute, Newcastle University, Newcastle Upon Tyne, UK; Centro Andaluz de Biología Molecular y Medicina Regenerativa (CABIMER), CSIC-Universidad de Sevilla-Universidad Pablo de Olavide-Junta de Andalucía, Seville, Spain
|
3
|
Liu X, Xiao C, Xu X, Zhang J, Mo F, Chen JY, Delihas N, Zhang L, An NA, Li CY. Origin of functional de novo genes in humans from "hopeful monsters". Wiley Interdiscip Rev RNA 2024; 15:e1845. [PMID: 38605485 DOI: 10.1002/wrna.1845] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2023] [Revised: 03/13/2024] [Accepted: 03/18/2024] [Indexed: 04/13/2024]
Abstract
For a long time, it was believed that new genes arise only from modifications of preexisting genes, but the discovery of de novo protein-coding genes that originated from noncoding DNA regions demonstrates the existence of a "motherless" origination process for new genes. However, the features, distributions, expression profiles, and origin modes of these genes in humans seem to support the notion that their origin is not a purely "motherless" process; rather, these genes arise preferentially from genomic regions encoding preexisting precursors with gene-like features. In such a case, the gene loci are typically not brand new. In this short review, we will summarize the definition and features of human de novo genes and clarify their process of origination from ancestral non-coding genomic regions. In addition, we define the favored precursors, or "hopeful monsters," for the origin of de novo genes and present a discussion of the functional significance of these young genes in brain development and tumorigenesis in humans. This article is categorized under: RNA Evolution and Genomics > RNA and Ribonucleoprotein Evolution.
Affiliation(s)
- Xiaoge Liu
  - State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
- Chunfu Xiao
  - State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
- Xinwei Xu
  - State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
- Jie Zhang
  - State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
- Fan Mo
  - State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Stem Cell and Regeneration, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Jia-Yu Chen
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Chemistry and Biomedicine Innovation Center (ChemBIC), Nanjing University, Nanjing, China
- Nicholas Delihas
  - Department of Microbiology and Immunology, Renaissance School of Medicine, Stony Brook University, Stony Brook, New York, USA
- Li Zhang
  - Chinese Institute for Brain Research, Beijing, China
- Ni A An
  - State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
- Chuan-Yun Li
  - State Key Laboratory of Protein and Plant Gene Research, Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, College of Future Technology, Peking University, Beijing, China
  - Chinese Institute for Brain Research, Beijing, China
  - Southwest United Graduate School, Kunming, China
|
4
|
Peng J, Zhao L. The origin and structural evolution of de novo genes in Drosophila. Nat Commun 2024; 15:810. [PMID: 38280868 PMCID: PMC10821953 DOI: 10.1038/s41467-024-45028-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Accepted: 01/09/2024] [Indexed: 01/29/2024] Open
Abstract
Recent studies reveal that de novo gene origination from previously non-genic sequences is a common mechanism for gene innovation. These young genes provide an opportunity to study the structural and functional origins of proteins. Here, we combine high-quality base-level whole-genome alignments and computational structural modeling to study the origination, evolution, and protein structures of lineage-specific de novo genes. We identify 555 de novo gene candidates in D. melanogaster that originated within the Drosophilinae lineage. Sequence composition, evolutionary rates, and expression patterns indicate possible gradual functional or adaptive shifts with their gene ages. Surprisingly, we find little overall protein structural changes in candidates from the Drosophilinae lineage. We identify several candidates with potentially well-folded protein structures. Ancestral sequence reconstruction analysis reveals that most potentially well-folded candidates are often born well-folded. Single-cell RNA-seq analysis in testis shows that although most de novo gene candidates are enriched in spermatocytes, several young candidates are biased towards the early spermatogenesis stage, indicating potentially important but less emphasized roles of early germline cells in the de novo gene origination in testis. This study provides a systematic overview of the origin, evolution, and protein structural changes of Drosophilinae-specific de novo genes.
Affiliation(s)
- Junhui Peng
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
- Li Zhao
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
|
5
|
Mani S, Tlusty T. Gene birth in a model of non-genic adaptation. BMC Biol 2023; 21:257. [PMID: 37957718 PMCID: PMC10644530 DOI: 10.1186/s12915-023-01745-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2022] [Accepted: 10/24/2023] [Indexed: 11/15/2023] Open
Abstract
BACKGROUND Over evolutionary timescales, genomic loci can switch between functional and non-functional states through processes such as pseudogenization and de novo gene birth. Particularly, de novo gene birth is a widespread process, and many examples continue to be discovered across diverse evolutionary lineages. However, the general mechanisms that lead to functionalization are poorly understood, and estimated rates of de novo gene birth remain contentious. Here, we address this problem within a model that takes into account mutations and structural variation, allowing us to estimate the likelihood of emergence of new functions at non-functional loci. RESULTS Assuming biologically reasonable mutation rates and mutational effects, we find that functionalization of non-genic loci requires the realization of strict conditions. This is in line with the observation that most de novo genes are localized to the vicinity of established genes. Our model also provides an explanation for the empirical observation that emerging proto-genes are often lost despite showing signs of adaptation. CONCLUSIONS Our work elucidates the properties of non-genic loci that make them fertile for adaptation, and our results offer mechanistic insights into the process of de novo gene birth.
Affiliation(s)
- Somya Mani
  - Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, Republic of Korea
- Tsvi Tlusty
  - Center for Soft and Living Matter, Institute for Basic Science, Ulsan 44919, Republic of Korea
  - Departments of Physics and Chemistry, Ulsan National Institute of Science and Technology (UNIST), Ulsan 44919, Republic of Korea
|
6
|
Lee HK, Willi M, Liu C, Hennighausen L. Cell-specific and shared regulatory elements control a multigene locus active in mammary and salivary glands. Nat Commun 2023; 14:4992. [PMID: 37591874 PMCID: PMC10435465 DOI: 10.1038/s41467-023-40712-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/31/2023] [Accepted: 08/08/2023] [Indexed: 08/19/2023] Open
Abstract
Regulation of high-density loci harboring genes with different cell-specificities remains a puzzle. Here we investigate a locus that evolved through gene duplication and contains eight genes and 20 candidate regulatory elements, including one super-enhancer. Casein genes (Csn1s1, Csn2, Csn1s2a, Csn1s2b, Csn3) are expressed in mammary glands, induced 10,000-fold during pregnancy and account for 50% of mRNAs during lactation; Prr27 and Fdcsp are salivary-specific, and Odam has dual specificity. We probed the function of 12 candidate regulatory elements, individually and in combination, in the mouse genome. The super-enhancer is essential for the expression of Csn3, Csn1s2b, Odam and Fdcsp but largely dispensable for Csn1s1, Csn2 and Csn1s2a. Csn3 activation also requires its own local enhancer. Synergism between local enhancers and cytokine-responsive promoter elements facilitates activation of Csn2 during pregnancy. Our work identifies the regulatory complexity of a multigene locus with an ancestral super-enhancer active in mammary and salivary tissue and local enhancers and promoter elements unique to mammary tissue.
Affiliation(s)
- Hye Kyung Lee
  - Section of Genetics and Physiology, Laboratory of Cellular and Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, US National Institutes of Health, Bethesda, Maryland, 20892, USA
- Michaela Willi
  - Section of Genetics and Physiology, Laboratory of Cellular and Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, US National Institutes of Health, Bethesda, Maryland, 20892, USA
- Chengyu Liu
  - Transgenic Core, National Heart, Lung, and Blood Institute, US National Institutes of Health, Bethesda, Maryland, 20892, USA
- Lothar Hennighausen
  - Section of Genetics and Physiology, Laboratory of Cellular and Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, US National Institutes of Health, Bethesda, Maryland, 20892, USA
|
7
|
Peng J, Zhao L. The origin and structural evolution of de novo genes in Drosophila. bioRxiv 2023:2023.03.13.532420. [PMID: 37425675 PMCID: PMC10326970 DOI: 10.1101/2023.03.13.532420] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Indexed: 07/11/2023]
Abstract
Although previously thought to be unlikely, recent studies have shown that de novo gene origination from previously non-genic sequences is a relatively common mechanism for gene innovation in many species and taxa. These young genes provide a unique set of candidates to study the structural and functional origination of proteins. However, our understanding of their protein structures and how these structures originate and evolve is still limited, due to a lack of systematic studies. Here, we combined high-quality base-level whole genome alignments, bioinformatic analysis, and computational structure modeling to study the origination, evolution, and protein structure of lineage-specific de novo genes. We identified 555 de novo gene candidates in D. melanogaster that originated within the Drosophilinae lineage. We found a gradual shift in sequence composition, evolutionary rates, and expression patterns with their gene ages, which indicates possible gradual shifts or adaptations of their functions. Surprisingly, we found little overall protein structural change for de novo genes in the Drosophilinae lineage. Using AlphaFold2, ESMFold, and molecular dynamics, we identified a number of de novo gene candidates with protein products that are potentially well-folded, many of which are more likely to contain transmembrane and signal proteins compared to other annotated protein-coding genes. Using ancestral sequence reconstruction, we found that most potentially well-folded proteins are often born folded. Interestingly, we observed one case where disordered ancestral proteins become ordered within a relatively short evolutionary time. Single-cell RNA-seq analysis in testis showed that although most de novo genes are enriched in spermatocytes, several young de novo genes are biased in the early spermatogenesis stage, indicating potentially important but less emphasized roles of early germline cells in the de novo gene origination in testis. This study provides a systematic overview of the origin, evolution, and structural changes of Drosophilinae-specific de novo genes.
Affiliation(s)
- Junhui Peng
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
- Li Zhao
  - Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY 10065, USA
|
8
|
Grandchamp A, Kühl L, Lebherz M, Brüggemann K, Parsch J, Bornberg-Bauer E. Population genomics reveals mechanisms and dynamics of de novo expressed open reading frame emergence in Drosophila melanogaster. Genome Res 2023; 33:872-890. [PMID: 37442576 PMCID: PMC10519401 DOI: 10.1101/gr.277482.122] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Accepted: 06/06/2023] [Indexed: 07/15/2023]
Abstract
Novel genes are essential for evolutionary innovations and differ substantially even between closely related species. Recently, multiple studies across many taxa showed that some novel genes arise de novo, that is, from previously noncoding DNA. To characterize the underlying mutations that allowed de novo gene emergence and their order of occurrence, homologous regions must be detected within noncoding sequences in closely related sister genomes. So far, most studies do not detect noncoding homologs of de novo genes because of incomplete assemblies and annotations, and long evolutionary distances separating genomes. Here, we overcome these issues by searching for de novo expressed open reading frames (neORFs), the not-yet fixed precursors of de novo genes that emerged within a single species. We sequenced and assembled genomes with long-read technology and the corresponding transcriptomes from inbred lines of Drosophila melanogaster, derived from seven geographically diverse populations. We found line-specific neORFs in abundance but few neORFs shared by lines, suggesting a rapid turnover. Gain and loss of transcription is more frequent than the creation of ORFs, for example, by forming new start and stop codons. Consequently, the gain of ORFs becomes rate limiting and is frequently the initial step in neORFs emergence. Furthermore, transposable elements (TEs) are major drivers for intragenomic duplications of neORFs, yet TE insertions are less important for the emergence of neORFs. However, highly mutable genomic regions around TEs provide new features that enable gene birth. In conclusion, neORFs have a high birth-death rate, are rapidly purged, but surviving neORFs spread neutrally through populations and within genomes.
Affiliation(s)
- Anna Grandchamp
  - Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
- Lucas Kühl
  - Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
- Marie Lebherz
  - Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
- Kathrin Brüggemann
  - Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
- John Parsch
  - Division of Evolutionary Biology, Faculty of Biology, Ludwig-Maximilians-Universität München, 82152 Munich, Germany
- Erich Bornberg-Bauer
  - Institute for Evolution and Biodiversity, University of Münster, 48149 Münster, Germany
  - Max Planck Institute for Biology Tübingen, Department of Protein Evolution, 72076 Tübingen, Germany
|
9
|
Di Giorgio E, Benetti R, Kerschbamer E, Xodo L, Brancolini C. Super-enhancer landscape rewiring in cancer: The epigenetic control at distal sites. Int Rev Cell Mol Biol 2023; 380:97-148. [PMID: 37657861 DOI: 10.1016/bs.ircmb.2023.03.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/03/2023]
Abstract
Super-enhancers evolve as elements at the top of the hierarchical control of gene expression. They are important end-gatherers of signaling pathways that control stemness, differentiation or adaptive responses. Many epigenetic regulations focus on these regions, and not surprisingly, during the process of tumorigenesis, various alterations can account for their dysfunction. Super-enhancers are emerging as key drivers of the aberrant gene expression landscape that sustains the aggressiveness of cancer cells. In this review, we describe and discuss the structure of super-enhancers, their epigenetic regulation, and the major changes affecting their functionality in cancer.
Affiliation(s)
- Eros Di Giorgio
  - Laboratory of Biochemistry, Department of Medicine, Università degli Studi di Udine, Udine, Italy
- Roberta Benetti
  - Laboratory of Epigenomics, Department of Medicine, Università degli Studi di Udine, Udine, Italy
- Emanuela Kerschbamer
  - Laboratory of Epigenomics, Department of Medicine, Università degli Studi di Udine, Udine, Italy
- Luigi Xodo
  - Laboratory of Biochemistry, Department of Medicine, Università degli Studi di Udine, Udine, Italy
- Claudio Brancolini
  - Laboratory of Epigenomics, Department of Medicine, Università degli Studi di Udine, Udine, Italy
|
10
|
Evolution and implications of de novo genes in humans. Nat Ecol Evol 2023:10.1038/s41559-023-02014-y. [PMID: 36928843 DOI: 10.1038/s41559-023-02014-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2022] [Accepted: 02/06/2023] [Indexed: 03/18/2023]
Abstract
Genes and translated open reading frames (ORFs) that emerged de novo from previously non-coding sequences provide species with opportunities for adaptation. When aberrantly activated, some human-specific de novo genes and ORFs have disease-promoting properties, for instance, driving tumour growth. Thousands of putative de novo coding sequences have been described in humans, but we still do not know what fraction of those ORFs has readily acquired a function. Here, we discuss the challenges and controversies surrounding the detection, mechanisms of origin, annotation, validation and characterization of de novo genes and ORFs. Through manual curation of literature and databases, we provide a thorough table with most de novo genes reported for humans to date. We re-evaluate each locus by tracing the enabling mutations and list proposed disease associations, protein characteristics and supporting evidence for translation and protein detection. This work will support future explorations of de novo genes and ORFs in humans.
|
11
|
Lee HK, Liu C, Hennighausen L. A cytokine-responsive promoter is required for distal enhancer function mediating the hundreds-fold increase in milk protein gene expression during lactation. bioRxiv 2023:2023.02.06.527375. [PMID: 36945539 PMCID: PMC10028739 DOI: 10.1101/2023.02.06.527375] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/10/2023]
Abstract
During lactation, specialized cells in the mammary gland produce milk to nourish the young. Milk protein genes are controlled by distal enhancers activating expression several hundred-fold during lactation. However, the role of promoter elements is not understood. We addressed this issue using the Csn2 gene, which accounts for 10% of mRNA in mammary tissue. We identified STAT5 and other mammary transcription factors binding to three distal candidate enhancers and a cytokine-response promoter element. While deletion of the enhancers or the introduction of an inactivating mutation in a single promoter element had a marginal effect, their combined loss led to a 99.99% reduction of Csn2 expression. Our findings reveal the essential role of a promoter element in the exceptional activation of a milk protein gene and highlight the importance of analyzing regulatory elements in their native genomic context to fully understand the multifaceted functions of enhancer clusters and promoters.
Affiliation(s)
- Hye Kyung Lee
  - Laboratory of Genetics and Physiology, National Institute of Diabetes and Digestive and Kidney Diseases, US National Institutes of Health, Bethesda, Maryland 20892, USA
- Chengyu Liu
  - Transgenic Core, National Heart, Lung, and Blood Institute, US National Institutes of Health, Bethesda, Maryland 20892, USA
- Lothar Hennighausen
  - Laboratory of Genetics and Physiology, National Institute of Diabetes and Digestive and Kidney Diseases, US National Institutes of Health, Bethesda, Maryland 20892, USA
|
12
|
Galupa R, Alvarez-Canales G, Borst NO, Fuqua T, Gandara L, Misunou N, Richter K, Alves MRP, Karumbi E, Perkins ML, Kocijan T, Rushlow CA, Crocker J. Enhancer architecture and chromatin accessibility constrain phenotypic space during Drosophila development. Dev Cell 2023; 58:51-62.e4. [PMID: 36626871 PMCID: PMC9860173 DOI: 10.1016/j.devcel.2022.12.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 7.0] [Received: 08/04/2022] [Revised: 10/18/2022] [Accepted: 12/07/2022] [Indexed: 01/11/2023]
Abstract
Developmental enhancers bind transcription factors and dictate patterns of gene expression during development. Their molecular evolution can underlie phenotypical evolution, but the contributions of the evolutionary pathways involved remain little understood. Here, using mutation libraries in Drosophila melanogaster embryos, we observed that most point mutations in developmental enhancers led to changes in gene expression levels but rarely resulted in novel expression outside of the native pattern. In contrast, random sequences, often acting as developmental enhancers, drove expression across a range of cell types; random sequences including motifs for transcription factors with pioneer activity acted as enhancers even more frequently. Our findings suggest that the phenotypic landscapes of developmental enhancers are constrained by enhancer architecture and chromatin accessibility. We propose that the evolution of existing enhancers is limited in its capacity to generate novel phenotypes, whereas the activity of de novo elements is a primary source of phenotypic novelty.
Affiliation(s)
- Rafael Galupa
  - European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Timothy Fuqua
  - European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Lautaro Gandara
  - European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Natalia Misunou
  - European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Kerstin Richter
  - European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Esther Karumbi
  - European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Tin Kocijan
  - European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Justin Crocker
  - European Molecular Biology Laboratory, 69117 Heidelberg, Germany
|
13
|
An NA, Zhang J, Mo F, Luan X, Tian L, Shen QS, Li X, Li C, Zhou F, Zhang B, Ji M, Qi J, Zhou WZ, Ding W, Chen JY, Yu J, Zhang L, Shu S, Hu B, Li CY. De novo genes with an lncRNA origin encode unique human brain developmental functionality. Nat Ecol Evol 2023; 7:264-278. [PMID: 36593289 PMCID: PMC9911349 DOI: 10.1038/s41559-022-01925-6] [Citation(s) in RCA: 13] [Impact Index Per Article: 13.0] [Received: 11/26/2021] [Accepted: 10/04/2022] [Indexed: 01/03/2023]
Abstract
Human de novo genes can originate from neutral long non-coding RNA (lncRNA) loci and are evolutionarily significant in general, yet how and why this all-or-nothing transition to functionality happens remains unclear. Here, in 74 human/hominoid-specific de novo genes, we identified distinctive U1 elements and RNA splice-related sequences accounting for RNA nuclear export, differentiating mRNAs from lncRNAs, and driving the origin of de novo genes from lncRNA loci. The polymorphic sites facilitating the lncRNA-mRNA conversion through regulating nuclear export are selectively constrained, maintaining a boundary that differentiates mRNAs from lncRNAs. The functional new genes actively passing through it thus showed a mode of pre-adaptive origin, in that they acquire functions along with the achievement of their coding potential. As a proof of concept, we verified the regulations of splicing and U1 recognition on the nuclear export efficiency of one of these genes, the ENSG00000205704, in human neural progenitor cells. Notably, knock-out or over-expression of this gene in human embryonic stem cells accelerates or delays the neuronal maturation of cortical organoids, respectively. The transgenic mice with ectopically expressed ENSG00000205704 showed enlarged brains with cortical expansion. We thus demonstrate the key roles of nuclear export in de novo gene origin. These newly originated genes should reflect the novel uniqueness of human brain development.
Affiliation(s)
- Ni A. An
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
| | - Jie Zhang
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
| | - Fan Mo
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Stem Cell and Regeneration, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Xuke Luan
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
| | - Lu Tian
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
| | - Qing Sunny Shen
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
| | - Xiangshang Li
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
| | - Chunqiong Li
- Chinese Institute for Brain Research, Beijing, China
| | - Fanqi Zhou
- State Key Laboratory of Medical Molecular Biology, Key Laboratory of RNA Regulation and Hematopoiesis, Department of Biochemistry and Molecular Biology, Institute of Basic Medical Sciences, School of Basic Medicine, CAMS and Peking Union Medical College, Beijing, China
| | - Boya Zhang
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Stem Cell and Regeneration, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Mingjun Ji
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
| | - Jianhuan Qi
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Stem Cell and Regeneration, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Wei-Zhen Zhou
- State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
| | - Wanqiu Ding
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
| | - Jia-Yu Chen
- State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Chemistry and Biomedicine Innovation Center (ChemBIC), Nanjing University, Nanjing, China
| | - Jia Yu
- State Key Laboratory of Medical Molecular Biology, Key Laboratory of RNA Regulation and Hematopoiesis, Department of Biochemistry and Molecular Biology, Institute of Basic Medical Sciences, School of Basic Medicine, CAMS and Peking Union Medical College, Beijing, China
| | - Li Zhang
- Chinese Institute for Brain Research, Beijing, China
| | - Shaokun Shu
- Peking University International Cancer Institute, Beijing, China
| | - Baoyang Hu
- State Key Laboratory of Stem Cell and Reproductive Biology, Institute of Stem Cell and Regeneration, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
| | - Chuan-Yun Li
- Laboratory of Bioinformatics and Genomic Medicine, Institute of Molecular Medicine, Peking University, Beijing, China
- Chinese Institute for Brain Research, Beijing, China
| |
|
14
|
Song H, Guo Z, Zhang X, Sui J. De novo genes in Arachis hypogaea cv. Tifrunner: systematic identification, molecular evolution, and potential contributions to cultivated peanut. The Plant Journal: For Cell and Molecular Biology 2022; 111:1081-1095. [PMID: 35748398 DOI: 10.1111/tpj.15875] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2021] [Revised: 06/15/2022] [Accepted: 06/21/2022] [Indexed: 06/15/2023]
Abstract
De novo genes are derived from non-coding sequences, and they can play essential roles in organisms. Cultivated peanut (Arachis hypogaea) is a major oil and protein crop derived from a cross between Arachis duranensis and Arachis ipaensis. However, few de novo genes have been documented in Arachis. Here, we identified 381 de novo genes in A. hypogaea cv. Tifrunner based on comparison with five closely related Arachis species. There are distinct differences in gene expression patterns and gene structures between conserved and de novo genes. The identified de novo genes originated from ancestral sequence regions associated with metabolic and biosynthetic processes, and they were subsequently integrated into existing regulatory networks. De novo paralogs and homoeologs were identified in A. hypogaea cv. Tifrunner. De novo paralogs and homoeologs with conserved expression have mismatching cis-acting elements under normal growth conditions. De novo genes potentially have pluripotent functions in responses to biotic stresses as well as in growth and development based on quantitative trait locus data. This work provides a foundation for future research examining gene birth processes and gene function in Arachis and related taxa.
Affiliation(s)
- Hui Song
- Grassland Agri-husbandry Research Center, College of Grassland Science, Qingdao Agricultural University, Qingdao, China
| | - Zhonglong Guo
- State Key Laboratory of Protein and Plant Gene Research, Peking-Tsinghua Center for Life Sciences, School of Life Sciences and School of Advanced Agricultural Sciences, Peking University, Beijing, China
| | - Xiaojun Zhang
- College of Agronomy, Qingdao Agricultural University, Qingdao, China
| | - Jiongming Sui
- College of Agronomy, Qingdao Agricultural University, Qingdao, China
| |
|
15
|
Raxwal VK, Singh S, Agarwal M, Riha K. Transcriptional and post-transcriptional regulation of young genes in plants. BMC Biol 2022; 20:134. [PMID: 35676681 PMCID: PMC9178820 DOI: 10.1186/s12915-022-01339-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 04/29/2022] [Accepted: 05/30/2022] [Indexed: 12/03/2022] Open
Abstract
Background: New genes continuously emerge from non-coding DNA or by diverging from existing genes, but most of them are rapidly lost and only a few become fixed within the population. We hypothesized that young genes are subject to transcriptional and post-transcriptional regulation to limit their expression and minimize their exposure to purifying selection. Results: We performed a protein-based homology search across the tree of life to determine the evolutionary age of protein-coding genes present in the rice genome. We found that young genes in rice have relatively low expression levels, which can be attributed to distal enhancers, and closed chromatin conformation at their transcription start sites (TSS). The chromatin in TSS regions can be re-modeled in response to abiotic stress, indicating conditional expression of young genes. Furthermore, transcripts of young genes in Arabidopsis tend to be targeted by nonsense-mediated RNA decay, presenting another layer of regulation limiting their expression. Conclusions: These data suggest that transcriptional and post-transcriptional mechanisms contribute to the conditional expression of young genes, which may alleviate purging selection while providing an opportunity for phenotypic exposure and functionalization. Supplementary Information: The online version contains supplementary material available at 10.1186/s12915-022-01339-7.
Affiliation(s)
- Vivek Kumar Raxwal
- Department of Botany, University of Delhi, Delhi, 110007, India
- Central European Institute of Technology (CEITEC), Masaryk University, Brno, Czech Republic
| | - Somya Singh
- Department of Botany, University of Delhi, Delhi, 110007, India
| | - Manu Agarwal
- Department of Botany, University of Delhi, Delhi, 110007, India
| | - Karel Riha
- Central European Institute of Technology (CEITEC), Masaryk University, Brno, Czech Republic
| |
|
16
|
New Genomic Signals Underlying the Emergence of Human Proto-Genes. Genes (Basel) 2022; 13:genes13020284. [PMID: 35205330 PMCID: PMC8871994 DOI: 10.3390/genes13020284] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/17/2021] [Revised: 01/20/2022] [Accepted: 01/24/2022] [Indexed: 12/04/2022] Open
Abstract
De novo genes are novel genes which emerge from non-coding DNA. Little is known so far about how the properties of de novo genes correlate with their age and mechanisms of emergence. In this study, we investigate four related properties: introns, upstream regulatory motifs, 5′ untranslated regions (UTRs) and protein domains, in 23,135 human proto-genes. We found that proto-genes contain introns, whose number and position correlate with the genomic position of proto-gene emergence. The origin of these introns is debated, as our results suggest that 41% of proto-genes might have captured existing introns, and 13.7% of them do not splice the ORF. We show that proto-genes which emerged via overprinting tend to be more enriched in core promoter motifs, while intergenic and intronic genes are more enriched in enhancers, even if the TATA motif is most commonly found upstream in these genes. Intergenic and intronic 5′ UTRs of proto-genes have a lower potential to stabilise mRNA structures than exonic proto-genes and established human genes. Finally, we confirm that proteins expressed by proto-genes gain new putative domains with age. Overall, we find that regulatory motifs inducing transcription and translation of previously non-coding sequences may facilitate proto-gene emergence. Our study demonstrates that introns, 5′ UTRs, and domains have specific properties in proto-genes. We also emphasize that the genomic positions of de novo genes strongly impact these properties.
|
17
|
Abstract
Modern genome-scale methods that identify new genes, such as proteogenomics and ribosome profiling, have revealed, to the surprise of many, that overlap in genes, open reading frames and even coding sequences is widespread and functionally integrated into prokaryotic, eukaryotic and viral genomes. In parallel, the constraints that overlapping regions place on genome sequences and their evolution can be harnessed in bioengineering to build more robust synthetic strains and constructs. With a focus on overlapping protein-coding and RNA-coding genes, this Review examines their discovery, topology and biogenesis in the context of their genome biology. We highlight exciting new uses for sequence overlap to control translation, compress synthetic genetic constructs, and protect against mutation.
|
18
|
Cherezov RO, Vorontsova JE, Simonova OB. The Phenomenon of Evolutionary “De Novo Generation” of Genes. Russ J Dev Biol 2021. [DOI: 10.1134/s1062360421060035] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
19
|
Santiago-Algarra D, Souaid C, Singh H, Dao LTM, Hussain S, Medina-Rivera A, Ramirez-Navarro L, Castro-Mondragon JA, Sadouni N, Charbonnier G, Spicuglia S. Epromoters function as a hub to recruit key transcription factors required for the inflammatory response. Nat Commun 2021; 12:6660. [PMID: 34795220 PMCID: PMC8602369 DOI: 10.1038/s41467-021-26861-0] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/23/2020] [Accepted: 10/14/2021] [Indexed: 12/14/2022] Open
Abstract
Gene expression is controlled by the involvement of gene-proximal (promoters) and distal (enhancers) regulatory elements. Our previous results demonstrated that a subset of gene promoters, termed Epromoters, work as bona fide enhancers and regulate distal gene expression. Here, we hypothesized that Epromoters play a key role in the coordination of rapid gene induction during the inflammatory response. Using a high-throughput reporter assay we explored the function of Epromoters in response to type I interferon. We find that clusters of IFNa-induced genes are frequently associated with Epromoters and that these regulatory elements preferentially recruit the STAT1/2 and IRF transcription factors and distally regulate the activation of interferon-response genes. Consistently, we identified and validated the involvement of Epromoter-containing clusters in the regulation of LPS-stimulated macrophages. Our findings suggest that Epromoters function as a local hub recruiting the key TFs required for coordinated regulation of gene clusters during the inflammatory response.
Affiliation(s)
- David Santiago-Algarra
- Aix-Marseille University, INSERM, TAGC, UMR 1090, Marseille, France
- Equipe Labellisée Ligue Contre le Cancer, Paris, France
| | - Charbel Souaid
- Aix-Marseille University, INSERM, TAGC, UMR 1090, Marseille, France
- Equipe Labellisée Ligue Contre le Cancer, Paris, France
| | - Himanshu Singh
- Aix-Marseille University, INSERM, TAGC, UMR 1090, Marseille, France
- Equipe Labellisée Ligue Contre le Cancer, Paris, France
| | - Lan T M Dao
- Aix-Marseille University, INSERM, TAGC, UMR 1090, Marseille, France
- Equipe Labellisée Ligue Contre le Cancer, Paris, France
- Vinmec Research Institute of Stem cell and Gene technology, Vinmec Healthcare System, Hanoi, Vietnam
| | - Saadat Hussain
- Aix-Marseille University, INSERM, TAGC, UMR 1090, Marseille, France
- Equipe Labellisée Ligue Contre le Cancer, Paris, France
| | - Alejandra Medina-Rivera
- Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Juriquilla, Mexico
| | - Lucia Ramirez-Navarro
- Laboratorio Internacional de Investigación sobre el Genoma Humano, Universidad Nacional Autónoma de México, Juriquilla, Mexico
| | - Jaime A Castro-Mondragon
- Aix-Marseille University, INSERM, TAGC, UMR 1090, Marseille, France
- Centre for Molecular Medicine Norway (NCMM), Nordic EMBL Partnership, University of Oslo, 0318, Oslo, Norway
| | - Nori Sadouni
- Aix-Marseille University, INSERM, TAGC, UMR 1090, Marseille, France
- Equipe Labellisée Ligue Contre le Cancer, Paris, France
| | - Guillaume Charbonnier
- Aix-Marseille University, INSERM, TAGC, UMR 1090, Marseille, France
- Equipe Labellisée Ligue Contre le Cancer, Paris, France
| | - Salvatore Spicuglia
- Aix-Marseille University, INSERM, TAGC, UMR 1090, Marseille, France
- Equipe Labellisée Ligue Contre le Cancer, Paris, France
| |
|
20
|
Yates TB, Feng K, Zhang J, Singan V, Jawdy SS, Ranjan P, Abraham PE, Barry K, Lipzen A, Pan C, Schmutz J, Chen JG, Tuskan GA, Muchero W. The Ancient Salicoid Genome Duplication Event: A Platform for Reconstruction of De Novo Gene Evolution in Populus trichocarpa. Genome Biol Evol 2021; 13:evab198. [PMID: 34469536 PMCID: PMC8445398 DOI: 10.1093/gbe/evab198] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Accepted: 08/22/2021] [Indexed: 12/13/2022] Open
Abstract
Orphan genes are characteristic genomic features that have no detectable homology to genes in any other species and represent an important attribute of genome evolution as sources of novel genetic functions. Here, we identified 445 genes specific to Populus trichocarpa. Of these, we performed deeper reconstruction of 13 orphan genes to provide evidence of de novo gene evolution. Populus and its sister genus Salix are particularly well suited for the study of orphan gene evolution because of the Salicoid whole-genome duplication event which resulted in highly syntenic sister chromosomal segments across the Salicaceae. We leveraged this genomic feature to reconstruct de novo gene evolution from intergenera, interspecies, and intragenomic perspectives by comparing the syntenic regions within the P. trichocarpa reference, then P. deltoides, and finally Salix purpurea. Furthermore, we demonstrated that 86.5% of the putative orphan genes had evidence of transcription. We also utilized the Populus genome-wide association mapping panel, a collection of 1,084 undomesticated P. trichocarpa genotypes, to further determine putative regulatory networks of orphan genes using expression quantitative trait loci (eQTL) mapping. Functional enrichment of these eQTL subnetworks identified common biological themes associated with orphan genes such as response to stress and defense response. We also identify a putative cis-element for a de novo gene and leverage conserved synteny to describe evolution of a putative transcription factor binding site. Overall, 45% of orphan genes were captured in trans-eQTL networks.
Affiliation(s)
- Timothy B Yates
- Bredesen Center for Interdisciplinary Research, University of Tennessee, Knoxville, Tennessee, USA
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| | - Kai Feng
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| | - Jin Zhang
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| | - Vasanth Singan
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Sara S Jawdy
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| | - Priya Ranjan
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| | - Paul E Abraham
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| | - Kerrie Barry
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Anna Lipzen
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
| | - Chongle Pan
- School of Computer Science and Department of Microbiology and Plant Biology, University of Oklahoma, Norman, Oklahoma, USA
| | - Jeremy Schmutz
- U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory, Berkeley, California, USA
- HudsonAlpha Institute for Biotechnology, Huntsville, Alabama, USA
| | - Jin-Gui Chen
- Bredesen Center for Interdisciplinary Research, University of Tennessee, Knoxville, Tennessee, USA
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| | - Gerald A Tuskan
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| | - Wellington Muchero
- Bredesen Center for Interdisciplinary Research, University of Tennessee, Knoxville, Tennessee, USA
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
- Center for Bioenergy Innovation, Oak Ridge, Tennessee, USA
| |
|
21
|
Abstract
Because gene expression is important for evolutionary adaptation, its misregulation is an important cause of maladaptation. A misregulated gene can be incorrectly silent ("off") when a transcription factor (TF) that is required for its activation does not bind its regulatory region. Conversely, a misregulated gene can be incorrectly active ("on") when a TF not normally involved in its activation binds its regulatory region, a phenomenon also known as regulatory crosstalk. DNA mutations that destroy or create TF binding sites on DNA are an important source of misregulation and crosstalk. Although misregulation reduces fitness in an environment to which an organism is well-adapted, it may become adaptive in a new environment. Here, I derive simple yet general mathematical expressions that delimit the conditions under which misregulation can be adaptive. These expressions depend on the strength of selection against misregulation, on the fraction of DNA sequence space filled with TF binding sites, and on the fraction of genes that must be expressed for optimal adaptation. I then use empirical data from RNA sequencing, protein-binding microarrays, and genome evolution, together with population genetic simulations to ask when these conditions are likely to be met. I show that they can be met under realistic circumstances, but these circumstances may vary among organisms and environments. My analysis provides a framework in which improved theory and data collection can help us demonstrate the role of misregulation in adaptation. It also shows that misregulation, like DNA mutation, is one of life's many imperfections that can help propel Darwinian evolution.
Affiliation(s)
- Andreas Wagner
- Department of Evolutionary Biology and Environmental Studies, University of Zurich, Zurich, CH-8057, Switzerland
- The Santa Fe Institute, Santa Fe, NM 87501, USA
- Swiss Institute of Bioinformatics, Lausanne, Switzerland
| |
|
22
|
Hata T, Takada N, Hayakawa C, Kazama M, Uchikoba T, Tachikawa M, Matsuo M, Satoh S, Obokata J. De novo activated transcription of inserted foreign coding sequences is inheritable in the plant genome. PLoS One 2021; 16:e0252674. [PMID: 34111139 PMCID: PMC8191969 DOI: 10.1371/journal.pone.0252674] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/23/2020] [Accepted: 05/19/2021] [Indexed: 01/16/2023] Open
Abstract
The manner in which inserted foreign coding sequences become transcriptionally activated and fixed in the plant genome is poorly understood. To examine such processes of gene evolution, we performed an artificial evolutionary experiment in Arabidopsis thaliana. As a model of gene-birth events, we introduced a promoterless coding sequence of the firefly luciferase (LUC) gene and established 386 T2-generation transgenic lines. Among them, we determined the individual LUC insertion loci in 76 lines and found that one-third of them were transcribed de novo even in the intergenic or inherently unexpressed regions. In the transcribed lines, transcription-related chromatin marks were detected across the newly activated transcribed regions. These results agreed with our previous findings in A. thaliana cultured cells under a similar experimental scheme. A comparison of the results of the T2-plant and cultured cell experiments revealed that the de novo-activated transcription concomitant with local chromatin remodelling was inheritable. During one-generation inheritance, it seems likely that the transcription activities of the LUC inserts trapped by the endogenous genes/transcripts became stronger, while those of de novo transcription in the intergenic/untranscribed regions became weaker. These findings may offer a clue for the elucidation of the mechanism by which inserted foreign coding sequences become transcriptionally activated and fixed in the plant genome.
Affiliation(s)
- Takayuki Hata
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
- Faculty of Agriculture, Setsunan University, Hirakata-shi, Osaka, Japan
| | - Naoto Takada
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Chihiro Hayakawa
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Mei Kazama
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Tomohiro Uchikoba
- Faculty of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Makoto Tachikawa
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Mitsuhiro Matsuo
- Faculty of Agriculture, Setsunan University, Hirakata-shi, Osaka, Japan
| | - Soichirou Satoh
- Graduate School of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
- Faculty of Life and Environmental Sciences, Kyoto Prefectural University, Kyoto-shi, Kyoto, Japan
| | - Junichi Obokata
- Faculty of Agriculture, Setsunan University, Hirakata-shi, Osaka, Japan
| |
|
23
|
Witt E, Svetec N, Benjamin S, Zhao L. Transcription Factors Drive Opposite Relationships between Gene Age and Tissue Specificity in Male and Female Drosophila Gonads. Mol Biol Evol 2021; 38:2104-2115. [PMID: 33481021 PMCID: PMC8097261 DOI: 10.1093/molbev/msab011] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 12/21/2022] Open
Abstract
Evolutionarily young genes are usually preferentially expressed in the testis across species. Although it is known that older genes are generally more broadly expressed than younger genes, the properties that shaped this pattern are unknown. Older genes may gain expression across other tissues uniformly, or faster in certain tissues than others. Using Drosophila gene expression data, we confirmed previous findings that younger genes are disproportionately testis biased and older genes are disproportionately ovary biased. We found that the relationship between gene age and expression is stronger in the ovary than any other tissue and weakest in testis. We performed ATAC-seq on Drosophila testis and found that although genes of all ages are more likely to have open promoter chromatin in testis than in ovary, promoter chromatin alone does not explain the ovary bias of older genes. Instead, we found that upstream transcription factor (TF) expression is highly predictive of gene expression in ovary but not in testis. In the ovary, TF expression is more predictive of gene expression than open promoter chromatin, whereas testis gene expression is similarly influenced by both TF expression and open promoter chromatin. We propose that the testis is uniquely able to express younger genes controlled by relatively few TFs, whereas older genes with more TF partners are broadly expressed with peak expression most likely in the ovary. The testis allows widespread baseline expression that is relatively unresponsive to regulatory changes, whereas the ovary transcriptome is more responsive to trans-regulation and has a higher ceiling for gene expression.
Affiliation(s)
- Evan Witt
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
| | - Nicolas Svetec
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
| | - Sigi Benjamin
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
| | - Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, USA
| |
|
24
|
Uncovering de novo gene birth in yeast using deep transcriptomics. Nat Commun 2021; 12:604. [PMID: 33504782 PMCID: PMC7841160 DOI: 10.1038/s41467-021-20911-3] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 12/17/2019] [Accepted: 01/04/2021] [Indexed: 01/30/2023] Open
Abstract
De novo gene origination has been recently established as an important mechanism for the formation of new genes. In organisms with a large genome, intergenic and intronic regions provide plenty of raw material for new transcriptional events to occur, but little is known about how de novo transcripts originate in more densely packed genomes. Here, we identify 213 de novo originated transcripts in Saccharomyces cerevisiae using deep transcriptomics and genomic synteny information from multiple yeast species grown in two different conditions. We find that about half of the de novo transcripts are expressed from regions which already harbor other genes in the opposite orientation; these transcripts show similar expression changes in response to stress as their overlapping counterparts, and some appear to translate small proteins. Thus, a large fraction of de novo genes in yeast are likely to co-evolve with already existing genes.
|
25
|
Bylino OV, Ibragimov AN, Shidlovskii YV. Evolution of Regulated Transcription. Cells 2020; 9:E1675. [PMID: 32664620 PMCID: PMC7408454 DOI: 10.3390/cells9071675] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 05/30/2020] [Revised: 07/07/2020] [Accepted: 07/10/2020] [Indexed: 12/12/2022] Open
Abstract
The genomes of all organisms abound with various cis-regulatory elements, which control gene activity. Transcriptional enhancers are a key group of such elements in eukaryotes and are DNA regions that form physical contacts with gene promoters and precisely orchestrate gene expression programs. Here, we follow gradual evolution of this regulatory system and discuss its features in different organisms. In eubacteria, an enhancer-like element is often a single regulatory element, is usually proximal to the core promoter, and is occupied by one or a few activators. Activation of gene expression in archaea is accompanied by the recruitment of an activator to several enhancer-like sites in the upstream promoter region. In eukaryotes, activation of expression is accompanied by the recruitment of activators to multiple enhancers, which may be distant from the core promoter, and the activators act through coactivators. The role of the general DNA architecture in transcription control increases in evolution. As a whole, it can be seen that enhancers of multicellular eukaryotes evolved from the corresponding prototypic enhancer-like regulatory elements with the gradually increasing genome size of organisms.
Affiliation(s)
- Oleg V. Bylino
- Laboratory of Gene Expression Regulation in Development, Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov St., 119334 Moscow, Russia
| | - Airat N. Ibragimov
- Laboratory of Gene Expression Regulation in Development, Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov St., 119334 Moscow, Russia
- Center for Precision Genome Editing and Genetic Technologies for Biomedicine, Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov St., 119334 Moscow, Russia
| | - Yulii V. Shidlovskii
- Laboratory of Gene Expression Regulation in Development, Institute of Gene Biology, Russian Academy of Sciences, 34/5 Vavilov St., 119334 Moscow, Russia
- I.M. Sechenov First Moscow State Medical University, 8, bldg. 2 Trubetskaya St., 119048 Moscow, Russia
| |
|