1
Yang Y, Wen X, Wu Z, Wang K, Zhu Y. Large-scale long terminal repeat insertions produced a significant set of novel transcripts in cotton. SCIENCE CHINA. LIFE SCIENCES 2023; 66:1711-1724. [PMID: 37079218 DOI: 10.1007/s11427-022-2341-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Accepted: 04/03/2023] [Indexed: 04/21/2023]
Abstract
Genomic analysis has revealed that the 1,637-Mb Gossypium arboreum genome contains approximately 81% transposable elements (TEs), while only 57% of the 735-Mb G. raimondii genome is occupied by TEs. In this study, we investigated whether there were unknown transcripts associated with TEs or TE fragments and, if so, how these new transcripts evolved and were regulated. As sequence depths increased from 4 to 100 G, a total of 10,284 novel intergenic transcripts (intergenic genes) were discovered. On average, approximately 84% of these intergenic transcripts possibly overlapped with the long terminal repeat (LTR) insertions in the otherwise untranscribed intergenic regions and were expressed at relatively low levels. Most of these intergenic transcripts possessed no transcription activation markers, while the majority of the regular genic genes possessed at least one such marker. Genes without transcription activation markers formed their +1 and -1 nucleosomes more closely (only (117±1.4) bp apart), while spaces twice as large (approximately (403.5±46.0) bp apart) were detected for genes with the activation markers. The analysis of 183 previously assembled genomes across three different kingdoms demonstrated systematically that intergenic transcript numbers in a given genome correlated positively with its LTR content. Evolutionary analysis revealed that genic genes originated during one of the whole-genome duplication events around 137.7 million years ago (MYA) for all eudicot genomes or 13.7 MYA for the Gossypium family, respectively, while the intergenic transcripts evolved around 1.6 MYA, resulting from the last LTR insertion. The characterization of these low-transcribed intergenic transcripts can facilitate our understanding of the potential biological roles played by LTRs during speciation and diversification.
Affiliation(s)
- Yan Yang
  - Institute for Advanced Studies, Wuhan University, Wuhan, 430072, China
- Xingpeng Wen
  - Institute for Advanced Studies, Wuhan University, Wuhan, 430072, China
  - College of Life Sciences, Wuhan University, Wuhan, 430072, China
- Zhiguo Wu
  - College of Life Sciences, Wuhan University, Wuhan, 430072, China
- Kun Wang
  - College of Life Sciences, Wuhan University, Wuhan, 430072, China
- Yuxian Zhu
  - Institute for Advanced Studies, Wuhan University, Wuhan, 430072, China
  - College of Life Sciences, Wuhan University, Wuhan, 430072, China
  - Hubei Hongshan Laboratory, Wuhan, 430072, China
  - TaiKang Center for Life and Medical Sciences, RNA Institute, Renmin Hospital, Wuhan University, Wuhan, 430072, China
2
Pradhan RK, Ramakrishna W. Transposons: Unexpected players in cancer. Gene 2022; 808:145975. [PMID: 34592349 DOI: 10.1016/j.gene.2021.145975] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 06/29/2021] [Revised: 09/19/2021] [Accepted: 09/24/2021] [Indexed: 12/21/2022]
Abstract
Transposons are repetitive DNA sequences encompassing about half of the human genome. They play a vital role in genome stability maintenance and contribute to genomic diversity and evolution. Their activity is regulated by various mechanisms considering the deleterious effects of these mobile elements. Various genetic risk factors and environmental stress conditions affect the regulatory pathways causing alteration of transposon expression. Our knowledge of the biological role of transposons is limited especially in various types of cancers. Retrotransposons of different types (LTR-retrotransposons, LINEs and SINEs) regulate a plethora of genes that have a role in cell reprogramming, tumor suppression, cell cycle, apoptosis, cell adhesion and migration, and DNA repair. The regulatory mechanisms of transposons, their deregulation and different mechanisms underlying transposon-mediated carcinogenesis in humans focusing on the three most prevalent types, lung, breast and colorectal cancers, were reviewed. The modes of regulation employed include alternative splicing, deletion, insertion, duplication in genes and promoters resulting in upregulation, downregulation or silencing of genes.
3
Riba A, Fumagalli MR, Caselle M, Osella M. A Model-Driven Quantitative Analysis of Retrotransposon Distributions in the Human Genome. Genome Biol Evol 2021; 12:2045-2059. [PMID: 32986810 PMCID: PMC7750997 DOI: 10.1093/gbe/evaa201] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 09/19/2020] [Indexed: 12/21/2022] Open
Abstract
Retrotransposons, DNA sequences capable of creating copies of themselves, compose about half of the human genome and played a central role in the evolution of mammals. Their current position in the host genome is the result of the retrotranscription process and of the following host genome evolution. We apply a model from statistical physics to show that the genomic distribution of the two most populated classes of retrotransposons in human deviates from random placement, and that this deviation increases with time. The time dependence suggests a major role of the host genome dynamics in shaping the current retrotransposon distributions. Focusing on a neutral scenario, we show that a simple model based on random placement followed by genome expansion and sequence duplications can reproduce the empirical retrotransposon distributions, even though more complex and possibly selective mechanisms can have contributed. Besides the inherent interest in understanding the origin of current retrotransposon distributions, this work sets a general analytical framework to analyze quantitatively the effects of genome evolutionary dynamics on the distribution of genomic elements.
Affiliation(s)
- Maria Rita Fumagalli
  - Institute of Biophysics - CNR, National Research Council, Genova, Italy
  - Department of Environmental Science and Policy, Center for Complexity and Biosystems, University of Milan, Milano, Italy
- Michele Caselle
  - Department of Physics and INFN, University of Torino, Torino, Italy
- Matteo Osella
  - Department of Physics and INFN, University of Torino, Torino, Italy
4
Ali A, Han K, Liang P. Role of Transposable Elements in Gene Regulation in the Human Genome. Life (Basel) 2021; 11:118. [PMID: 33557056 PMCID: PMC7913837 DOI: 10.3390/life11020118] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 01/19/2021] [Revised: 01/28/2021] [Accepted: 02/02/2021] [Indexed: 02/07/2023] Open
Abstract
Transposable elements (TEs), also known as mobile elements (MEs), are interspersed repeats that constitute a major fraction of the genomes of higher organisms. As one of their important functional impacts on gene function and genome evolution, TEs participate in regulating the expression of genes nearby and even far away at transcriptional and post-transcriptional levels. There are two known principal ways by which TEs regulate the expression of genes. First, TEs provide cis-regulatory sequences in the genome with their intrinsic regulatory properties for their own expression, making them potential factors for regulating the expression of the host genes. TE-derived cis-regulatory sites are found in promoter and enhancer elements, providing binding sites for a wide range of trans-acting factors. Second, TEs encode regulatory RNAs, with their sequences shown to be present in a substantial fraction of miRNAs and long non-coding RNAs (lncRNAs), indicating the TE origin of these RNAs. Furthermore, TE sequences were found to be critical for regulatory functions of these RNAs, including binding to the target mRNA. TEs thus provide crucial regulatory roles by being part of cis-regulatory and regulatory RNA sequences. Moreover, both TE-derived cis-regulatory sequences and TE-derived regulatory RNAs have been implicated in providing evolutionary novelty to gene regulation. These TE-derived regulatory mechanisms also tend to function in a tissue-specific fashion. In this review, we aim to comprehensively cover the studies regarding these two aspects of TE-mediated gene regulation, mainly focusing on the mechanisms, contribution of different types of TEs, differential roles among tissue types, and lineage-specificity, based on data mostly in humans.
Affiliation(s)
- Arsala Ali
  - Department of Biological Sciences, Brock University, St. Catharines, ON L2S 3A1, Canada
- Kyudong Han
  - Department of Microbiology, Dankook University, Cheonan 31116, Korea
  - Center for Bio-Medical Engineering Core Facility, Dankook University, Cheonan 31116, Korea
- Ping Liang
  - Department of Biological Sciences, Brock University, St. Catharines, ON L2S 3A1, Canada
  - Centre of Biotechnologies, Brock University, St. Catharines, ON L2S 3A1, Canada
5
Nishihara H. Retrotransposons spread potential cis-regulatory elements during mammary gland evolution. Nucleic Acids Res 2020; 47:11551-11562. [PMID: 31642473 PMCID: PMC7145552 DOI: 10.1093/nar/gkz1003] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 10/28/2018] [Revised: 10/14/2019] [Accepted: 10/17/2019] [Indexed: 12/18/2022] Open
Abstract
Acquisition of cis-elements is a major driving force for rewiring a gene regulatory network. Several kinds of transposable elements (TEs), mostly retrotransposons that propagate via a copy-and-paste mechanism, are known to possess transcription factor binding motifs and have provided source sequences for enhancers/promoters. However, it remains largely unknown whether retrotransposons have spread the binding sites of master regulators of morphogenesis and accelerated cis-regulatory expansion involved in common mammalian morphological features during evolution. Here, I demonstrate that thousands of binding sites for estrogen receptor α (ERα) and three related pioneer factors (FoxA1, GATA3 and AP2γ) that are essential regulators of mammary gland development arose from a spreading of the binding motifs by retrotransposons. The TE-derived functional elements serve primarily as distal enhancers and are enriched around genes associated with mammary gland morphogenesis. The source TEs occurred via a two-phased expansion consisting of mainly L2/MIR in a eutherian ancestor and endogenous retrovirus 1 (ERV1) in simian primates and murines. Thus the build-up of potential sources for cis-elements by retrotransposons followed by their frequent utilization by the host (co-option/exaptation) may have a general accelerating effect on both establishing and diversifying a gene regulatory network, leading to morphological innovation.
Affiliation(s)
- Hidenori Nishihara
  - Department of Life Science and Technology, Tokyo Institute of Technology, 4259-S2-17, Nagatsuta-cho, Midori-ku, Yokohama, Kanagawa 226-8501, Japan
6
Sampathkumar NK, Bravo JI, Chen Y, Danthi PS, Donahue EK, Lai RW, Lu R, Randall LT, Vinson N, Benayoun BA. Widespread sex dimorphism in aging and age-related diseases. Hum Genet 2020; 139:333-356. [PMID: 31677133 PMCID: PMC7031050 DOI: 10.1007/s00439-019-02082-w] [Citation(s) in RCA: 61] [Impact Index Per Article: 15.3] [Received: 09/21/2019] [Accepted: 10/26/2019] [Indexed: 02/07/2023]
Abstract
Although aging is a conserved phenomenon across evolutionarily distant species, aspects of the aging process have been found to differ between males and females of the same species. Indeed, observations across mammalian studies have revealed the existence of longevity and health disparities between sexes, including in humans (i.e. with a female or male advantage). However, the underlying mechanisms for these sex differences in health and lifespan remain poorly understood, and it is unclear which aspects of this dimorphism stem from hormonal differences (i.e. predominance of estrogens vs. androgens) or from karyotypic differences (i.e. XX vs. XY sex chromosome complement). In this review, we discuss the state of the knowledge in terms of sex dimorphism in various aspects of aging and in human age-related diseases. Where the interplay between sex differences and age-related differences has not been explored fully, we present the state of the field to highlight important future research directions. We also discuss various dietary, drug or genetic interventions that were shown to improve longevity in a sex-dimorphic fashion. Finally, emerging tools and models that can be leveraged to decipher the mechanisms underlying sex differences in aging are also briefly discussed.
Affiliation(s)
- Nirmal K Sampathkumar
  - Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, 90089, USA
  - Maurice Wohl Clinical Neuroscience Institute, King's College London, London, UK
- Juan I Bravo
  - Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, 90089, USA
  - Graduate Program in the Biology of Aging, University of Southern California, Los Angeles, CA, 90089, USA
- Yilin Chen
  - Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, 90089, USA
  - Masters Program in Nutrition, Healthspan, and Longevity, University of Southern California, Los Angeles, CA, 90089, USA
- Prakroothi S Danthi
  - Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, 90089, USA
- Erin K Donahue
  - Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, 90089, USA
  - Neuroscience Graduate Program, University of Southern California, Los Angeles, CA, 90089, USA
- Rochelle W Lai
  - Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, 90089, USA
- Ryan Lu
  - Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, 90089, USA
  - Graduate Program in the Biology of Aging, University of Southern California, Los Angeles, CA, 90089, USA
- Lewis T Randall
  - Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, 90089, USA
  - Graduate Program in the Biology of Aging, University of Southern California, Los Angeles, CA, 90089, USA
- Nika Vinson
  - Department of Urology, Pelvic Medicine and Reconstructive Surgery, UCLA David Geffen School of Medicine, Los Angeles, CA, 90024, USA
- Bérénice A Benayoun
  - Leonard Davis School of Gerontology, University of Southern California, Los Angeles, CA, 90089, USA
  - USC Norris Comprehensive Cancer Center, Epigenetics and Gene Regulation, Los Angeles, CA, 90089, USA
  - USC Stem Cell Initiative, Los Angeles, CA, 90089, USA
7
Piggin CL, Roden DL, Law AMK, Molloy MP, Krisp C, Swarbrick A, Naylor MJ, Kalyuga M, Kaplan W, Oakes SR, Gallego-Ortega D, Clark SJ, Carroll JS, Bartonicek N, Ormandy CJ. ELF5 modulates the estrogen receptor cistrome in breast cancer. PLoS Genet 2020; 16:e1008531. [PMID: 31895944 PMCID: PMC6959601 DOI: 10.1371/journal.pgen.1008531] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/08/2019] [Revised: 01/14/2020] [Accepted: 11/20/2019] [Indexed: 11/28/2022] Open
Abstract
Acquired resistance to endocrine therapy is responsible for half of the therapeutic failures in the treatment of breast cancer. Recent findings have implicated increased expression of the ETS transcription factor ELF5 as a potential modulator of estrogen action and driver of endocrine resistance, and here we provide the first insight into the mechanisms by which ELF5 modulates estrogen sensitivity. Using chromatin immunoprecipitation sequencing we found that ELF5 binding overlapped with FOXA1 and ER at super enhancers, enhancers and promoters, and when elevated, caused FOXA1 and ER to bind to new regions of the genome, in a pattern that replicated the alterations to the ER/FOXA1 cistrome caused by the acquisition of resistance to endocrine therapy. RNA sequencing demonstrated that these changes altered estrogen-driven patterns of gene expression, the expression of ER transcription-complex members, and 6 genes known to be involved in driving the acquisition of endocrine resistance. Using rapid immunoprecipitation mass spectrometry of endogenous proteins, and proximity ligation assays, we found that ELF5 interacted physically with members of the ER transcription complex, such as DNA-PKcs. We found 2 cases of endocrine-resistant brain metastases where ELF5 levels were greatly increased and ELF5 patterns of gene expression were enriched, compared to the matched primary tumour. Thus ELF5 alters ER-driven gene expression by modulating the ER/FOXA1 cistrome, by interacting with it, and by modulating the expression of members of the ER transcriptional complex, providing multiple mechanisms by which ELF5 can drive endocrine resistance.
Affiliation(s)
- Catherine L. Piggin
  - Garvan Institute of Medical Research and The Kinghorn Cancer Centre, Victoria Street Darlinghurst Sydney, NSW, Australia
  - St Vincent’s Clinical School, Faculty of Medicine, UNSW Sydney, Australia
- Daniel L. Roden
  - Garvan Institute of Medical Research and The Kinghorn Cancer Centre, Victoria Street Darlinghurst Sydney, NSW, Australia
  - St Vincent’s Clinical School, Faculty of Medicine, UNSW Sydney, Australia
- Andrew M. K. Law
  - Garvan Institute of Medical Research and The Kinghorn Cancer Centre, Victoria Street Darlinghurst Sydney, NSW, Australia
  - St Vincent’s Clinical School, Faculty of Medicine, UNSW Sydney, Australia
- Mark P. Molloy
  - Australian Proteome Analysis Facility, Macquarie University, Sydney, Australia
- Christoph Krisp
  - Australian Proteome Analysis Facility, Macquarie University, Sydney, Australia
- Alexander Swarbrick
  - Garvan Institute of Medical Research and The Kinghorn Cancer Centre, Victoria Street Darlinghurst Sydney, NSW, Australia
  - St Vincent’s Clinical School, Faculty of Medicine, UNSW Sydney, Australia
- Matthew J. Naylor
  - Garvan Institute of Medical Research and The Kinghorn Cancer Centre, Victoria Street Darlinghurst Sydney, NSW, Australia
  - St Vincent’s Clinical School, Faculty of Medicine, UNSW Sydney, Australia
  - School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, Australia
- Maria Kalyuga
  - Garvan Institute of Medical Research and The Kinghorn Cancer Centre, Victoria Street Darlinghurst Sydney, NSW, Australia
  - St Vincent’s Clinical School, Faculty of Medicine, UNSW Sydney, Australia
- Warren Kaplan
  - Garvan Institute of Medical Research and The Kinghorn Cancer Centre, Victoria Street Darlinghurst Sydney, NSW, Australia
  - St Vincent’s Clinical School, Faculty of Medicine, UNSW Sydney, Australia
- Samantha R. Oakes
  - Garvan Institute of Medical Research and The Kinghorn Cancer Centre, Victoria Street Darlinghurst Sydney, NSW, Australia
  - St Vincent’s Clinical School, Faculty of Medicine, UNSW Sydney, Australia
- David Gallego-Ortega
  - Garvan Institute of Medical Research and The Kinghorn Cancer Centre, Victoria Street Darlinghurst Sydney, NSW, Australia
  - St Vincent’s Clinical School, Faculty of Medicine, UNSW Sydney, Australia
- Susan J. Clark
  - Garvan Institute of Medical Research and The Kinghorn Cancer Centre, Victoria Street Darlinghurst Sydney, NSW, Australia
  - St Vincent’s Clinical School, Faculty of Medicine, UNSW Sydney, Australia
- Jason S. Carroll
  - Cancer Research UK Cambridge Research Institute, Li Ka Shing Centre Robinson Way, Cambridge, United Kingdom
- Nenad Bartonicek
  - Garvan Institute of Medical Research and The Kinghorn Cancer Centre, Victoria Street Darlinghurst Sydney, NSW, Australia
  - St Vincent’s Clinical School, Faculty of Medicine, UNSW Sydney, Australia
- Christopher J. Ormandy
  - Garvan Institute of Medical Research and The Kinghorn Cancer Centre, Victoria Street Darlinghurst Sydney, NSW, Australia
  - St Vincent’s Clinical School, Faculty of Medicine, UNSW Sydney, Australia
8
Rohrmoser M, Kluge M, Yahia Y, Gruber-Eber A, Maqbool MA, Forné I, Krebs S, Blum H, Greifenberg AK, Geyer M, Descostes N, Imhof A, Andrau JC, Friedel CC, Eick D. MIR sequences recruit zinc finger protein ZNF768 to expressed genes. Nucleic Acids Res 2019; 47:700-715. [PMID: 30476274 PMCID: PMC6344866 DOI: 10.1093/nar/gky1148] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 07/06/2018] [Accepted: 10/29/2018] [Indexed: 12/16/2022] Open
Abstract
Mammalian-wide interspersed repeats (MIRs) are retrotransposed elements of mammalian genomes. Here, we report the specific binding of zinc finger protein ZNF768 to the sequence motif GCTGTGTG (N20) CCTCTCTG in the core region of MIRs. ZNF768 binding is preferentially associated with euchromatin and promoter regions of genes. Binding was observed for genes expressed in a cell type-specific manner in human B cell line Raji and osteosarcoma U2OS cells. Mass spectrometric analysis revealed binding of ZNF768 to Elongator components Elp1, Elp2 and Elp3 and other nuclear factors. The N-terminus of ZNF768 contains a heptad repeat array structurally related to the C-terminal domain (CTD) of RNA polymerase II. This array evolved in placental animals but not marsupials and monotreme species, displays species-specific length variations, and possibly fulfills CTD related functions in gene regulation. We propose that the evolution of MIRs and ZNF768 has extended the repertoire of gene regulatory mechanisms in mammals and that ZNF768 binding is associated with cell type-specific gene expression.
Affiliation(s)
- Michaela Rohrmoser
  - Department of Molecular Epigenetics, Helmholtz Center Munich and Center for Integrated Protein Science Munich (CIPSM), Marchioninistrasse 25, 81377 Munich, Germany
- Michael Kluge
  - Institute for Informatics, Ludwig-Maximilians-Universität München, Amalienstrasse 17, 80333 Munich, Germany
- Yousra Yahia
  - Institut de Génétique Moléculaire de Montpellier (IGMM), Univ Montpellier, CNRS-UMR5535, Montpellier, France
- Anita Gruber-Eber
  - Department of Molecular Epigenetics, Helmholtz Center Munich and Center for Integrated Protein Science Munich (CIPSM), Marchioninistrasse 25, 81377 Munich, Germany
- Muhammad Ahmad Maqbool
  - Institut de Génétique Moléculaire de Montpellier (IGMM), Univ Montpellier, CNRS-UMR5535, Montpellier, France
- Ignasi Forné
  - Biomedical Center Munich, ZFP, Großhadener Strasse 9, 82152 Planegg-Martinsried, Germany
- Stefan Krebs
  - Laboratory for Functional Genome Analysis (LAFUGA) at the Gene Center, Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 Munich, Germany
- Helmut Blum
  - Laboratory for Functional Genome Analysis (LAFUGA) at the Gene Center, Ludwig-Maximilians-Universität München, Feodor-Lynen-Strasse 25, 81377 Munich, Germany
- Ann Katrin Greifenberg
  - Institute of Structural Biology, University of Bonn, Sigmund-Freud-Str. 25, 53127 Bonn, Germany
- Matthias Geyer
  - Institute of Structural Biology, University of Bonn, Sigmund-Freud-Str. 25, 53127 Bonn, Germany
- Nicolas Descostes
  - Department of Biochemistry and Molecular Pharmacology, New York University Langone School of Medicine, New York, NY 10016, USA
  - Howard Hughes Medical Institute, New York University Langone School of Medicine, New York, NY 10016, USA
- Axel Imhof
  - Biomedical Center Munich, ZFP, Großhadener Strasse 9, 82152 Planegg-Martinsried, Germany
- Jean-Christophe Andrau
  - Institut de Génétique Moléculaire de Montpellier (IGMM), Univ Montpellier, CNRS-UMR5535, Montpellier, France
- Caroline C Friedel
  - Institute for Informatics, Ludwig-Maximilians-Universität München, Amalienstrasse 17, 80333 Munich, Germany
- Dirk Eick
  - Department of Molecular Epigenetics, Helmholtz Center Munich and Center for Integrated Protein Science Munich (CIPSM), Marchioninistrasse 25, 81377 Munich, Germany
9
Orozco-Arias S, Isaza G, Guyot R. Retrotransposons in Plant Genomes: Structure, Identification, and Classification through Bioinformatics and Machine Learning. Int J Mol Sci 2019; 20:E3837. [PMID: 31390781 PMCID: PMC6696364 DOI: 10.3390/ijms20153837] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 06/21/2019] [Revised: 07/31/2019] [Accepted: 08/02/2019] [Indexed: 01/26/2023] Open
Abstract
Transposable elements (TEs) are genomic units able to move within the genome of virtually all organisms. Due to their natural repetitive numbers and their high structural diversity, the identification and classification of TEs remain a challenge in sequenced genomes. Although TEs were initially regarded as "junk DNA", it has been demonstrated that they play key roles in chromosome structures, gene expression, and regulation, as well as adaptation and evolution. A highly reliable annotation of these elements is, therefore, crucial to better understand genome functions and their evolution. To date, much bioinformatics software has been developed to address TE detection and classification processes, but many problematic aspects remain, such as the reliability, precision, and speed of the analyses. Machine learning and deep learning are algorithms that can make automatic predictions and decisions in a wide variety of scientific applications. They have been tested in bioinformatics and, more specifically, for TE classification, with encouraging results. In this review, we will discuss important aspects of TEs, such as their structure, importance in the evolution and architecture of the host, and their current classifications and nomenclatures. We will also address current methods and their limitations in identifying and classifying TEs.
Affiliation(s)
- Simon Orozco-Arias
  - Department of Computer Science, Universidad Autónoma de Manizales, Manizales 170001, Colombia
  - Department of Systems and Informatics, Universidad de Caldas, Manizales 170001, Colombia
- Gustavo Isaza
  - Department of Systems and Informatics, Universidad de Caldas, Manizales 170001, Colombia
- Romain Guyot
  - Department of Electronics and Automatization, Universidad Autónoma de Manizales, Manizales 170001, Colombia
  - Institut de Recherche pour le Développement, CIRAD, University Montpellier, 34000 Montpellier, France
10
Abstract
Nobel laureate Nikolaas Tinbergen provided clear criteria for declaring a neuroscience problem solved, criteria which despite the passage of more than 50 years and vastly expanded neuroscience tool kits remain applicable today. Tinbergen said for neuroscientists to claim that a behavior is understood, they must correspondingly understand its (i) development and its (ii) mechanisms and its (iii) function and its (iv) evolution. Now, all four of these domains represent hotbeds of current experimental work, each using arrays of new techniques which overlap only partly. Thus, as new methodologies come online, from single-nerve-cell RNA sequencing, for example, to smart FISH, large-scale calcium imaging from cortex and deep brain structures, computational ethology, and so on, one person, however smart, cannot master everything. Our response to the likely “fracturing” of neuroscience recognizes the value of ever larger consortia. This response suggests new kinds of problems for (i) funding and (ii) the fair distribution of credit, especially for younger scientists.
11
Mustafin RN. The Relationship between Transposons and Transcription Factors in the Evolution of Eukaryotes. J EVOL BIOCHEM PHYS+ 2019. [DOI: 10.1134/s0022093019010022] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/23/2022]
12
Mustafin RN, Khusnutdinova EK. The Role of Transposons in Epigenetic Regulation of Ontogenesis. Russ J Dev Biol 2018. [DOI: 10.1134/s1062360418020066] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 11/23/2022]
13
Karakülah G. RTFAdb: A database of computationally predicted associations between retrotransposons and transcription factors in the human and mouse genomes. Genomics 2017; 110:257-262. [PMID: 29155231 DOI: 10.1016/j.ygeno.2017.11.002] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 09/20/2017] [Revised: 10/31/2017] [Accepted: 11/14/2017] [Indexed: 12/22/2022]
Abstract
In recent years, retrotransposons have gained increasing attention as a source of binding motifs for transcription factors (TFs). Despite the substantial roles of these mobile genetic elements in the regulation of gene expression, a comprehensive resource enabling the investigation of retrotransposon species that are bound by TFs is still lacking. Herein, I introduce for the first time a novel database called RTFAdb, which allows exploring computationally predicted associations between retrotransposons and TFs in diverse cell lines and tissues of human and mouse. My database, using over 3,000 TF ChIP-seq binding profiles collected from human and mouse samples, makes it possible to search more than 1,500 retrotransposon species in the binding sites of a total of 596 TFs. RTFAdb is freely available at http://tools.ibg.deu.edu.tr/rtfa/ and has the potential to offer novel insights into mammalian transcriptional networks by providing an additional layer of information regarding the regulatory roles of retrotransposons.
Affiliation(s)
- Gökhan Karakülah
  - İzmir International Biomedicine and Genome Institute (iBG-İzmir), Dokuz Eylül University, 35340, İnciraltı, İzmir, Turkey
14
Modelling the evolution of transcription factor binding preferences in complex eukaryotes. Sci Rep 2017; 7:7596. [PMID: 28790414 PMCID: PMC5548724 DOI: 10.1038/s41598-017-07761-0] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 03/08/2017] [Accepted: 06/30/2017] [Indexed: 12/27/2022] Open
Abstract
Transcription factors (TFs) exert their regulatory action by binding to DNA with specific sequence preferences. However, different TFs can partially share their binding sequences due to their common evolutionary origin. This "redundancy" of binding defines a way of organizing TFs in "motif families" by grouping TFs with similar binding preferences. Since these ultimately define the TF target genes, the motif family organization entails information about the structure of transcriptional regulation as it has been shaped by evolution. Focusing on the human TF repertoire, we show that a one-parameter evolutionary model of the Birth-Death-Innovation type can explain the TF empirical repartition in motif families, and allows to highlight the relevant evolutionary forces at the origin of this organization. Moreover, the model allows to pinpoint few deviations from the neutral scenario it assumes: three over-expanded families (including HOX and FOX genes), a set of "singleton" TFs for which duplication seems to be selected against, and a higher-than-average rate of diversification of the binding preferences of TFs with a Zinc Finger DNA binding domain. Finally, a comparison of the TF motif family organization in different eukaryotic species suggests an increase of redundancy of binding with organism complexity.
|
15
|
Chuong EB, Elde NC, Feschotte C. Regulatory activities of transposable elements: from conflicts to benefits. Nat Rev Genet 2016; 18:71-86. [PMID: 27867194 DOI: 10.1038/nrg.2016.139] [Citation(s) in RCA: 762] [Impact Index Per Article: 95.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/19/2022]
Abstract
Transposable elements (TEs) are a prolific source of tightly regulated, biochemically active non-coding elements, such as transcription factor-binding sites and non-coding RNAs. Many recent studies reinvigorate the idea that these elements are pervasively co-opted for the regulation of host genes. We argue that the inherent genetic properties of TEs and the conflicting relationships with their hosts facilitate their recruitment for regulatory functions in diverse genomes. We review recent findings supporting the long-standing hypothesis that the waves of TE invasions endured by organisms for eons have catalysed the evolution of gene-regulatory networks. We also discuss the challenges of dissecting and interpreting the phenotypic effect of regulatory activities encoded by TEs in health and disease.
Affiliation(s)
- Edward B Chuong
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah 84103, USA
- Nels C Elde
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah 84103, USA
- Cédric Feschotte
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah 84103, USA
|
16
|
Buckley RM, Adelson DL. Mammalian genome evolution as a result of epigenetic regulation of transposable elements. Biomol Concepts 2015; 5:183-94. [PMID: 25372752 DOI: 10.1515/bmc-2014-0013] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/07/2014] [Accepted: 05/27/2014] [Indexed: 12/29/2022] Open
Abstract
Transposable elements (TEs) make up a large proportion of mammalian genomes and are a strong evolutionary force capable of rewiring regulatory networks and causing genome rearrangements. Additionally, there are many eukaryotic epigenetic defense mechanisms able to transcriptionally silence TEs. Furthermore, small RNA molecules that target TE DNA sequences often mediate these epigenetic defense mechanisms. As a result, epigenetic marks associated with TE silencing can be reestablished after epigenetic reprogramming - an event during the mammalian life cycle that results in widespread loss of parental epigenetic marks. Furthermore, targeted epigenetic marks associated with TE silencing may have an impact on nearby gene expression. Therefore, TEs may have driven species evolution via their ability to heritably alter the epigenetic regulation of gene expression in mammals.
|
17
|
Colliva A, Pellegrini R, Testori A, Caselle M. Ising-model description of long-range correlations in DNA sequences. PHYSICAL REVIEW. E, STATISTICAL, NONLINEAR, AND SOFT MATTER PHYSICS 2015; 91:052703. [PMID: 26066195 DOI: 10.1103/physreve.91.052703] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/29/2014] [Indexed: 06/04/2023]
Abstract
We model long-range correlations of nucleotides in the human DNA sequence using the long-range one-dimensional (1D) Ising model. We show that, for distances between 10^3 and 10^6 bp, the correlations show a universal behavior and may be described by the non-mean-field limit of the long-range 1D Ising model. This allows us to make some testable hypotheses on the nature of the interaction between distant portions of the DNA chain which led to the DNA structure that we observe today in higher eukaryotes.
Affiliation(s)
- A Colliva
- Dipartimento di Fisica dell'Università di Torino and I.N.F.N. sez. di Torino, Via Pietro Giuria 1, I-10125 Torino, Italy
- R Pellegrini
- Physics Department, Swansea University, Singleton Park, Swansea SA2 8PP, UK
- A Testori
- Dipartimento di Fisica dell'Università di Torino and I.N.F.N. sez. di Torino, Via Pietro Giuria 1, I-10125 Torino, Italy
- M Caselle
- Dipartimento di Fisica dell'Università di Torino and I.N.F.N. sez. di Torino, Via Pietro Giuria 1, I-10125 Torino, Italy
|
18
|
Hellen EHB, Kern AD. The role of DNA insertions in phenotypic differentiation between humans and other primates. Genome Biol Evol 2015; 7:1168-78. [PMID: 25635043 PMCID: PMC4419785 DOI: 10.1093/gbe/evv012] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022] Open
Abstract
What makes us human is one of the most interesting and enduring questions in evolutionary biology. To assist in answering this question, we have identified insertions in the human genome which cannot be found in five comparison primate species: Chimpanzee, gorilla, orangutan, gibbon, and macaque. A total of 21,269 nonpolymorphic human-specific insertions were identified, of which only 372 were found in exons. Any function conferred by the remaining 20,897 is likely to be regulatory. Many of these insertions are likely to have been fitness neutral; however, a small number has been identified in genes showing signs of positive selection. Insertions found within positively selected genes show associations to neural phenotypes, which were also enriched in the whole data set. Other phenotypes that are found to be enriched in the data set include dental and sensory perception-related phenotypes, features which are known to differ between humans and other apes. The analysis provides several likely candidates, either genes or regulatory regions, which may be involved in the processes that differentiate humans from other apes.
Affiliation(s)
- Andrew D Kern
- Department of Genetics, Nelson Biolabs, Piscataway, NJ, USA
|
19
|
Shapiro JA. Epigenetic control of mobile DNA as an interface between experience and genome change. Front Genet 2014; 5:87. [PMID: 24795749 PMCID: PMC4007016 DOI: 10.3389/fgene.2014.00087] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2014] [Accepted: 04/01/2014] [Indexed: 12/29/2022] Open
Abstract
Mobile DNA in the genome is subject to RNA-targeted epigenetic control. This control regulates the activity of transposons, retrotransposons and genomic proviruses. Many different life history experiences alter the activities of mobile DNA and the expression of genetic loci regulated by nearby insertions. The same experiences induce alterations in epigenetic formatting and lead to trans-generational modifications of genome expression and stability. These observations lead to the hypothesis that epigenetic formatting directed by non-coding RNA provides a molecular interface between life history events and genome alteration.
Affiliation(s)
- James A. Shapiro
- Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, IL, USA
|
20
|
Genome-wide activity of unliganded estrogen receptor-α in breast cancer cells. Proc Natl Acad Sci U S A 2014; 111:4892-7. [PMID: 24639548 DOI: 10.1073/pnas.1315445111] [Citation(s) in RCA: 68] [Impact Index Per Article: 6.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/27/2023] Open
Abstract
Estrogen receptor-α (ERα) has a central role in hormone-dependent breast cancer and its ligand-induced functions have been extensively characterized. However, evidence exists that ERα has functions that are independent of ligands. In the present work, we investigated the binding of ERα to chromatin in the absence of ligands and its functions on gene regulation. We demonstrated that in MCF7 breast cancer cells unliganded ERα binds to more than 4,000 chromatin sites. Unexpectedly, although almost entirely comprised in the larger group of estrogen-induced binding sites, we found that unliganded-ERα binding is specifically linked to genes with developmental functions, compared with estrogen-induced binding. Moreover, we found that siRNA-mediated down-regulation of ERα in the absence of estrogen is accompanied by changes in the expression levels of hundreds of coding and noncoding RNAs. Down-regulated mRNAs showed enrichment in genes related to epithelial cell growth and development. Stable ERα down-regulation using shRNA, which caused cell growth arrest, was accompanied by increased H3K27me3 at ERα binding sites. Finally, we found that FOXA1 and AP2γ binding to several sites is decreased upon ERα silencing, suggesting that unliganded ERα participates, together with other factors, in the maintenance of the luminal-specific cistrome in breast cancer cells.
|
21
|
Hénaff E, Vives C, Desvoyes B, Chaurasia A, Payet J, Gutierrez C, Casacuberta JM. Extensive amplification of the E2F transcription factor binding sites by transposons during evolution of Brassica species. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2014; 77:852-62. [PMID: 24447172 DOI: 10.1111/tpj.12434] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/31/2013] [Revised: 12/24/2013] [Accepted: 01/09/2014] [Indexed: 05/10/2023]
Abstract
Transposable elements (TEs) are major players in genome evolution. The effects of their movement vary from gene knockouts to more subtle effects such as changes in gene expression. It has recently been shown that TEs may contain transcription factor binding sites (TFBSs), and it has been proposed that they may rewire new genes into existing transcriptional networks. However, little is known about the dynamics of this process and its effect on transcription factor binding. Here we show that TEs have extensively amplified the number of sequences that match the E2F TFBS during Brassica speciation, and, as a result, as many as 85% of the sequences that fit the E2F TFBS consensus are within TEs in some Brassica species. We show that these sequences found within TEs bind E2Fa in vivo, which indicates a direct effect of these TEs on E2F-mediated gene regulation. Our results suggest that the TEs located close to genes may directly participate in gene promoters, whereas those located far from genes may have an indirect effect by diluting the effective amount of E2F protein able to bind to its cognate promoters. These results illustrate an extreme case of the effect of TEs in TFBS evolution, and suggest a singular way by which they affect host genes by modulating essential transcriptional networks.
Affiliation(s)
- Elizabeth Hénaff
- Center for Research in Agricultural Genomics, Consejo Superior de Investigaciones Científicas-Institut de Recerca i Tecnologia Agroalimentàries-Universitat Autònoma de Barcelona-Universitat de Barcelona, Campus Universitat Autònoma de Barcelona, Bellaterra - Cerdanyola del Vallès, 08193, Barcelona, Spain
|
22
|
Stindl R. The telomeric sync model of speciation: species-wide telomere erosion triggers cycles of transposon-mediated genomic rearrangements, which underlie the saltatory appearance of nonadaptive characters. Naturwissenschaften 2014; 101:163-86. [PMID: 24493020 PMCID: PMC3935097 DOI: 10.1007/s00114-014-1152-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/05/2013] [Revised: 01/13/2014] [Accepted: 01/16/2014] [Indexed: 12/16/2022]
Abstract
Charles Darwin knew that the fossil record is not overwhelmingly supportive of genetic and phenotypic gradualism; therefore, he developed the core of his theory on the basis of breeding experiments. Here, I present evidence for the existence of a cell biological mechanism that strongly points to the almost forgotten European concept of saltatory evolution of nonadaptive characters, which is in perfect agreement with the gaps in the fossil record. The standard model of chromosomal evolution has always been handicapped by a paradox, namely, how speciation can occur by spontaneous chromosomal rearrangements that are known to decrease the fertility of heterozygotes in a population. However, the hallmark of almost all closely related species is a differing chromosome complement and therefore chromosomal rearrangements seem to be crucial for speciation. Telomeres, the caps of eukaryotic chromosomes, erode in somatic tissues during life, but have been thought to remain stable in the germline of a species. Recently, a large human study spanning three healthy generations clearly found a cumulative telomere effect, which is indicative of transgenerational telomere erosion in the human species. The telomeric sync model of speciation presented here is based on telomere erosion between generations, which leads to identical fusions of chromosomes and triggers a transposon-mediated genomic repatterning in the germline of many individuals of a species. The phenotypic outcome of the telomere-triggered transposon activity is the saltatory appearance of nonadaptive characters simultaneously in many individuals. Transgenerational telomere erosion is therefore the material basis of aging at the species level.
Affiliation(s)
- Reinhard Stindl
- apo-med-center, Alpharm GesmbH, Plättenstrasse 7-9, 2380, Perchtoldsdorf, Austria
|
23
|
Genome-wide analysis of promoters: clustering by alignment and analysis of regular patterns. PLoS One 2014; 9:e85260. [PMID: 24465517 PMCID: PMC3898993 DOI: 10.1371/journal.pone.0085260] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/19/2013] [Accepted: 11/26/2013] [Indexed: 01/08/2023] Open
Abstract
In this paper we perform a genome-wide analysis of H. sapiens promoters. To this aim, we developed and combined two mathematical methods that allow us to (i) classify promoters into groups characterized by specific global structural features, and (ii) recover, in full generality, any regular sequence in the different classes of promoters. One of the main findings of this analysis is that H. sapiens promoters can be classified into three main groups. Two of them are distinguished by the prevalence of weak or strong nucleotides and are characterized by short compositionally biased sequences, while the most frequent regular sequences in the third group are strongly correlated with transposons. Taking advantage of the generality of these mathematical procedures, we have compared the promoter database of H. sapiens with those of other species. We have found that the above-mentioned features characterize also the evolutionary content appearing in mammalian promoters, at variance with ancestral species in the phylogenetic tree, that exhibit a definitely lower level of differentiation among promoters.
|
24
|
Zhang W, Edwards A, Fan W, Fang Z, Deininger P, Zhang K. Inferring the expression variability of human transposable element-derived exons by linear model analysis of deep RNA sequencing data. BMC Genomics 2013; 14:584. [PMID: 23984937 PMCID: PMC3765721 DOI: 10.1186/1471-2164-14-584] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/12/2013] [Accepted: 08/13/2013] [Indexed: 12/14/2022] Open
Abstract
Background: The exonization of transposable elements (TEs) has proven to be a significant mechanism for the creation of novel exons. Existing knowledge of the retention patterns of TE exons in mRNAs was mainly established by the analysis of Expressed Sequence Tag (EST) data and microarray data. Results: This study seeks to validate and extend previous studies on the expression of TE exons by an integrative statistical analysis of high throughput RNA sequencing data. We collected 26 RNA-seq datasets spanning multiple tissues and cancer types. The exon-level digital expressions (indicating retention rates in mRNAs) were quantified by a double normalized measure, called the rescaled RPKM (Reads Per Kilobase of exon model per Million mapped reads). We analyzed the distribution profiles and the variability (across samples and between tissue/disease groups) of TE exon expressions, and compared them with those of other constitutive or cassette exons. We inferred the effects of four genomic factors, including the location, length, cognate TE family and TE nucleotide proportion (RTE, see Methods section) of a TE exon, on the exons’ expression level and expression variability. We also investigated the biological implications of an assembly of highly-expressed TE exons. Conclusion: Our analysis confirmed prior studies in the following four respects. First, with relatively high expression variability, most TE exons in mRNAs, especially those without exact counterparts in the UCSC RefSeq (Reference Sequence) gene tables, demonstrate low but still detectable expression levels in most tissue samples. Second, the TE exons in coding DNA sequences (CDSs) are less highly expressed than those in 3′ (5′) untranslated regions (UTRs). Third, the exons derived from chronologically ancient repeat elements, such as MIRs, tend to be highly expressed in comparison with those derived from younger TEs. Fourth, the previously observed negative relationship between the lengths of exons and the inclusion levels in transcripts is also true for exonized TEs. Furthermore, our study resulted in several novel findings. They include: (1) for the TE exons with non-zero expression and as shown in most of the studied biological samples, a high TE nucleotide proportion leads to their lower retention rates in mRNAs; (2) the considered genomic features (i.e. a continuous variable such as the exon length or a category indicator such as 3′UTR) influence the expression level and the expression variability (CV) of TE exons in an inverse manner; (3) not only the exons derived from Alu elements but also the exons from the TEs of other families were preferentially established in zinc finger (ZNF) genes.
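For readers unfamiliar with the two quantities named in this abstract, the following is a minimal sketch of the standard RPKM definition and of the coefficient of variation (CV). It is an illustration only: the abstract does not define the authors' "rescaled" RPKM, so that second normalization step is not reproduced here, and the function names `rpkm` and `coefficient_of_variation` are hypothetical, not taken from the paper.

```python
import statistics


def rpkm(exon_reads: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Standard RPKM: reads per kilobase of exon model per million mapped reads.

    Assumption: this is the textbook definition, not the paper's rescaled variant.
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    # reads / (length in kb) / (mapped reads in millions)
    return exon_reads * 1e9 / (exon_length_bp * total_mapped_reads)


def coefficient_of_variation(values: list[float]) -> float:
    """CV = sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


# Hypothetical numbers: 500 reads on a 2,000 bp exon in a 20-million-read library.
example_rpkm = rpkm(500, 2000, 20_000_000)
# Hypothetical expression values of one exon across three samples.
example_cv = coefficient_of_variation([1.0, 2.0, 3.0])
```

With these made-up inputs, the exon's RPKM is 12.5 and the CV across the three samples is 0.5; comparing CV rather than raw standard deviation is what lets variability be compared between exons with very different mean expression.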
Affiliation(s)
- Wensheng Zhang
- Department of Computer Science, Xavier University of Louisiana, 1 Drexel Drive, New Orleans, LA 70125, USA.
|
25
|
Jacques PÉ, Jeyakani J, Bourque G. The majority of primate-specific regulatory sequences are derived from transposable elements. PLoS Genet 2013; 9:e1003504. [PMID: 23675311 PMCID: PMC3649963 DOI: 10.1371/journal.pgen.1003504] [Citation(s) in RCA: 222] [Impact Index Per Article: 20.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2012] [Accepted: 03/25/2013] [Indexed: 11/18/2022] Open
Abstract
Although emerging evidence suggests that transposable elements (TEs) have contributed novel regulatory elements to the human genome, their global impact on transcriptional networks remains largely uncharacterized. Here we show that TEs have contributed to the human genome nearly half of its active elements. Using DNase I hypersensitivity data sets from ENCODE in normal, embryonic, and cancer cells, we found that 44% of open chromatin regions were in TEs and that this proportion reached 63% for primate-specific regions. We also showed that distinct subfamilies of endogenous retroviruses (ERVs) contributed significantly more accessible regions than expected by chance, with up to 80% of their instances in open chromatin. Based on these results, we further characterized 2,150 TE subfamily-transcription factor pairs that were bound in vivo or enriched for specific binding motifs, and observed that TEs contributing to open chromatin had higher levels of sequence conservation. We also showed that thousands of ERV-derived sequences were activated in a cell type-specific manner, especially in embryonic and cancer cells, and we demonstrated that this activity was associated with cell type-specific expression of neighboring genes. Taken together, these results demonstrate that TEs, and in particular ERVs, have contributed hundreds of thousands of novel regulatory elements to the primate lineage and reshaped the human transcriptional landscape.
Affiliation(s)
- Pierre-Étienne Jacques
- Computational and Systems Biology, Genome Institute of Singapore, Singapore, Singapore
- Département de Biologie, Université de Sherbrooke, Sherbrooke, Québec, Canada
- Justin Jeyakani
- Computational and Systems Biology, Genome Institute of Singapore, Singapore, Singapore
- Guillaume Bourque
- Department of Human Genetics, McGill University, Montréal, Québec, Canada
- McGill University and Génome Québec Innovation Center, Montréal, Québec, Canada
|