1
Buttler CA, Ramirez D, Dowell RD, Chuong EB. An intronic LINE-1 regulates IFNAR1 expression in human immune cells. Mob DNA 2023; 14:20. [PMID: 38037122 PMCID: PMC10688052 DOI: 10.1186/s13100-023-00308-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/05/2023] [Accepted: 11/13/2023] [Indexed: 12/02/2023]
Abstract
BACKGROUND: Despite their origins as selfish parasitic sequences, some transposons in the human genome have been co-opted to serve as regulatory elements, contributing to the evolution of transcriptional networks. Most well-characterized examples of transposon-derived regulatory elements derive from endogenous retroviruses (ERVs), due to the intrinsic regulatory activity of proviral long terminal repeat regions. However, one subclass of transposable elements, the Long Interspersed Nuclear Elements (LINEs), has been largely overlooked in the search for functional regulatory transposons and is considered to be broadly epigenetically repressed.
RESULTS: We examined the chromatin state of LINEs by analyzing epigenomic data from human immune cells. Many LINEs are marked by the repressive H3K9me3 modification, but a subset exhibits evidence of enhancer activity in human immune cells despite also showing evidence of epigenetic repression. We hypothesized that these competing forces of repressive and activating epigenetic marks might lead to inducible enhancer activity. We investigated a specific L1M2a element located within the first intron of Interferon Alpha/Beta Receptor 1 (IFNAR1). This element shows epigenetic signatures of B cell-specific enhancer activity, despite being repressed by the Human Silencing Hub (HUSH) complex. CRISPR deletion of the element in B lymphoblastoid cells revealed that the element acts as an enhancer that regulates both steady state and interferon-inducible expression of IFNAR1.
CONCLUSIONS: Our study experimentally demonstrates that an L1M2a element was co-opted to function as an interferon-inducible enhancer of IFNAR1, creating a feedback loop wherein IFNAR1 is transcriptionally upregulated by interferon signaling. This finding suggests that other LINEs may exhibit cryptic cell type-specific or context-dependent enhancer activity. LINEs have received less attention than ERVs in the effort to understand the contribution of transposons to the regulatory landscape of cellular genomes, but they are likely important, lineage-specific players in the rapid evolution of immune system regulatory networks and deserve further study.
Affiliation(s)
- Carmen A Buttler
  - Department of Molecular, Cellular, and Developmental Biology and BioFrontiers Institute, University of Colorado Boulder, Boulder, CO, 80309, USA
- Daniel Ramirez
  - Department of Molecular, Cellular, and Developmental Biology and BioFrontiers Institute, University of Colorado Boulder, Boulder, CO, 80309, USA
- Robin D Dowell
  - Department of Molecular, Cellular, and Developmental Biology and BioFrontiers Institute, University of Colorado Boulder, Boulder, CO, 80309, USA
- Edward B Chuong
  - Department of Molecular, Cellular, and Developmental Biology and BioFrontiers Institute, University of Colorado Boulder, Boulder, CO, 80309, USA
2
Roller M, Stamper E, Villar D, Izuogu O, Martin F, Redmond AM, Ramachanderan R, Harewood L, Odom DT, Flicek P. LINE retrotransposons characterize mammalian tissue-specific and evolutionarily dynamic regulatory regions. Genome Biol 2021; 22:62. [PMID: 33602314 PMCID: PMC7890895 DOI: 10.1186/s13059-021-02260-y] [Citation(s) in RCA: 32] [Impact Index Per Article: 10.7] [Received: 06/09/2020] [Accepted: 01/04/2021] [Indexed: 12/16/2022]
Abstract
BACKGROUND: To investigate the mechanisms driving regulatory evolution across tissues, we experimentally mapped promoters, enhancers, and gene expression in the liver, brain, muscle, and testis from ten diverse mammals.
RESULTS: The regulatory landscape around genes included both tissue-shared and tissue-specific regulatory regions, where tissue-specific promoters and enhancers evolved most rapidly. Genomic regions switching between promoters and enhancers were more common across species, and less common across tissues within a single species. Long Interspersed Nuclear Elements (LINEs) played recurrent evolutionary roles: LINE L1s were associated with tissue-specific regulatory regions, whereas more ancient LINE L2s were associated with tissue-shared regulatory regions and with those switching between promoter and enhancer signatures across species.
CONCLUSIONS: Our analyses of the tissue-specificity and evolutionary stability among promoters and enhancers reveal how specific LINE families have helped shape the dynamic mammalian regulome.
Affiliation(s)
- Maša Roller
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Ericca Stamper
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
  - Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, CB2 0RE, UK
  - Present address: Harriet L. Wilkes Honors College, Florida Atlantic University, Jupiter, FL, 33458, USA
- Diego Villar
  - Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, CB2 0RE, UK
  - Present address: Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, E1 2AT, UK
- Osagie Izuogu
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Fergal Martin
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Aisling M Redmond
  - Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, CB2 0RE, UK
  - Present address: MRC Cancer Unit, Hutchison-MRC Research Centre, University of Cambridge, Cambridge, CB2 0XZ, UK
- Raghavendra Ramachanderan
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
- Louise Harewood
  - Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, CB2 0RE, UK
  - Present address: Precision Medicine Centre of Excellence, Queen's University Belfast, Belfast, BT9 7AE, UK
- Duncan T Odom
  - Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, CB2 0RE, UK
  - German Cancer Research Center (DKFZ), Division of Regulatory Genomics and Cancer Evolution, Im Neuenheimer Feld 280, 69120, Heidelberg, Germany
- Paul Flicek
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
  - Cancer Research UK Cambridge Institute, University of Cambridge, Robinson Way, Cambridge, CB2 0RE, UK
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, UK
3
Van der Mude A. Structure encoding in DNA. J Theor Biol 2020; 492:110205. [PMID: 32070719 DOI: 10.1016/j.jtbi.2020.110205] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/16/2019] [Revised: 12/29/2019] [Accepted: 02/14/2020] [Indexed: 12/21/2022]
Abstract
It is proposed that transposons and related long non-coding RNA define the fine structure of body parts. Although morphogens have long been known to direct the formation of many gross structures in early embryonic development, they do not have the necessary precision to define a structure down to the individual cellular level. Using the distinction between procedural and declarative knowledge in information processing as an analogy, it is hypothesized that DNA encodes fine structure in a manner that is different from the genetic code for proteins. The hypothesis states that repeated or near-repeated sequences that are in transposons and non-coding RNA define body part structures. As the cells in a body part go through the epigenetic process of differentiation, the action of methylation serves to inactivate all but the relevant structure definitions and some associated cell type genes. The transposons left active will then physically modify the DNA sequence in the heterochromatin to establish the local context in the three-dimensional body part structure. This brings the encoded definition of the cell type to the histone. The histone code for that cell type starts the regulatory cascade that turns on the genes associated with that particular type of cell, transforming it from a multipotent cell to a fully differentiated cell. This mechanism creates structures in the musculoskeletal system, the organs of the body, the major parts of the brain, and other systems.
4
Clayton EA, Rishishwar L, Huang TC, Gulati S, Ban D, McDonald JF, Jordan IK. An atlas of transposable element-derived alternative splicing in cancer. Philos Trans R Soc Lond B Biol Sci 2020; 375:20190342. [PMID: 32075558 PMCID: PMC7061986 DOI: 10.1098/rstb.2019.0342] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Accepted: 11/06/2019] [Indexed: 12/18/2022]
Abstract
Transposable element (TE)-derived sequences comprise more than half of the human genome, and their presence has been documented to alter gene expression in a number of different ways, including the generation of alternatively spliced transcript isoforms. Alternative splicing has been associated with tumorigenesis for a number of different cancers. The objective of this study was to broadly characterize the role of human TEs in generating alternatively spliced transcript isoforms in cancer. To do so, we screened for the presence of TE-derived sequences co-located with alternative splice sites that are differentially used in normal versus cancer tissues. We analysed a comprehensive set of alternative splice variants characterized for 614 matched normal-tumour tissue pairs across 13 cancer types, resulting in the discovery of 4820 TE-generated alternative splice events distributed among 723 cancer-associated genes. Short interspersed nuclear elements (Alu) and long interspersed nuclear elements (L1) were found to contribute the majority of TE-generated alternative splice sites in cancer genes. A number of cancer-associated genes, including MYH11, WHSC1 and CANT1, were shown to have overexpressed TE-derived isoforms across a range of cancer types. TE-derived isoforms were also linked to cancer-specific fusion transcripts, suggesting a novel mechanism for the generation of transcriptome diversity via trans-splicing mediated by dispersed TE repeats. This article is part of a discussion meeting issue 'Crossroads between transposons and gene regulation'.
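The co-location screen described in this abstract (TE-derived sequences overlapping differentially used splice sites) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `sites_in_tes`, the example coordinates, and the assumptions of a single chromosome, half-open intervals, and non-overlapping TE annotations are all hypothetical choices made for illustration.

```python
from bisect import bisect_right


def sites_in_tes(te_intervals, splice_sites):
    """Return the splice-site positions that fall inside a TE interval.

    te_intervals: list of (start, end) pairs, half-open [start, end),
        assumed non-overlapping and on a single chromosome (hypothetical input).
    splice_sites: iterable of integer positions on the same chromosome.
    """
    intervals = sorted(te_intervals)
    starts = [start for start, _ in intervals]
    hits = []
    for pos in splice_sites:
        # Index of the last interval whose start is <= pos, or -1 if none.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < intervals[i][1]:
            hits.append(pos)
    return hits


# Hypothetical coordinates, for illustration only.
example_tes = [(100, 200), (500, 650)]
example_sites = [50, 150, 200, 640, 700]
example_hits = sites_in_tes(example_tes, example_sites)
```

Sorting the interval starts once and using binary search keeps each lookup logarithmic in the number of TE annotations, which matters when annotations number in the millions; a real analysis would also need per-chromosome and strand handling.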
Affiliation(s)
- Evan A. Clayton
  - Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- Lavanya Rishishwar
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
  - PanAmerican Bioinformatics Institute, Cali, Colombia
  - Applied Bioinformatics Laboratory, Atlanta, GA, USA
- Tzu-Chuan Huang
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- Saurabh Gulati
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- Dongjo Ban
  - Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- John F. McDonald
  - Integrated Cancer Research Center, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
- I. King Jordan
  - School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA
  - PanAmerican Bioinformatics Institute, Cali, Colombia
  - Applied Bioinformatics Laboratory, Atlanta, GA, USA
5
Specific subfamilies of transposable elements contribute to different domains of T lymphocyte enhancers. Proc Natl Acad Sci U S A 2020; 117:7905-7916. [PMID: 32193341 PMCID: PMC7148579 DOI: 10.1073/pnas.1912008117] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Indexed: 12/30/2022]
Abstract
Transposable elements (TEs) compose nearly half of mammalian genomes and provide building blocks for cis-regulatory elements. Using high-throughput sequencing, we show that 84 TE subfamilies are overrepresented, and distributed in a lineage-specific fashion in core and boundary domains of CD8+ T cell enhancers. Endogenous retroviruses are most significantly enriched in core domains with accessible chromatin, and bear recognition motifs for immune-related transcription factors. In contrast, short interspersed elements (SINEs) are preferentially overrepresented in nucleosome-containing boundaries. A substantial proportion of these SINEs harbor a high density of the enhancer-specific histone mark H3K4me1 and carry sequences that match enhancer boundary nucleotide composition. Motifs with regulatory features are better preserved within enhancer-enriched TE copies compared to their subfamily equivalents located in gene deserts. TE-rich and TE-poor enhancers associate with both shared and unique gene groups and are enriched in overlapping functions related to lymphocyte and leukocyte biology. The majority of T cell enhancers are shared with other immune lineages and are accessible in common hematopoietic progenitors. A higher proportion of immune tissue-specific enhancers are TE-rich compared to enhancers specific to other tissues, correlating with higher TE occurrence in immune gene-associated genomic regions. Our results suggest that during evolution, TEs abundant in these regions and carrying motifs potentially beneficial for enhancer architecture and immune functions were particularly frequently incorporated by evolving enhancers. Their putative selection and regulatory cooption may have accelerated the evolution of immune regulatory networks.
6
Yang WR, Ardeljan D, Pacyna CN, Payer LM, Burns KH. SQuIRE reveals locus-specific regulation of interspersed repeat expression. Nucleic Acids Res 2019; 47:e27. [PMID: 30624635 PMCID: PMC6411935 DOI: 10.1093/nar/gky1301] [Citation(s) in RCA: 93] [Impact Index Per Article: 18.6] [Received: 06/05/2018] [Revised: 12/18/2018] [Accepted: 01/03/2019] [Indexed: 12/13/2022]
Abstract
Transposable elements (TEs) are interspersed repeat sequences that make up much of the human genome. Their expression has been implicated in development and disease. However, TE-derived RNA-seq reads are difficult to quantify. Past approaches have excluded these reads or aggregated RNA expression to subfamilies shared by similar TE copies, sacrificing quantitative accuracy or the genomic context necessary to understand the basis of TE transcription. As a result, the effects of TEs on gene expression and associated phenotypes are not well understood. Here, we present Software for Quantifying Interspersed Repeat Expression (SQuIRE), the first RNA-seq analysis pipeline that provides a quantitative and locus-specific picture of TE expression (https://github.com/wyang17/SQuIRE). SQuIRE is an accurate and user-friendly tool that can be used for a variety of species. We applied SQuIRE to RNA-seq from normal mouse tissues and a Drosophila model of amyotrophic lateral sclerosis. In both model organisms, we recapitulated previously reported TE subfamily expression levels and revealed locus-specific TE expression. We also identified differences in TE transcription patterns relating to transcript type, gene expression and RNA splicing that would be lost with other approaches using subfamily-level analyses. Altogether, our findings illustrate the importance of studying TE transcription with locus-level resolution.
Affiliation(s)
- Wan R Yang
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Daniel Ardeljan
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
  - McKusick-Nathans Institute of Genetics, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Clarissa N Pacyna
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
  - Thomas C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, MD, USA
- Lindsay M Payer
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
- Kathleen H Burns
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
  - McKusick-Nathans Institute of Genetics, Johns Hopkins University School of Medicine, Baltimore, MD 21205, USA
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, Baltimore, MD, USA
7
Jiang JC, Upton KR. Human transposons are an abundant supply of transcription factor binding sites and promoter activities in breast cancer cell lines. Mob DNA 2019; 10:16. [PMID: 31061680 PMCID: PMC6486989 DOI: 10.1186/s13100-019-0158-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 09/26/2018] [Accepted: 04/01/2019] [Indexed: 12/22/2022]
Abstract
BACKGROUND: Transposable elements (TEs) are commonly regarded as "junk DNA" with no apparent regulatory roles in the human genome. However, a growing body of evidence demonstrates that some TEs exhibit regulatory activities in a range of biological pathways and diseases, with notable examples in bile metabolism and innate immunity. TEs are typically suppressed by epigenetic modifications in healthy somatic tissues, which prevents both the undesirable effects of insertional mutagenesis and unwanted gene activation. Interestingly, TEs are widely reported to be dysregulated in epithelial cancers, and while much attention has been paid to their effects on genome instability, relatively little has been reported on their effects on gene regulation. Here, we investigated the contribution of TEs to transcriptional regulation in breast cancer cell lines.
RESULTS: We found that a subset of TE subfamilies was enriched in oncogenic transcription factor binding sites and also harboured histone marks associated with active transcription, raising the possibility of these subfamilies playing a broad role in breast cancer transcriptional regulation. To directly assess promoter activity in triple negative breast cancer cell lines, we identified four breast cancer-associated genes with putative TE-derived promoters. TE deletion confirmed a contribution to promoter activity in all cases, and for two examples the promoter activity was almost completely contained within the TE.
CONCLUSIONS: Our findings demonstrate that TEs provide abundant oncogenic transcription factor binding sites in breast cancer and that individual TEs contain substantial promoter activity. Our findings provide further evidence for transcriptional regulation of human genes through TE exaptation by demonstrating the regulatory potential of TEs in multiple breast cancer cell lines.
Affiliation(s)
- Jiayue-Clara Jiang
  - School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072 Australia
- Kyle R Upton
  - School of Chemistry and Molecular Biosciences, The University of Queensland, St Lucia, QLD 4072 Australia
8
Abstract
Transposable elements (TEs) are low-complexity elements (e.g., LINEs, SINEs, SVAs, and HERVs) that make up as much as two-thirds of the human genome. There is mounting evidence that TEs play an essential role in molecular functions that influence genomic plasticity and gene expression regulation. With the advent of next-generation sequencing approaches, our understanding of the relationship between TEs and psychiatric disorders will greatly improve. In this chapter, the authors comprehensively summarize the state of the art of TE research in animal models and humans, supporting a framework in which TEs play a functional role in mechanisms affecting a variety of behaviors, including neurodevelopmental, neuropsychiatric, and neurodegenerative disorders. Finally, the authors discuss recent therapeutic applications arising from the increasing experimental evidence on TE functional mechanisms.
Affiliation(s)
- G Guffanti
  - McLean Hospital - Harvard Medical School, Belmont, MA, USA
- A Bartlett
  - Department of Psychology, University of Massachusetts, Boston, Boston, MA, USA
- P DeCrescenzo
  - McLean Hospital - Harvard Medical School, Belmont, MA, USA
- F Macciardi
  - Department of Psychiatry and Human Behavior, University of California, Irvine, Irvine, CA, USA
- R Hunter
  - Department of Psychology, University of Massachusetts, Boston, Boston, MA, USA
9
Exaptation at the molecular genetic level. SCIENCE CHINA-LIFE SCIENCES 2018; 62:437-452. [PMID: 30798493 DOI: 10.1007/s11427-018-9447-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 11/10/2018] [Accepted: 12/01/2018] [Indexed: 12/22/2022]
Abstract
The realization that body parts of animals and plants can be recruited or coopted for novel functions dates back to, or even predates, the observations of Darwin. S.J. Gould and E.S. Vrba recognized a mode of evolution of characters that differs from adaptation. The umbrella term aptation was supplemented with the concept of exaptation. Unlike adaptations, which are restricted to features built by selection for their current role, exaptations are features that currently enhance fitness, even though their present role was not a result of natural selection. Exaptations can also arise from nonaptations; these are characters which had previously been evolving neutrally. All nonaptations are potential exaptations. The concept of exaptation was expanded to the molecular genetic level, which aided greatly in understanding the enormous potential of neutrally evolving repetitive DNA (including transposed elements, formerly considered junk DNA) for the evolution of genes and genomes. The distinction between adaptations and exaptations is outlined in this review and examples are given. Also elaborated on is the fact that such distinctions are sometimes more difficult to determine; this is a widespread phenomenon in biology, where continua abound and clear borders between states and definitions are rare.
10
Cao Y, Chen G, Wu G, Zhang X, McDermott J, Chen X, Xu C, Jiang Q, Chen Z, Zeng Y, Ai D, Huang Y, Han JDJ. Widespread roles of enhancer-like transposable elements in cell identity and long-range genomic interactions. Genome Res 2018; 29:40-52. [PMID: 30455182 PMCID: PMC6314169 DOI: 10.1101/gr.235747.118] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Received: 02/06/2018] [Accepted: 11/12/2018] [Indexed: 01/29/2023]
Abstract
A few families of transposable elements (TEs) have been shown to evolve into cis-regulatory elements (CREs). Here, to extend these studies to all classes of TEs in the human genome, we identified widespread enhancer-like repeats (ELRs) and found that ELRs reliably mark cell identities, are enriched for lineage-specific master transcription factor binding sites, and are mostly primate-specific. In particular, elements of the MIR and L2 TE families, whose abundance co-evolved across chordate genomes, are found as ELRs in most human cell types examined. MIR and L2 elements frequently share long-range intra-chromosomal interactions and binding of physically interacting transcription factors. We validated that eight L2 and nine MIR elements function as enhancers in reporter assays, and among 20 MIR-L2 pairings, one MIR repressed and one boosted the enhancer activity of L2 elements. Our results reveal a previously unappreciated co-evolution and interaction between two TE families in shaping regulatory networks.
Affiliation(s)
- Yaqiang Cao
  - CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Guoyu Chen
  - CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Gang Wu
  - CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Xiaoli Zhang
  - CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Joseph McDermott
  - CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Xingwei Chen
  - CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Chi Xu
  - CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Quanlong Jiang
  - CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Zhaoxiong Chen
  - CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Yingying Zeng
  - CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
  - School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Daosheng Ai
  - CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Yi Huang
  - CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
- Jing-Dong J Han
  - CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China
11
Simonti CN, Pavličev M, Capra JA. Transposable Element Exaptation into Regulatory Regions Is Rare, Influenced by Evolutionary Age, and Subject to Pleiotropic Constraints. Mol Biol Evol 2017; 34:2856-2869. [PMID: 28961735 PMCID: PMC5850124 DOI: 10.1093/molbev/msx219] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Indexed: 12/28/2022]
Abstract
Transposable element (TE)-derived sequences make up approximately half of most mammalian genomes, and many TEs have been co-opted into gene regulatory elements. However, we lack a comprehensive tissue- and genome-wide understanding of how and when TEs gain regulatory activity in their hosts. We evaluated the prevalence of TE-derived DNA in enhancers and promoters across hundreds of human and mouse cell lines and primary tissues. Promoters are significantly depleted of TEs in all tissues compared with their overall prevalence in the genome (P < 0.001); enhancers are also depleted of TEs, though not as strongly as promoters. The degree of enhancer depletion also varies across contexts (1.5-3×), with reproductive and immune cells showing the highest levels of TE regulatory activity in humans. Overall, in spite of the regulatory potential of many TE sequences, they are significantly less active in gene regulation than expected from their prevalence. TE age is predictive of the likelihood of enhancer activity; TEs originating before the divergence of amniotes are 9.2 times more likely to have enhancer activity than TEs that integrated in great apes. Context-specific enhancers are more likely to be TE-derived than enhancers active in multiple tissues, and young TEs are more likely to overlap context-specific enhancers than old TEs (86% vs. 47%). Once TEs obtain enhancer activity in the host, they have similar functional dynamics to one another and non-TE-derived enhancers, likely driven by pleiotropic constraints. However, a few TE families, most notably endogenous retroviruses, have greater regulatory potential. Our observations suggest a model of regulatory co-option in which TE-derived sequences are initially repressed, after which a small fraction obtains context-specific enhancer activity, with further gains subject to pleiotropic constraints.
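The depletion figures quoted in this abstract compare TE prevalence in regulatory regions with TE prevalence genome-wide. A minimal sketch of one way such a ratio could be computed follows; the function name `fold_depletion`, its parameters, and the example numbers are hypothetical, and the authors' actual statistic and significance testing may differ.

```python
def fold_depletion(te_bp_in_regions, total_region_bp, genome_te_fraction):
    """Ratio of the genome-wide TE fraction to the TE fraction within regions.

    A value above 1 means the regions contain proportionally less TE-derived
    sequence than the genome as a whole (depletion); below 1 means enrichment.
    All inputs are hypothetical summary counts, not data from the paper.
    """
    if total_region_bp <= 0:
        raise ValueError("total_region_bp must be positive")
    observed_fraction = te_bp_in_regions / total_region_bp
    if observed_fraction == 0:
        # No TE-derived bases in the regions at all: unbounded depletion.
        return float("inf")
    return genome_te_fraction / observed_fraction


# Illustration: 250 of 1000 enhancer bases are TE-derived while half of the
# genome is TE-derived, so enhancers are 2-fold depleted in this toy case.
example_depletion = fold_depletion(250, 1000, 0.5)
```

A point estimate like this says nothing about significance; a permutation or shuffled-interval background would be needed to attach a P value.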
Affiliation(s)
- Mihaela Pavličev
  - Center for Prevention of Preterm Birth, Perinatal Institute, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH
  - Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH
- John A. Capra
  - Vanderbilt Genetics Institute, Vanderbilt University, Nashville, TN
  - Department of Biological Sciences, Vanderbilt University, Nashville, TN
12
Abstract
A significant part of eukaryotic genomes is formed by transposable elements (TEs) containing not only genes but also regulatory sequences. Some of the regulatory sequences located within TEs can form secondary structures like hairpins or three-stranded (triplex DNA) and four-stranded (quadruplex DNA) conformations. This review focuses on recent evidence showing that G-quadruplex-forming sequences in particular are often present in specific parts of TEs in plants and humans. We discuss the potential role of these structures in the TE life cycle as well as the impact of G-quadruplexes on replication, transcription, translation, chromatin status, and recombination. The aim of this review is to emphasize that TEs may serve as vehicles for the genomic spread of G-quadruplexes. These non-canonical DNA structures and their conformational switches may constitute another regulatory system that, together with small and long non-coding RNA molecules and proteins, contributes to the complex cellular network resulting in the large diversity of eukaryotes.
13
Abstract
Insulators are regulatory elements that help to organize eukaryotic chromatin via enhancer-blocking and chromatin barrier activity. Although there are several examples of transposable element (TE)-derived insulators, the contribution of TEs to human insulators has not been systematically explored. Mammalian-wide interspersed repeats (MIRs) are a conserved family of TEs that have substantial regulatory capacity and share sequence characteristics with tRNA-related insulators. We sought to evaluate whether MIRs can serve as insulators in the human genome. We applied a bioinformatic screen using genome sequence and functional genomic data from CD4(+) T cells to identify a set of 1,178 predicted MIR insulators genome-wide. These predicted MIR insulators were computationally tested to serve as chromatin barriers and regulators of gene expression in CD4(+) T cells. The activity of predicted MIR insulators was experimentally validated using in vitro and in vivo enhancer-blocking assays. MIR insulators are enriched around genes of the T-cell receptor pathway and reside at T-cell-specific boundaries of repressive and active chromatin. A total of 58% of the MIR insulators predicted here show evidence of T-cell-specific chromatin barrier and gene regulatory activity. MIR insulators appear to be CCCTC-binding factor (CTCF) independent and show a distinct local chromatin environment with marked peaks for RNA Pol III and a number of histone modifications, suggesting that MIR insulators recruit transcriptional complexes and chromatin modifying enzymes in situ to help establish chromatin and regulatory domains in the human genome. The provisioning of insulators by MIRs across the human genome suggests a specific mechanism by which TE sequences can be used to modulate gene regulatory networks.
14
Spouge JL, Mariño-Ramírez L, Sheetlin SL. Searching for repeats, as an example of using the generalised Ruzzo-Tompa algorithm to find optimal subsequences with gaps. International Journal of Bioinformatics Research and Applications 2014; 10:384-408. [PMID: 24989859 DOI: 10.1504/ijbra.2014.062991] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 11/21/2022]
Abstract
Some biological sequences contain subsequences of unusual composition; e.g. some proteins contain DNA binding domains, transmembrane regions and charged regions, and some DNA sequences contain repeats. The linear-time Ruzzo-Tompa (RT) algorithm finds subsequences of unusual composition, using a sequence of scores as input and the corresponding 'maximal segments' as output. In principle, permitting gaps in the output subsequences could improve sensitivity. Here, the input of the RT algorithm is generalised to a finite, totally ordered, weighted graph, so the algorithm locates paths of maximal weight through increasing but not necessarily adjacent vertices. By permitting the penalised deletion of unfavourable letters, the generalisation therefore includes gaps. The program RepWords, which finds inexact simple repeats in DNA, exemplifies the general concepts by out-performing a similar extant, ad hoc tool. With minimal programming effort, the generalised Ruzzo-Tompa algorithm could improve the performance of many programs for finding biological subsequences of unusual composition.
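As a rough illustration of the input and output described in this abstract, the following is a minimal Python sketch of the classic (ungapped) Ruzzo-Tompa procedure: it takes a sequence of scores and returns the maximal scoring segments. It is not the authors' RepWords program and does not implement the gapped, graph-based generalisation. The function name `ruzzo_tompa` and the example scores are assumptions made for this sketch. For readability the leftward search is a plain scan, so this version is not guaranteed to run in linear time.

```python
def ruzzo_tompa(scores):
    """Return all maximal scoring segments as (start, end, score), end exclusive.

    Hypothetical helper for illustration: classic ungapped Ruzzo-Tompa.
    """
    # Each candidate is [start, end, L, R], where L is the cumulative score
    # just before the segment and R is the cumulative score at its end.
    candidates = []
    total = 0
    for i, s in enumerate(scores):
        before = total
        total += s
        if s <= 0:
            continue  # non-positive scores never start a segment
        cand = [i, i + 1, before, total]
        while True:
            # Search right-to-left for the nearest candidate with a smaller L.
            j = len(candidates) - 1
            while j >= 0 and candidates[j][2] >= cand[2]:
                j -= 1
            # No such candidate, or it already reaches at least as high:
            # the new segment stands on its own.
            if j < 0 or candidates[j][3] >= cand[3]:
                candidates.append(cand)
                break
            # Otherwise merge: extend back to candidate j and drop j onward.
            cand = [candidates[j][0], cand[1], candidates[j][2], cand[3]]
            del candidates[j:]
    return [(a, b, r - l) for a, b, l, r in candidates]


# Illustrative scores, not taken from the paper's data.
segments = ruzzo_tompa([4, -5, 3, -3, 1, 2, -2, 2, -2, 1, 5])
print(segments)  # [(0, 1, 4), (2, 3, 3), (4, 11, 7)]
```

In a repeat-finding setting, each score would come from a per-letter scoring scheme; the choice of scheme is outside the scope of this sketch.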
Affiliation(s)
- John L Spouge
- Computational Biology Branch, National Center for Biotechnology Information, Bethesda, MD 20894, USA
- Leonardo Mariño-Ramírez
- Computational Biology Branch, National Center for Biotechnology Information, Bethesda, MD 20894, USA
- Sergey L Sheetlin
- Computational Biology Branch, National Center for Biotechnology Information, Bethesda, MD 20894, USA
15
Johnson R, Guigó R. The RIDL hypothesis: transposable elements as functional domains of long noncoding RNAs. RNA (New York, N.Y.) 2014; 20:959-76. [PMID: 24850885 PMCID: PMC4114693 DOI: 10.1261/rna.044560.114] [Citation(s) in RCA: 197] [Impact Index Per Article: 19.7] [Indexed: 05/08/2023]
Abstract
Our genome contains tens of thousands of long noncoding RNAs (lncRNAs), many of which are likely to have genetic regulatory functions. It has been proposed that lncRNA are organized into combinations of discrete functional domains, but the nature of these and their identification remain elusive. One class of sequence elements that is enriched in lncRNA is represented by transposable elements (TEs), repetitive mobile genetic sequences that have contributed widely to genome evolution through a process termed exaptation. Here, we link these two concepts by proposing that exonic TEs act as RNA domains that are essential for lncRNA function. We term such elements Repeat Insertion Domains of LncRNAs (RIDLs). A growing number of RIDLs have been experimentally defined, where TE-derived fragments of lncRNA act as RNA-, DNA-, and protein-binding domains. We propose that these reflect a more general phenomenon of exaptation during lncRNA evolution, where inserted TE sequences are repurposed as recognition sites for both protein and nucleic acids. We discuss a series of genomic screens that may be used in the future to systematically discover RIDLs. The RIDL hypothesis has the potential to explain how functional evolution can keep pace with the rapid gene evolution observed in lncRNA. More practically, TE maps may in the future be used to predict lncRNA function.
Affiliation(s)
- Rory Johnson
- Centre for Genomic Regulation (CRG), 08003 Barcelona, Spain
- Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- Institut Hospital del Mar d'Investigacions Mèdiques (IMIM), 08003 Barcelona, Spain
- Corresponding author
- Roderic Guigó
- Centre for Genomic Regulation (CRG), 08003 Barcelona, Spain
- Universitat Pompeu Fabra (UPF), 08003 Barcelona, Spain
- Institut Hospital del Mar d'Investigacions Mèdiques (IMIM), 08003 Barcelona, Spain
16
Jjingo D, Conley AB, Wang J, Mariño-Ramírez L, Lunyak VV, Jordan IK. Mammalian-wide interspersed repeat (MIR)-derived enhancers and the regulation of human gene expression. Mob DNA 2014; 5:14. [PMID: 25018785 PMCID: PMC4090950 DOI: 10.1186/1759-8753-5-14] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.7] [Received: 08/30/2013] [Accepted: 04/10/2014] [Indexed: 11/26/2022] Open
Abstract
Background Mammalian-wide interspersed repeats (MIRs) are the most ancient family of transposable elements (TEs) in the human genome. The deep conservation of MIRs initially suggested the possibility that they had been exapted to play functional roles for their host genomes. MIRs also happen to be the only TEs whose presence in-and-around human genes is positively correlated to tissue-specific gene expression. Similar associations of enhancer prevalence within genes and tissue-specific expression, along with MIRs’ previous implication as providing regulatory sequences, suggested a possible link between MIRs and enhancers. Results To test the possibility that MIRs contribute functional enhancers to the human genome, we evaluated the relationship between MIRs and human tissue-specific enhancers in terms of genomic location, chromatin environment, regulatory function, and mechanistic attributes. This analysis revealed MIRs to be highly concentrated in enhancers of the K562 and HeLa human cell-types. Significantly more enhancers were found to be linked to MIRs than would be expected by chance, and putative MIR-derived enhancers are characterized by a chromatin environment highly similar to that of canonical enhancers. MIR-derived enhancers show strong associations with gene expression levels, tissue-specific gene expression and tissue-specific cellular functions, including a number of biological processes related to erythropoiesis. MIR-derived enhancers were found to be a rich source of transcription factor binding sites, underscoring one possible mechanistic route for the element sequences’ co-option as enhancers. There is also tentative evidence to suggest that MIR-enhancer function is related to the transcriptional activity of non-coding RNAs.
Conclusions Taken together, these data reveal enhancers to be an important cis-regulatory platform from which MIRs can exercise a regulatory function in the human genome and help to resolve a long-standing conundrum as to the reason for MIRs’ deep evolutionary conservation.
Affiliation(s)
- Daudi Jjingo
- School of Biology, Georgia Institute of Technology, Atlanta, GA, USA
- Andrew B Conley
- School of Biology, Georgia Institute of Technology, Atlanta, GA, USA
- Jianrong Wang
- School of Biology, Georgia Institute of Technology, Atlanta, GA, USA
- Leonardo Mariño-Ramírez
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA; PanAmerican Bioinformatics Institute, Santa Marta, Magdalena, Colombia
- Victoria V Lunyak
- PanAmerican Bioinformatics Institute, Santa Marta, Magdalena, Colombia; Buck Institute for Research on Aging, Novato, CA, USA
- I King Jordan
- School of Biology, Georgia Institute of Technology, Atlanta, GA, USA; PanAmerican Bioinformatics Institute, Santa Marta, Magdalena, Colombia
17
Lv J, Liu H, Huang Z, Su J, He H, Xiu Y, Zhang Y, Wu Q. Long non-coding RNA identification over mouse brain development by integrative modeling of chromatin and genomic features. Nucleic Acids Res 2013; 41:10044-61. [PMID: 24038472 PMCID: PMC3905897 DOI: 10.1093/nar/gkt818] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.3] [Indexed: 12/21/2022] Open
Abstract
In silico prediction of genomic long non-coding RNAs (lncRNAs) is a prerequisite for constructing and elucidating non-coding regulatory networks. Chromatin modifications marked by chromatin regulators are important epigenetic features that can be captured by prevailing high-throughput approaches such as ChIP sequencing. We demonstrate that the accuracy of lncRNA predictions can be greatly improved by incorporating high-throughput chromatin modification data, covering mouse embryonic stem cell differentiation toward adult cerebellum, into a logistic regression model with LASSO regularization. The discriminating features include H3K9me3, H3K27ac, H3K4me1, open reading frames and several repeat elements. Importantly, chromatin information is suggested to be complementary to genomic sequence information, highlighting the importance of an integrated model. Applying the integrated model, we obtain a list of putative lncRNAs based on uncharacterized fragments from transcriptome assembly. We demonstrate, by expression and Gene Ontology enrichment analysis, that the putative lncRNAs have regulatory roles in the vicinity of known gene loci. We also show that lncRNA expression specificity can be efficiently modeled by chromatin data from the same developmental stage. The study not only supports the biological hypothesis that chromatin can regulate expression of tissue-specific or developmental stage-specific lncRNAs but also reveals the features that discriminate lncRNAs from coding genes, which should guide further lncRNA identification and characterization.
Affiliation(s)
- Jie Lv
- School of Life Science and Technology, State Key Laboratory of Urban Water Resource and Environment, Harbin Institute of Technology, Harbin 150001, China and College of Bioinformatics Science and Technology, Harbin Medical University, Harbin 150081, China
18
de Souza FS, Franchini LF, Rubinstein M. Exaptation of transposable elements into novel cis-regulatory elements: is the evidence always strong? Mol Biol Evol 2013; 30:1239-51. [PMID: 23486611 PMCID: PMC3649676 DOI: 10.1093/molbev/mst045] [Citation(s) in RCA: 117] [Impact Index Per Article: 10.6] [Indexed: 12/25/2022] Open
Abstract
Transposable elements (TEs) are mobile genetic sequences that can jump around the genome from one location to another, behaving as genomic parasites. TEs have been particularly effective in colonizing mammalian genomes, and such a heavy TE load is expected to have conditioned genome evolution. Indeed, studies conducted both at the gene and genome levels have uncovered TE insertions that seem to have been co-opted--or exapted--by providing transcription factor binding sites (TFBSs) that serve as promoters and enhancers, leading to the hypothesis that TE exaptation is a major factor in the evolution of gene regulation. Here, we critically review the evidence for exaptation of TE-derived sequences as TFBSs, promoters, enhancers, and silencers/insulators both at the gene and genome levels. We classify the functional impact attributed to TE insertions into four categories of increasing complexity and argue that so far very few studies have conclusively demonstrated exaptation of TEs as transcriptional regulatory regions. We also contend that many genome-wide studies dealing with TE exaptation in recent lineages of mammals are still inconclusive and that the hypothesis of rapid transcriptional regulatory rewiring mediated by TE mobilization must be taken with caution. Finally, we suggest experimental approaches that may help attribute higher-order functions to candidate exapted TEs.
Affiliation(s)
- Flávio S.J. de Souza
- Instituto de Investigaciones en Ingeniería Genética y Biología Molecular, Consejo Nacional de Investigaciones Científicas y Técnicas, Buenos Aires, Argentina
- Departamento de Fisiología, Biología Molecular y Celular, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
- Lucía F. Franchini
- Instituto de Investigaciones en Ingeniería Genética y Biología Molecular, Consejo Nacional de Investigaciones Científicas y Técnicas, Buenos Aires, Argentina
- Marcelo Rubinstein
- Instituto de Investigaciones en Ingeniería Genética y Biología Molecular, Consejo Nacional de Investigaciones Científicas y Técnicas, Buenos Aires, Argentina
- Departamento de Fisiología, Biología Molecular y Celular, Facultad de Ciencias Exactas y Naturales, Universidad de Buenos Aires, Buenos Aires, Argentina
19
Huda A, Bushel PR. Widespread Exonization of Transposable Elements in Human Coding Sequences is Associated with Epigenetic Regulation of Transcription. Transcriptomics 2013; 1. [PMID: 24860841 PMCID: PMC4028971 DOI: 10.4172/2329-8936.1000101] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 11/09/2022]
Abstract
Background Transposable Elements (TEs) have long been regarded as selfish or junk DNA having little or no role in the regulation or functioning of the human genome. However, over the past several years this view has been challenged as several studies provided anecdotal as well as global evidence for the contribution of TEs to the regulatory and coding needs of human genes. In this study, we explored the incorporation and epigenetic regulation of coding sequences donated by TEs using gene expression and other ancillary genomics data from two human hematopoietic cell lines: GM12878 (a lymphoblastoid cell line) and K562 (a chronic myelogenous leukemia cell line). In each cell line, we found several thousand instances of TEs donating coding sequences to human genes. We compared transcriptome assemblies of the RNA sequencing (RNA-Seq) reads built with and without the aid of a reference transcriptome and found that the percentage of genes that incorporate TEs in their coding sequences is significantly greater than that obtained from the reference transcriptome assemblies using RefSeq and GENCODE gene models. We also used histone modification chromatin immunoprecipitation sequencing (ChIP-Seq) data, Cap Analysis of Gene Expression (CAGE) data and DNase I Hypersensitivity Site (DHS) data to demonstrate the epigenetic regulation of the TE-derived coding sequences. Our results suggest that TEs form a significantly higher percentage of coding sequences than is represented in gene annotation databases and that these TE-derived sequences are epigenetically regulated in accordance with their expression in the two cell types.
Affiliation(s)
- Ahsan Huda
- Microarray and Genome Informatics Group, National Institute of Environmental Health Sciences, USA; Kelly Government Solutions, Inc., USA
- Pierre R Bushel
- Microarray and Genome Informatics Group, National Institute of Environmental Health Sciences, USA; Biostatistics Branch, National Institute of Environmental Health Sciences, Research Triangle Park, NC 27709, USA
20
Kim YJ, Lee J, Han K. Transposable Elements: No More 'Junk DNA'. Genomics Inform 2012; 10:226-33. [PMID: 23346034 PMCID: PMC3543922 DOI: 10.5808/gi.2012.10.4.226] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.3] [Received: 11/01/2012] [Revised: 11/16/2012] [Accepted: 11/17/2012] [Indexed: 01/03/2023] Open
Abstract
Since the advent of whole-genome sequencing, transposable elements (TEs), once regarded as mere 'junk' DNA, have attracted attention because of their high copy numbers in various eukaryotic genomes. Many studies have been conducted to discover the functions of TEs in their host genomes. Based on the results of those studies, it is generally accepted that TEs cause genomic and genetic variation. However, their many functions have not been fully elucidated. Through various mechanisms, including de novo TE insertions, TE insertion-mediated deletions, and recombination events, they manipulate their host genomes. In this review, we focus on Alu, L1, human endogenous retrovirus, and short interspersed element/variable number of tandem repeats/Alu (SVA) elements and discuss how they have affected primate genomes, especially the human and chimpanzee genomes, since their divergence.
Affiliation(s)
- Yun-Ji Kim
- Department of Nanobiomedical Science, WCU Research Center, Dankook University, Cheonan 330-714, Korea
21
Conley AB, Jordan IK. Cell type-specific termination of transcription by transposable element sequences. Mob DNA 2012; 3:15. [PMID: 23020800 PMCID: PMC3517506 DOI: 10.1186/1759-8753-3-15] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 05/25/2012] [Accepted: 08/08/2012] [Indexed: 11/17/2022] Open
Abstract
Background Transposable elements (TEs) encode sequences necessary for their own transposition, including signals required for the termination of transcription. TE sequences within the introns of human genes show an antisense orientation bias, which has been proposed to reflect selection against TE sequences in the sense orientation owing to their ability to terminate the transcription of host gene transcripts. While there is evidence in support of this model for some elements, the extent to which TE sequences actually terminate transcription of human genes across the genome remains an open question. Results Using high-throughput sequencing data, we have characterized over 9,000 distinct TE-derived sequences that provide transcription termination sites for 5,747 human genes across eight different cell types. Rarefaction curve analysis suggests that there may be twice as many TE-derived termination sites (TE-TTS) genome-wide among all human cell types. The local chromatin environment for these TE-TTS is similar to that seen for 3′ UTR canonical TTS and distinct from the chromatin environment of other intragenic TE sequences. However, those TE-TTS located within the introns of human genes were found to be far more cell type-specific than the canonical TTS. TE-TTS were much more likely to be found in the sense orientation than other intragenic TE sequences of the same TE family and TE-TTS in the sense orientation terminate transcription more efficiently than those found in the antisense orientation. Alu sequences were found to provide a large number of relatively weak TTS, whereas LTR elements provided a smaller number of much stronger TTS. Conclusions TE sequences provide numerous termination sites to human genes, and TE-derived TTS are particularly cell type-specific. Thus, TE sequences provide a powerful mechanism for the diversification of transcriptional profiles between cell types and among evolutionary lineages, since most TE-TTS are evolutionarily young. The extent of transcription termination by TEs seen here, along with the preference for sense-oriented TE insertions to provide TTS, is consistent with the observed antisense orientation bias of human TEs.
Affiliation(s)
- Andrew B Conley
- School of Biology, Georgia Institute of Technology, 310 Ferst Drive, Atlanta, GA 30332, USA.
22
Testori A, Caizzi L, Cutrupi S, Friard O, De Bortoli M, Cora' D, Caselle M. The role of Transposable Elements in shaping the combinatorial interaction of Transcription Factors. BMC Genomics 2012; 13:400. [PMID: 22897927 PMCID: PMC3478180 DOI: 10.1186/1471-2164-13-400] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 04/12/2012] [Accepted: 06/28/2012] [Indexed: 12/22/2022] Open
Abstract
Background In the last few years several studies have shown that Transposable Elements (TEs) in the human genome are significantly associated with Transcription Factor Binding Sites (TFBSs) and that in several cases their expansion within the genome led to a substantial rewiring of the regulatory network. Another important feature of the regulatory network which has been thoroughly studied is the combinatorial organization of transcriptional regulation. In this paper we combine these two observations and suggest that TEs, besides rewiring the network, also played a central role in the evolution of particular patterns of combinatorial gene regulation. Results To address this issue we searched for TEs overlapping Estrogen Receptor α (ERα) binding peaks in two publicly available ChIP-seq datasets from the MCF7 cell line corresponding to different modalities of exposure to estrogen. We found a remarkable enrichment of a few specific classes of transposons. Among these a prominent role was played by MIR (Mammalian Interspersed Repeats) transposons. These TEs underwent a dramatic expansion at the beginning of the mammalian radiation and then stabilized. We conjecture that the special affinity of ERα for the MIR class of TEs could be at the origin of the important role assumed by ERα in mammals. We then searched for TFBSs within the TEs overlapping ChIP-seq peaks. We found a strong enrichment of a few precise combinations of TFBSs. In several cases the corresponding Transcription Factors (TFs) were known cofactors of ERα, thus supporting the idea of a co-regulatory role of TFBSs within the same TE. Moreover, most of these correlations turned out to be strictly associated with specific classes of TEs, thus suggesting the presence of a well-defined "transposon code" within the regulatory network. Conclusions In this work we tried to shed light on the role of Transposable Elements (TEs) in shaping the regulatory network of higher eukaryotes. To test this idea we focused on a particular transcription factor, the Estrogen Receptor α (ERα), and we found that ERα preferentially targets a well-defined set of TEs and that these TEs host combinations of transcriptional regulators involving several known co-regulators of ERα. Moreover, a significant number of these TEs turned out to be conserved between human and mouse and located in the vicinity of important estrogen-related genes, making them candidate regulators of those genes.
Affiliation(s)
- Alessandro Testori
- Center for Molecular Systems Biology, University of Turin, Turin, Candiolo I-10060, Italy.