1
Baar T, Dümcke S, Gressel S, Schwalb B, Dilthey A, Cramer P, Tresch A. RNA transcription and degradation of Alu retrotransposons depends on sequence features and evolutionary history. G3 GENES|GENOMES|GENETICS 2022; 12:6543614. [PMID: 35253846 PMCID: PMC9073682 DOI: 10.1093/g3journal/jkac054] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 01/26/2022] [Accepted: 02/25/2022] [Indexed: 11/16/2022]
Abstract
Alu elements are one of the most successful groups of RNA retrotransposons and make up 11% of the human genome with over 1 million individual loci. They are linked to genetic defects and increases in sequence diversity, and they influence transcriptional activity. Still, their RNA metabolism remains poorly understood. It is even unclear whether Alu elements are mostly transcribed by RNA Polymerase II or III. We have conducted a transcription shutoff experiment by α-amanitin and metabolic RNA labeling by 4-thiouridine combined with RNA fragmentation (TT-seq) and RNA-seq to shed further light on the origin and life cycle of Alu transcripts. We find that Alu RNAs are more stable than previously thought and seem to originate in part from RNA Polymerase II activity, as previous reports suggest. Their expression, however, seems to be independent of the transcriptional activity of adjacent genes. Furthermore, we have developed a novel statistical test for detecting the expression of quantitative trait loci in Alu elements that relies on the de Bruijn graph representation of all Alu sequences. It controls for both statistical significance and biological relevance using a tuned k-mer representation, discovering influential sequence features missed by regular motif search. In addition, we discover several point mutations using a generalized linear model, and motifs of interest, which also match transcription factor-binding motifs.
Affiliation(s)
- Till Baar
- Institute of Medical Statistics and Computational Biology, Faculty of Medicine, University of Cologne, Cologne 50937, Germany
- Saskia Gressel
- Department of Molecular Biology, Max Planck Institute for Biophysical Chemistry, Göttingen 37077, Germany
- Björn Schwalb
- Department of Molecular Biology, Max Planck Institute for Biophysical Chemistry, Göttingen 37077, Germany
- Alexander Dilthey
- Institute of Medical Microbiology and Hospital Hygiene, Medical Faculty, Heinrich-Heine-University Düsseldorf, Düsseldorf 40225, Germany
- Patrick Cramer
- Department of Molecular Biology, Max Planck Institute for Biophysical Chemistry, Göttingen 37077, Germany
- Achim Tresch
- Institute of Medical Statistics and Computational Biology, Faculty of Medicine, University of Cologne, Cologne 50937, Germany
- CECAD, University of Cologne, Cologne 50931, Germany
- Center for Data and Simulation Science, University of Cologne, Cologne 50923, Germany
2
Meng H, Feng J, Bai T, Jian Z, Chen Y, Wu G. Genome-wide analysis of short interspersed nuclear elements provides insight into gene and genome evolution in citrus. DNA Res 2020; 27:5818487. [PMID: 32271875 PMCID: PMC7315354 DOI: 10.1093/dnares/dsaa004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/20/2019] [Accepted: 04/03/2020] [Indexed: 12/03/2022] Open
Abstract
Short interspersed nuclear elements (SINEs) are non-autonomous retrotransposons that are highly abundant, but not well annotated, in plant genomes. In this study, we identified 41,573 copies of SINEs in seven citrus genomes, including 11,275 full-length copies. The citrus SINEs were distributed among 12 families, with an average full-length rate of 0.27, and were dispersed throughout the chromosomes, preferentially in AT-rich areas. Approximately 18.4% of citrus SINEs were found in close proximity (≤1 kb upstream) to genes, indicating a significant enrichment of SINEs in promoter regions. Citrus SINEs promote gene and genome evolution by offering exons as well as splice sites and start and stop codons, creating novel genes and forming tandem and dispersed repeat structures. Comparative analysis of unique homologous SINE-containing loci (HSCLs) revealed chromosome rearrangements in sweet orange, pummelo, and mandarin, suggesting that unique HSCLs might be valuable for understanding chromosomal abnormalities. This study of SINEs provides us with new perspectives and new avenues by which to understand the evolution of citrus genes and genomes.
Affiliation(s)
- Haijun Meng
- College of Horticulture, Henan Agricultural University, Zhengzhou 450002, China
- Jiancan Feng
- College of Horticulture, Henan Agricultural University, Zhengzhou 450002, China
- Tuanhui Bai
- College of Horticulture, Henan Agricultural University, Zhengzhou 450002, China
- Zaihai Jian
- College of Horticulture, Henan Agricultural University, Zhengzhou 450002, China
- Yanhui Chen
- College of Horticulture, Henan Agricultural University, Zhengzhou 450002, China
- Guoliang Wu
- College of Horticulture, Henan Agricultural University, Zhengzhou 450002, China
3
Schwichtenberg K, Wenke T, Zakrzewski F, Seibt KM, Minoche A, Dohm JC, Weisshaar B, Himmelbauer H, Schmidt T. Diversification, evolution and methylation of short interspersed nuclear element families in sugar beet and related Amaranthaceae species. THE PLANT JOURNAL: FOR CELL AND MOLECULAR BIOLOGY 2016; 85:229-44. [PMID: 26676716 DOI: 10.1111/tpj.13103] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 09/02/2015] [Revised: 11/23/2015] [Accepted: 11/26/2015] [Indexed: 05/18/2023]
Abstract
Short interspersed nuclear elements (SINEs) are non-autonomous non-long terminal repeat retrotransposons which are widely distributed in eukaryotic organisms. While SINEs have been intensively studied in animals, only limited information is available about plant SINEs. We analysed 22 SINE families from seven genomes of the Amaranthaceae family and identified 34 806 SINEs, including 19 549 full-length copies. With the focus on sugar beet (Beta vulgaris), we performed a comparative analysis of the diversity, genomic and chromosomal organization and the methylation of SINEs to provide a detailed insight into the evolution and age of Amaranthaceae SINEs. The lengths of consensus sequences of SINEs range from 113 nucleotides (nt) up to 224 nt. The SINEs show dispersed distribution on all chromosomes but were found with higher incidence in subterminal euchromatic chromosome regions. The methylation of SINEs is increased compared with their flanking regions, and the strongest effect is visible for cytosines in the CHH context, indicating an involvement of asymmetric methylation in the silencing of SINEs.
Affiliation(s)
- Torsten Wenke
- Institute of Botany, Technische Universität Dresden, 01069, Dresden, Germany
- Falk Zakrzewski
- Institute of Botany, Technische Universität Dresden, 01069, Dresden, Germany
- Kathrin M Seibt
- Institute of Botany, Technische Universität Dresden, 01069, Dresden, Germany
- André Minoche
- Max Planck Institute for Molecular Genetics, 14195, Berlin, Germany
- Garvan Institute of Medical Research, 2010, Sydney, NSW, Australia
- Juliane C Dohm
- Max Planck Institute for Molecular Genetics, 14195, Berlin, Germany
- Department of Biotechnology, University of Natural Resources and Life Sciences (BOKU), 1190, Vienna, Austria
- Bernd Weisshaar
- CeBiTec & Department of Biology, University of Bielefeld, 33615, Bielefeld, Germany
- Heinz Himmelbauer
- Garvan Institute of Medical Research, 2010, Sydney, NSW, Australia
- Department of Biotechnology, University of Natural Resources and Life Sciences (BOKU), 1190, Vienna, Austria
- Thomas Schmidt
- Institute of Botany, Technische Universität Dresden, 01069, Dresden, Germany
4
Walters-Conte KB, Johnson DLE, Johnson WE, O’Brien SJ, Pecon-Slattery J. The dynamic proliferation of CanSINEs mirrors the complex evolution of Feliforms. BMC Evol Biol 2014; 14:137. [PMID: 24947429 PMCID: PMC4084570 DOI: 10.1186/1471-2148-14-137] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 02/02/2014] [Accepted: 06/11/2014] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND Repetitive short interspersed elements (SINEs) are retrotransposons ubiquitous in mammalian genomes and are highly informative markers to identify species and phylogenetic associations. Of these, SINEs unique to the order Carnivora (CanSINEs) yield novel insights on genome evolution in domestic dogs and cats, but less is known about their role in related carnivores. In particular, genome-wide assessment of CanSINE evolution has yet to be completed across the Feliformia (cat-like) suborder of Carnivora. Within Feliformia, the cat family Felidae is composed of 37 species and numerous subspecies organized into eight monophyletic lineages that likely arose 10 million years ago. Using the Felidae family as a reference phylogeny, along with representative taxa from other families of Feliformia, the origin, proliferation and evolution of CanSINEs within the suborder were assessed.
RESULTS We identified 93 novel intergenic CanSINE loci in Feliformia. Sequence analyses separated Feliform CanSINEs into two subfamilies, each characterized by distinct RNA polymerase binding motifs and phylogenetic associations. Subfamily I CanSINEs arose early within Feliformia but are no longer under active proliferation. Subfamily II loci are more recent, exclusive to Felidae and show evidence for adaptation to extant RNA polymerase activity. Further, presence/absence distributions of CanSINE loci are largely congruent with taxonomic expectations within Feliformia and the less resolved nodes in the Felidae reference phylogeny present equally ambiguous CanSINE data. SINEs are thought to be nearly impervious to excision from the genome. However, we observed a nearly complete excision of a CanSINE locus in puma (Puma concolor). In addition, we found that CanSINE proliferation in Felidae frequently targeted existing CanSINE loci for insertion sites, resulting in tandem arrays.
CONCLUSIONS We demonstrate the existence of at least two SINE families within the Feliformia suborder, one of which is actively involved in insertional mutagenesis. We find SINEs are powerful markers of speciation and conclude that the few inconsistencies with expected patterns of speciation likely represent incomplete lineage sorting, species hybridization and SINE-mediated genome rearrangement.
Affiliation(s)
- Kathryn B Walters-Conte
- Department of Biology, American University, 101 Hurst Hall 4440 Massachusetts Ave, Washington, DC 20016, USA
- Diana LE Johnson
- Department of Biological Sciences, The George Washington University, 2036 G St, Washington, DC 20009, USA
- Warren E Johnson
- Smithsonian Conservation Biology Institute, National Zoological Park, Front Royal, VA 22630, USA
- Stephen J O’Brien
- Dobzhansky Center for Genome Bioinformatics, St. Petersburg State University, 41 A Sredniy Avenue, St. Petersburg 199034, Russia
- Jill Pecon-Slattery
- Smithsonian Conservation Biology Institute, National Zoological Park, Front Royal, VA 22630, USA
5
Transposable elements are a significant contributor to tandem repeats in the human genome. Comp Funct Genomics 2012; 2012:947089. [PMID: 22792041 PMCID: PMC3389668 DOI: 10.1155/2012/947089] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.9] [Received: 02/25/2012] [Revised: 04/10/2012] [Accepted: 04/11/2012] [Indexed: 11/17/2022] Open
Abstract
Sequence repeats are an important phenomenon in the human genome, playing important roles in genomic alteration, often with phenotypic consequences. The two major types of repeat elements in the human genome are tandem repeats (TRs), including microsatellites, minisatellites, and satellites, and transposable elements (TEs). So far, very little is known about the relationship between these two types of repeats. In this study, we identified TRs that are derived from TEs based on either sequence similarity or overlapping genomic positions. We then analyzed the distribution of these TRs among TE families/subfamilies. Our study shows that at least 7,276 TRs, or 23% of all minisatellites/satellites, are derived from TEs, contributing ∼0.32% of the human genome. TRs seem more likely to be generated from younger/more active TEs, and once initiated they are expanded with time via local duplication of the repeat units. The currently postulated mechanisms for origin of TRs can explain only 6% of all TE-derived TRs, indicating the presence of one or more yet to be identified mechanisms for the initiation of such repeats. Our result suggests that TEs are contributing to genome expansion and alteration not only by transposition but also by generating tandem repeats.
6
Klimopoulos A, Sellis D, Almirantis Y. Widespread occurrence of power-law distributions in inter-repeat distances shaped by genome dynamics. Gene 2012; 499:88-98. [PMID: 22370293 DOI: 10.1016/j.gene.2012.02.005] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 11/02/2011] [Revised: 02/05/2012] [Accepted: 02/06/2012] [Indexed: 11/25/2022]
Abstract
Repetitive DNA sequences derived from transposable elements (TE) are distributed in a non-random way, co-clustering with other classes of repeat elements, genes and other genomic components. In a previous work we reported power-law-like size distributions (linearity in log-log scale) in the spatial arrangement of Alu and LINE1 elements in the human genome. Here we investigate the large-scale features of the spatial arrangement of all principal classes of TEs in 14 genomes from phylogenetically distant organisms by studying the size distribution of inter-repeat distances. Power-law-like size distributions are found to be widespread, extending up to several orders of magnitude. In order to understand the emergence of this distributional pattern, we introduce an evolutionary scenario, which includes (i) Insertions of DNA segments (e.g., more recent repeats) into the considered sequence and (ii) Eliminations of members of the studied TE family. In the proposed model we also incorporate the potential for transposition events (characteristic of the DNA transposons' life-cycle) and segmental duplications. Simulations reproduce the main features of the observed size distributions. Furthermore, we investigate the effects of various genomic features on the presence and extent of power-law size distributions including TE class and age, mode of parental TE transmission, GC content, deletion and recombination rates in the studied genomic region, etc. Our observations corroborate the hypothesis that insertions of genomic material and eliminations of repeats are at the basis of power-laws in inter-repeat distances. The existence of these power-laws could facilitate the formation of the recently proposed "fractal globule" for the confined chromatin organization.
Affiliation(s)
- Alexandros Klimopoulos
- National Center for Scientific Research "Demokritos," Institute of Biology, 153 10 Athens, Greece.
7
Walters-Conte KB, Johnson DLE, Allard MW, Pecon-Slattery J. Carnivore-specific SINEs (Can-SINEs): distribution, evolution, and genomic impact. J Hered 2011; 102 Suppl 1:S2-10. [PMID: 21846743 DOI: 10.1093/jhered/esr051] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 12/15/2022] Open
Abstract
Short interspersed nuclear elements (SINEs) are a type of class 1 transposable element (retrotransposon) with features that allow investigators to resolve evolutionary relationships between populations and species while providing insight into genome composition and function. Characterization of a Carnivora-specific SINE family, Can-SINEs, has aided comparative genomic studies by providing rare genomic changes and neutral sequence variants often needed to resolve difficult evolutionary questions. In addition, Can-SINEs constitute a significant source of functional diversity within Carnivora. Publication of the whole-genome sequence of domestic dog, domestic cat, and giant panda serves as a valuable resource in comparative genomic inferences gleaned from Can-SINEs. In anticipation of forthcoming studies bolstered by new genomic data, this review describes the discovery and characterization of Can-SINE motifs as well as their composition, distribution, and effect on genome function. As the contribution of noncoding sequences to genomic diversity becomes more apparent, SINEs and other transposable elements will play an increasingly large role in mammalian comparative genomics.
8
Belancio VP, Roy-Engel AM, Deininger PL. All y'all need to know 'bout retroelements in cancer. Semin Cancer Biol 2010; 20:200-10. [PMID: 20600922 DOI: 10.1016/j.semcancer.2010.06.001] [Citation(s) in RCA: 121] [Impact Index Per Article: 8.6] [Received: 01/22/2010] [Revised: 06/14/2010] [Accepted: 06/17/2010] [Indexed: 01/08/2023]
Abstract
Genetic instability is one of the principal hallmarks and causative factors in cancer. Human transposable elements (TE) have been reported to cause human diseases, including several types of cancer through insertional mutagenesis of genes critical for preventing or driving malignant transformation. In addition to retrotransposition-associated mutagenesis, TEs have been found to contribute even more genomic rearrangements through non-allelic homologous recombination. TEs also have the potential to generate a wide range of mutations derivation of which is difficult to directly trace to mobile elements, including double strand breaks that may trigger mutagenic genomic rearrangements. Genome-wide hypomethylation of TE promoters and significantly elevated TE expression in almost all human cancers often accompanied by the loss of critical DNA sensing and repair pathways suggests that the negative impact of mobile elements on genome stability should increase as human tumors evolve. The biological consequences of elevated retroelement expression, such as the rate of their amplification, in human cancers remain obscure, particularly, how this increase translates into disease-relevant mutations. This review is focused on the cellular mechanisms that control human TE-associated mutagenesis in cancer and summarizes the current understanding of TE contribution to genetic instability in human malignancies.
Affiliation(s)
- Victoria P Belancio
- Tulane University, Department of Structural and Cellular Biology, School of Medicine, Tulane Cancer Center and Tulane Center for Aging, New Orleans, LA 70112, USA
9
Akasaki T, Nikaido M, Nishihara H, Tsuchiya K, Segawa S, Okada N. Characterization of a novel SINE superfamily from invertebrates: "Ceph-SINEs" from the genomes of squids and cuttlefish. Gene 2009; 454:8-19. [PMID: 19914361 DOI: 10.1016/j.gene.2009.11.005] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.8] [Received: 09/07/2009] [Revised: 10/30/2009] [Accepted: 11/06/2009] [Indexed: 11/27/2022]
Abstract
Five tRNA-derived short interspersed repetitive elements (SINEs), named SepiaSINE, Sepioth-SINE1, Sepioth-SINE2A, Sepioth-SINE2B and OegopSINE, were isolated from the genomes of three decabrachian species [Sepia officinalis (order Sepiida), Sepioteuthis lessoniana (suborder Myopsida), and Mastigoteuthis cordiformis (suborder Oegopsida)], by random sequencing and genome screening. In addition, two tRNA-derived SINEs, named IdioSINE1 and IdioSINE2, were further detected from EST (expressed sequence tag) data of Idiosepius paradoxus (order Idiosepiida), using a GenBank FASTA search with a conserved sequence of the SepiaSINE as the query. All the isolated SINEs had a common and unique highly conserved 149-bp sequence in their central structures (Sepioth-SINE2B and IdioSINEs, however, had a continuous 73-bp deletion in the conserved region), and are therefore grouped as the fourth SINE superfamily "Ceph-SINEs", following the CORE-SINE, V-SINE, and DeuSINE superfamilies. Our analysis suggested that the central conserved region called the "Ceph-domain" might have originated before the diversification of cephalopods (505 myr ago). A sequence alignment of Sepioth-SINE1, Sepioth-SINE2A, and Sepioth-SINE2B demonstrated that Sepioth-SINE2A has a chimeric structure shared with two other SINEs. The above relationship suggests possible template switching in the central conserved domain during reverse transcription for the birth of Sepioth-SINE2A, providing the possibility that the presence of the conserved domain contributed to yield a variety of SINEs during evolution. Furthermore, the distributions of the isolated SINEs showed that order Sepiida, suborders Oegopsida and Myopsida, and order Idiosepiida have their own independent SINE(s), and suggest that order Sepiida can be largely separated into two groups, with clarification of the phylogenetic relatedness between subfamily Sepioteuthinae and the other loliginid squids.
Affiliation(s)
- Tetsuya Akasaki
- Department of Biological Science, Graduate School of Bioscience and Biotechnology, Tokyo Institute of Technology, 4259, Nagatsuta-cho, Midori-ku, Yokohama 226-8501, Japan
10
Zhang K, Fan W, Deininger P, Edwards A, Xu Z, Zhu D. Breaking the computational barrier: a divide-conquer and aggregate based approach for Alu insertion site characterisation. Int J Comput Biol Drug Des 2009; 2:302-22. [PMID: 20090173 DOI: 10.1504/ijcbdd.2009.030763] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Indexed: 01/20/2023]
Abstract
Insertion site characterisation of Alu elements is an important problem in primate-specific bioinformatics research. Key characteristics of this challenging problem include: the data are not in pre-defined feature vectors for predictive model construction; it is unclear whether, without any prior knowledge, the general patterns that could exist can be discovered and yield biological insights; and compact yet discriminative patterns must be obtained from a search space of 4^200. This paper provides an integrated algorithmic framework for fulfilling the above mining tasks. Compared to the benchmark biological study, our results provide a further refined analysis of the patterns involved in Alu insertion. In particular, we acquire a 200nt predictive profile around the primary insertion site which not only contains the widely accepted consensus, but also suggests a longer pattern (T(7)AA[G'A]AATAA). This pattern provides more insight into the favourable sequence variations allowed for preferred binding and cleavage by the L1 ORF2 endonuclease. The proposed method is general enough that it can also be applied to other sequence detection problems, such as microRNA target prediction.
Affiliation(s)
- Kun Zhang
- Department of Computer Science, Xavier University of Louisiana, New Orleans, Louisiana 70125, USA.
11
Gilbert C, Pace JK, Waters PD. Target site analysis of RTE1_LA and its AfroSINE partner in the elephant genome. Gene 2008; 425:1-8. [PMID: 18796327 DOI: 10.1016/j.gene.2008.08.013] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 05/14/2008] [Revised: 08/18/2008] [Accepted: 08/18/2008] [Indexed: 10/21/2022]
Abstract
SINEs retrotranspose using their partner LINE's enzymatic machinery. It has recently been proposed that AfroSINEs ending with GGTTT 3' tandem repeats were mobilized by RTE elements ending with CAA 3' tandem repeats in the Afrotherian genome. Using sequences from the elephant genome, we show that AfroSINEs derive from RTE ending with GGTTT-like 3' tandem repeats, a subgroup of RTE1_LA that only reached low copy number, and confirm that they were most likely mobilized by RTE ending with CAA(n) tandem repeats (RTE1_LA-CAA(n)). This partnership is supported by sequence similarity between two regions of the elements, overlap in the timing of their activity, common features of their target site consensus that are not shared by other members of the RTE family, and their high copy number. Detailed analyses of pre-insertion loci reveal that like many other apurinic/apyrimidinic endonuclease encoding elements, RTE1_LA-CAA(n) shows loose target site specificity. In addition, the RTE1_LA-CAA(n) target site consensus shares several structural and primary sequence features with that of LINE1, suggesting that these two elements share close functional similarity in the target primed reverse transcription (TPRT) reaction. Interestingly, although globally similar, the target site consensus of AfroSINE(Anc) and RTE1_LA-CAA(n) differ in several aspects. These differences, not observed among all SINE/LINE pairs so far examined, are most likely due to the fact that AfroSINEs and RTE1_LA-CAA(n) are terminated by a different tandem repeat motif. We propose that these differences reflect constraints imposed by base pairing interactions between the mRNA 3' terminal tandem repeats and the target DNA at the onset of TPRT. So in addition to the endonuclease nicking preference, the mRNA of these elements appears to play an important role in integration site choice through a passive, post-nicking, selective process.
Affiliation(s)
- Clément Gilbert
- Evolutionary Genomics Group, Department of Botany and Zoology, University of Stellenbosch, Stellenbosch, South Africa.
12
Hizer SE, Tamulis WG, Robertson LM, Garcia DK. Evidence of multiple retrotransposons in two litopenaeid species. Anim Genet 2008; 39:363-73. [PMID: 18557973 DOI: 10.1111/j.1365-2052.2008.01739.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/25/2022]
Abstract
Retrotransposons encompass a specific class of mobile genetic elements that are widespread across eukaryotic genomes. The impact of the varied types of retrotransposons on these genomes is just beginning to be deciphered. In a step towards understanding their role in litopenaeid shrimp, we have herein identified nine non-LTR retrotransposons, among which several appear to exist outside the standard defined clades. Two Litopenaeus stylirostris elements were discovered through degenerate PCR amplification using previously defined non-LTR degenerate primers, and through primers designed from a RAPD-derived sequence. A third genomic L. stylirostris element was identified using specific priming from an amplification protocol. These three PCR-derived sequences showed conserved domains of the non-LTR reverse transcriptase gene. In silico searching of genome databases and subsequent contig construction yielded six non-LTR retrotransposons (both genomic and expressed) in the Litopenaeus vannamei genome that also exhibited the highly conserved domains found in our PCR-derived sequences. Phylogenetic placement among representatives from all non-LTR clades showed a possibly novel monophyletic group that included five of our nine sequences. This group, which included elements from both L. stylirostris and L. vannamei, appeared most closely related to the highly active RTE clade. Our remaining four sequences placed in the CR1 and I clades of retrotransposons, with one showing strong similarity to ancient Penelope elements. This research describes three newly discovered retrotransposons in the L. stylirostris genome. Phylogenetic analysis clusters these in a monophyletic grouping with retrotransposons previously described from two closely related species, L. vannamei and Penaeus monodon.
Affiliation(s)
- S E Hizer
- Department of Biological Sciences, California State University, San Marcos, CA 920296, USA
13
Belancio VP, Hedges DJ, Deininger P. Mammalian non-LTR retrotransposons: for better or worse, in sickness and in health. Genome Res 2008; 18:343-58. [PMID: 18256243 DOI: 10.1101/gr.5558208] [Citation(s) in RCA: 224] [Impact Index Per Article: 14.0] [Indexed: 11/24/2022]
Abstract
Transposable elements (TEs) have shared an exceptionally long coexistence with their host organisms and have come to occupy a significant fraction of eukaryotic genomes. The bulk of the expansion occurring within mammalian genomes has arisen from the activity of type I retrotransposons, which amplify in a "copy-and-paste" fashion through an RNA intermediate. For better or worse, the sequences of these retrotransposons are now wedded to the genomes of their mammalian hosts. Although there are several reported instances of the positive contribution of mobile elements to their host genomes, these discoveries have occurred alongside growing evidence of the role of TEs in human disease and genetic instability. Here we examine, with a particular emphasis on human retrotransposon activity, several newly discovered aspects of mammalian retrotransposon biology. We consider their potential impact on host biology as well as their ultimate implications for the nature of the TE-host relationship.
Affiliation(s)
- Victoria P Belancio
- Tulane Cancer Center and Department of Epidemiology, Tulane University Health Sciences Center, New Orleans, Louisiana 70112, USA
14
Jurka J, Kapitonov VV, Kohany O, Jurka MV. Repetitive sequences in complex genomes: structure and evolution. Annu Rev Genomics Hum Genet 2007; 8:241-59. [PMID: 17506661 DOI: 10.1146/annurev.genom.8.080706.092416] [Citation(s) in RCA: 238] [Impact Index Per Article: 14.0] [Indexed: 11/09/2022]
Abstract
Eukaryotic genomes contain vast amounts of repetitive DNA derived from transposable elements (TEs). Large-scale sequencing of these genomes has produced an unprecedented wealth of information about the origin, diversity, and genomic impact of what was once thought to be "junk DNA." This has also led to the identification of two new classes of DNA transposons, Helitrons and Polintons, as well as several new superfamilies and thousands of new families. TEs are evolutionary precursors of many genes, including RAG1, which plays a role in the vertebrate immune system. They are also the driving force in the evolution of epigenetic regulation and have a long-term impact on genomic stability and evolution. Remnants of TEs appear to be overrepresented in transcription regulatory modules and other regions conserved among distantly related species, which may have implications for our understanding of their impact on speciation.
Affiliation(s)
- Jerzy Jurka
- Genetic Information Research Institute, Mountain View, California 94043, USA.
|
15
|
Abstract
Mobile elements have been recognized as powerful tools for phylogenetic and population-level analyses. However, issues regarding potential sources of homoplasy and other misleading events have been raised. We have collected available data for all phylogenetic and population-level studies of primates utilizing Alu insertion data and examined them for potentially homoplasious and other misleading events. Very low levels of each potential confounding factor in a phylogenetic or population analysis (i.e., lineage sorting, parallel insertions, and precise excision) were found. Although taxa known to be subject to high levels of these types of events may indeed be subject to problems when using SINE analysis, we propose that most taxa will respond as the order Primates has, by the resolution of several long-standing problems observed using sequence-based methods.
Affiliation(s)
- David A Ray
- Department of Biology, West Virginia University, PO Box 6057, Morgantown, West Virginia 26506, USA
|
16
|
Hedges DJ, Deininger PL. Inviting instability: Transposable elements, double-strand breaks, and the maintenance of genome integrity. Mutat Res 2006; 616:46-59. [PMID: 17157332 PMCID: PMC1850990 DOI: 10.1016/j.mrfmmm.2006.11.021] [Citation(s) in RCA: 214] [Impact Index Per Article: 11.9] [Indexed: 10/23/2022]
Abstract
The ubiquity of mobile elements in mammalian genomes poses considerable challenges for the maintenance of genome integrity. The predisposition of mobile elements towards participation in genomic rearrangements is largely a consequence of their interspersed homologous nature. As tracts of nonallelic sequence homology, they have the potential to interact in a disruptive manner during both meiotic recombination and DNA repair processes, resulting in genomic alterations ranging from deletions and duplications to large-scale chromosomal rearrangements. Although the deleterious effects of transposable element (TE) insertion events have been extensively documented, it is arguably through post-insertion genomic instability that they pose the greatest hazard to their host genomes. Despite the periodic generation of important evolutionary innovations, genomic alterations involving TE sequences are far more frequently neutral or deleterious in nature. The potentially negative consequences of this instability are perhaps best illustrated by the >25 human genetic diseases that are attributable to TE-mediated rearrangements. Some of these rearrangements, such as those involving the MLL locus in leukemia and the LDL receptor in familial hypercholesterolemia, represent recurrent mutations that have independently arisen multiple times in human populations. While TE-instability has been a potent force in shaping eukaryotic genomes and a significant source of genetic disease, much concerning the mechanisms governing the frequency and variety of these events remains to be clarified. Here we survey the current state of knowledge regarding the mechanisms underlying mobile element-based genetic instability in mammals. Compared to simpler eukaryotic systems, mammalian cells appear to have several modifications to their DNA-repair ensemble that allow them to better cope with the large amount of interspersed homology that has been generated by TEs. 
In addition to the disruptive potential of nonallelic sequence homology, we also consider recent evidence suggesting that the endonuclease products of TEs may also play a key role in instigating mammalian genomic instability.
Affiliation(s)
- D J Hedges
- Tulane Cancer Center, SL66 and Department of Epidemiology, Tulane University Health Sciences Center, 1430 Tulane Avenue, New Orleans, LA 70112, USA
|
17
|
Gasior SL, Preston G, Hedges DJ, Gilbert N, Moran JV, Deininger PL. Characterization of pre-insertion loci of de novo L1 insertions. Gene 2006; 390:190-8. [PMID: 17067767 PMCID: PMC1850991 DOI: 10.1016/j.gene.2006.08.024] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Received: 07/11/2006] [Revised: 08/21/2006] [Accepted: 08/22/2006] [Indexed: 10/24/2022]
Abstract
The human Long Interspersed Element-1 (LINE-1) and the Short Interspersed Element (SINE) Alu comprise 28% of the human genome. They share the same L1-encoded endonuclease for insertion, which recognizes an A+T-rich sequence. Under a simple model of insertion distribution, this nucleotide preference would lead to the prediction that the populations of both elements would be biased towards A+T-rich regions. Genomic L1 elements do show an A+T-rich bias. In contrast, Alu is biased towards G+C-rich regions when compared to the genome average. Several analyses have demonstrated that relatively recent insertions of both elements show less G+C content bias relative to older elements. We have analyzed the repetitive element and G+C composition of more than 100 pre-insertion loci derived from de novo L1 insertions in cultured human cancer cells, which should represent an evolutionarily unbiased set of insertions. An A+T-rich bias is observed in the 50 bp flanking the endonuclease target site, consistent with the known target site for the L1 endonuclease. The L1, Alu, and G+C content of 20 kb of the de novo pre-insertion loci shows a different set of biases than that observed for fixed L1s in the human genome. In contrast to the insertion sites of genomic L1s, the de novo L1 pre-insertion loci are relatively L1-poor, Alu-rich and G+C neutral. Finally, a statistically significant cluster of de novo L1 insertions was localized in the vicinity of the c-myc gene. These results suggest that the initial insertion preference of L1, while A+T-rich in the initial vicinity of the break site, can be influenced by the broader content of the flanking genomic region and have implications for understanding the dynamics of L1 and Alu distributions in the human genome.
Affiliation(s)
- Stephen L. Gasior
- Tulane Cancer Center and Dept. of Epidemiology, Tulane University Health Sciences Center SL-66, 1430 Tulane Ave., New Orleans, LA 70112, Phone: (504) 988-6385, Fax: (504) 988-5516
- Graeme Preston
- Tulane Cancer Center and Dept. of Epidemiology, Tulane University Health Sciences Center SL-66, 1430 Tulane Ave., New Orleans, LA 70112, Phone: (504) 988-6385, Fax: (504) 988-5516
- Dale J. Hedges
- Tulane Cancer Center and Dept. of Epidemiology, Tulane University Health Sciences Center SL-66, 1430 Tulane Ave., New Orleans, LA 70112, Phone: (504) 988-6385, Fax: (504) 988-5516
- Nicolas Gilbert
- Institut de Génétique Humaine, CNRS, UPR 1142, 141 rue de la Cardonille, 34396 Montpellier cedex 5, France
- John V. Moran
- Departments of Human Genetics and Internal Medicine, 1241 E. Catherine St., University of Michigan Medical School, Ann Arbor, Michigan 48109-0618
- Prescott L. Deininger
- Tulane Cancer Center and Dept. of Epidemiology, Tulane University Health Sciences Center SL-66, 1430 Tulane Ave., New Orleans, LA 70112, Phone: (504) 988-6385, Fax: (504) 988-5516
- *Address for Correspondence: Tulane Cancer Center, SL66, Tulane University Health Sciences Center, 1430 Tulane Ave., New Orleans, LA 70112, 504-988-6385
|
18
|
Cordaux R, Lee J, Dinoso L, Batzer MA. Recently integrated Alu retrotransposons are essentially neutral residents of the human genome. Gene 2006; 373:138-44. [PMID: 16527433 DOI: 10.1016/j.gene.2006.01.020] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.2] [Received: 12/16/2005] [Revised: 01/18/2006] [Accepted: 01/21/2006] [Indexed: 10/24/2022]
Abstract
Alu elements represent the largest family of human mobile elements in copy number. A controversial issue with implications for both Alu biology and human genome evolution is whether selective pressures are affecting Alu elements on a large scale. To address this issue, we analyzed the genomic distribution of the three youngest known human Alu subfamilies (Ya5a2, Ya8 and Yb9) in conjunction with their insertion polymorphism status in the human population, since selection can only act on polymorphic elements. Our results indicate that: (i) polymorphic and fixed recently integrated Alu elements are found in genomic regions whose GC contents are statistically indistinguishable, and (ii) recently integrated Alu elements are inserted randomly, regardless of the GC content of the surrounding genomic DNA. These results provide strong evidence that recently integrated "young" Alu elements are not subject to positive or negative selection on a large scale. Therefore, young Alu elements can be regarded as essentially neutral residents of the human genome. These results also imply that selective processes specifically targeting Alu elements can be ruled out as explanations for the accumulation of Alu elements in GC-rich regions of the human genome.
Affiliation(s)
- Richard Cordaux
- Department of Biological Sciences, Biological Computation and Visualization Center, Louisiana State University, 202 Life Sciences Building, Baton Rouge, LA 70803, USA
|
19
|
Gasior SL, Wakeman TP, Xu B, Deininger PL. The human LINE-1 retrotransposon creates DNA double-strand breaks. J Mol Biol 2006; 357:1383-93. [PMID: 16490214 PMCID: PMC4136747 DOI: 10.1016/j.jmb.2006.01.089] [Citation(s) in RCA: 350] [Impact Index Per Article: 19.4] [Received: 10/11/2005] [Revised: 01/25/2006] [Accepted: 01/26/2006] [Indexed: 11/28/2022]
Abstract
Long interspersed element-1 (L1) is an autonomous retroelement that is active in the human genome. The proposed mechanism of insertion for L1 suggests that cleavage of both strands of genomic DNA is required. We demonstrate that L1 expression leads to a high level of double-strand break (DSB) formation in DNA using immunolocalization of gamma-H2AX foci and the COMET assay. Similar to its role in mediating DSB repair in response to radiation, ATM is required for L1-induced gamma-H2AX foci and for L1 retrotransposition. This is the first characterization of a DNA repair response from expression of a non-long terminal repeat (non-LTR) retrotransposon in mammalian cells as well as the first demonstration that a host DNA repair gene is required for successful integration. Notably, the number of L1-induced DSBs is greater than the predicted numbers of successful insertions, suggesting a significant degree of inefficiency during the integration process. This result suggests that the endonuclease activity of endogenously expressed L1 elements could contribute to DSB formation in germ-line and somatic tissues.
Affiliation(s)
- Stephen L. Gasior
- Tulane Cancer Center and Department of Epidemiology, Tulane University Health Sciences Center, 1430 Tulane Ave., New Orleans, LA 70112, USA
- Timothy P. Wakeman
- Stanley S. Scott Cancer Center and Department of Genetics, Louisiana State University Health Sciences Center, 533 Bolivar Street, Room 406, New Orleans, LA 70112, USA
- Bo Xu
- Stanley S. Scott Cancer Center and Department of Genetics, Louisiana State University Health Sciences Center, 533 Bolivar Street, Room 406, New Orleans, LA 70112, USA
- Prescott L. Deininger
- Tulane Cancer Center and Department of Epidemiology, Tulane University Health Sciences Center, 1430 Tulane Ave., New Orleans, LA 70112, USA
- Corresponding author
|
20
|
Jurka J, Gentles AJ. Origin and diversification of minisatellites derived from human Alu sequences. Gene 2005; 365:21-6. [PMID: 16343813 DOI: 10.1016/j.gene.2005.09.029] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 05/09/2005] [Revised: 08/02/2005] [Accepted: 09/07/2005] [Indexed: 11/25/2022]
Abstract
We analyze minisatellites derived from Alu fragments corresponding approximately to the first 44 bases of human Alu consensus sequences from different subfamilies. The origin of Alu-derived minisatellites appears to have been mediated by short flanking repeats, as first proposed by Haber and Louis [Haber, J.E., Louis, E.J., 1998. Minisatellite origins in yeast and humans. Genomics 48, 132-135.]. We also present evidence for base substitutions and deletions introduced to minisatellites by gene conversion with partially similar but unrelated flanking regions. Segments flanked by short direct repeats are relatively common in different regions of Alu and other repetitive sequences. Our analysis shows that they can be effectively used in comparative studies of the overall sequence context which may contribute to instability of DNA segments flanked by short direct repeats.
Affiliation(s)
- Jerzy Jurka
- Genetic Information Research Institute, 1925 Landings Drive, Mountain View, CA 94043, USA.
|