1
Chen D, Cremona MA, Qi Z, Mitra RD, Chiaromonte F, Makova KD. Human L1 Transposition Dynamics Unraveled with Functional Data Analysis. Mol Biol Evol 2021; 37:3576-3600. [PMID: 32722770 DOI: 10.1093/molbev/msaa194] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022]
Abstract
Long INterspersed Elements-1 (L1s) constitute >17% of the human genome and still actively transpose in it. Characterizing L1 transposition across the genome is critical for understanding genome evolution and somatic mutations. However, to date, L1 insertion and fixation patterns have not been studied comprehensively. To fill this gap, we investigated three genome-wide data sets of L1s that integrated at different evolutionary times: 17,037 de novo L1s (from an L1 insertion cell-line experiment conducted in-house), and 1,212 polymorphic and 1,205 human-specific L1s (from public databases). We characterized 49 genomic features (proxying chromatin accessibility, transcriptional activity, replication, recombination, etc.) in the ±50 kb flanks of these elements. These features were contrasted between the three L1 data sets and L1-free regions using state-of-the-art Functional Data Analysis statistical methods, which treat high-resolution data as mathematical functions. Our results indicate that de novo, polymorphic, and human-specific L1s are surrounded by different genomic features acting at specific locations and scales. This led to an integrative model of L1 transposition, according to which L1s preferentially integrate into open-chromatin regions enriched in non-B DNA motifs, whereas they are fixed in regions largely free of purifying selection (depleted of genes and noncoding most conserved elements). Intriguingly, our results suggest that L1 insertions modify local genomic landscape by extending CpG methylation and increasing mononucleotide microsatellite density. Altogether, our findings substantially facilitate understanding of L1 integration and fixation preferences, pave the way for uncovering their role in aging and cancer, and inform their use as mutagenesis tools in genetic studies.
Affiliation(s)
- Di Chen
- Intercollege Graduate Degree Program in Genetics, The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA
- Marzia A Cremona
- Department of Statistics, The Pennsylvania State University, University Park, PA; Department of Operations and Decision Systems, Université Laval, Québec, Canada
- Zongtai Qi
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO
- Robi D Mitra
- Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO
- Francesca Chiaromonte
- Department of Statistics, The Pennsylvania State University, University Park, PA; EMbeDS, Sant'Anna School of Advanced Studies, Pisa, Italy; The Huck Institutes of the Life Sciences, Center for Medical Genomics, The Pennsylvania State University, University Park, PA
- Kateryna D Makova
- The Huck Institutes of the Life Sciences, Center for Medical Genomics, The Pennsylvania State University, University Park, PA; Department of Biology, The Pennsylvania State University, University Park, PA
2
Klimopoulos A, Sellis D, Almirantis Y. Widespread occurrence of power-law distributions in inter-repeat distances shaped by genome dynamics. Gene 2012; 499:88-98. [PMID: 22370293 DOI: 10.1016/j.gene.2012.02.005] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 11/02/2011] [Revised: 02/05/2012] [Accepted: 02/06/2012] [Indexed: 11/25/2022]
Abstract
Repetitive DNA sequences derived from transposable elements (TE) are distributed in a non-random way, co-clustering with other classes of repeat elements, genes and other genomic components. In a previous work we reported power-law-like size distributions (linearity in log-log scale) in the spatial arrangement of Alu and LINE1 elements in the human genome. Here we investigate the large-scale features of the spatial arrangement of all principal classes of TEs in 14 genomes from phylogenetically distant organisms by studying the size distribution of inter-repeat distances. Power-law-like size distributions are found to be widespread, extending up to several orders of magnitude. In order to understand the emergence of this distributional pattern, we introduce an evolutionary scenario, which includes (i) Insertions of DNA segments (e.g., more recent repeats) into the considered sequence and (ii) Eliminations of members of the studied TE family. In the proposed model we also incorporate the potential for transposition events (characteristic of the DNA transposons' life-cycle) and segmental duplications. Simulations reproduce the main features of the observed size distributions. Furthermore, we investigate the effects of various genomic features on the presence and extent of power-law size distributions including TE class and age, mode of parental TE transmission, GC content, deletion and recombination rates in the studied genomic region, etc. Our observations corroborate the hypothesis that insertions of genomic material and eliminations of repeats are at the basis of power-laws in inter-repeat distances. The existence of these power-laws could facilitate the formation of the recently proposed "fractal globule" for the confined chromatin organization.
Affiliation(s)
- Alexandros Klimopoulos
- National Center for Scientific Research "Demokritos," Institute of Biology, 153 10 Athens, Greece.
3
Kedar PS, Stefanick DF, Horton JK, Wilson SH. Increased PARP-1 association with DNA in alkylation damaged, PARP-inhibited mouse fibroblasts. Mol Cancer Res 2012; 10:360-8. [PMID: 22246237 DOI: 10.1158/1541-7786.mcr-11-0477] [Citation(s) in RCA: 50] [Impact Index Per Article: 4.2] [Indexed: 11/16/2022]
Abstract
Treatment of base excision repair-proficient mouse fibroblasts with the DNA alkylating agent methyl methanesulfonate (MMS) and a small molecule inhibitor of PARP-1 results in a striking cell killing phenotype, as previously reported. Earlier studies showed that the mechanism of cell death is apoptosis and requires DNA replication, expression of PARP-1, and an intact S-phase checkpoint cell signaling system. It is proposed that activity-inhibited PARP-1 becomes immobilized at DNA repair intermediates, and that this blocks DNA repair and interferes with DNA replication, eventually promoting an S-phase checkpoint and G2-M block. Here we report studies designed to evaluate the prediction that inhibited PARP-1 remains DNA associated in cells undergoing repair of alkylation-induced damage. Using chromatin immunoprecipitation with anti-PARP-1 antibody and qPCR for DNA quantification, a higher level of DNA was found associated with PARP-1 in cells treated with MMS plus PARP inhibitor than in cells without inhibitor treatment. These results have implications for explaining the extreme hypersensitivity phenotype after combination treatment with MMS and a PARP inhibitor.
Affiliation(s)
- Padmini S Kedar
- Laboratory of Structural Biology, National Institute of Environmental Health Sciences, NIH, 111 T.W. Alexander Drive, Research Triangle Park, NC 27709, USA
4
Hirakawa M, Nishihara H, Kanehisa M, Okada N. Characterization and evolutionary landscape of AmnSINE1 in Amniota genomes. Gene 2008; 441:100-10. [PMID: 19166919 DOI: 10.1016/j.gene.2008.12.009] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 08/21/2008] [Revised: 11/29/2008] [Accepted: 12/04/2008] [Indexed: 11/18/2022]
Abstract
Discovery of a large number of conserved non-coding elements (CNEs) in vertebrate genomes provides a cornerstone to elucidate molecular mechanisms of macroevolution. Extensive comparative genomics has proven that transposons such as short interspersed elements (SINEs) were an important source of CNEs. We recently characterized AmnSINE1, a SINE family in Amniota genomes, some of which are present in CNEs, and demonstrated that two AmnSINE1 loci play an important role in mammalian-specific brain development by functioning as an enhancer (Sasaki et al. Proc. Natl. Acad. Sci. USA 2008). To get more information about AmnSINE1s, we here performed a multi-species search for AmnSINE1, and revealed the distribution and evolutionary history of these SINEs in amniote genomes. The number of AmnSINE1 regions in amniotes ranged from 160 to 1200; the number in the eutherians was under 500 and the largest was that in chicken. Phylogenetic analysis established that each AmnSINE1 locus has evolved uniquely, primarily since the divergence of mammals from reptiles. These results support the notion that AmnSINE1s were amplified as an ancient retroposon in a common ancestor of Amniota and subsequently have survived for 300 Myr because of functions acquired by mutation-coupled exaptation prior to mammalian radiation. On the basis of sequence homology and conserved synteny, we detected the orthologs of AmnSINE1 for candidates of further enhancer analysis, which are more conserved than two loci that were shown to have been involved in mammalian brain development. The present work provides a comprehensive data set to test the role of AmnSINE1s, many of which were exapted and contributed to mammalian macroevolution.
Affiliation(s)
- Mika Hirakawa
- Bioinformatics Center, Institute for Chemical Research, Kyoto University, Gokasho, Uji, Kyoto 611-0011, Japan
5
Lee JY, Ji Z, Tian B. Phylogenetic analysis of mRNA polyadenylation sites reveals a role of transposable elements in evolution of the 3'-end of genes. Nucleic Acids Res 2008; 36:5581-90. [PMID: 18757892 PMCID: PMC2553571 DOI: 10.1093/nar/gkn540] [Citation(s) in RCA: 90] [Impact Index Per Article: 5.6] [Indexed: 12/25/2022]
Abstract
mRNA polyadenylation is an essential step for the maturation of almost all eukaryotic mRNAs, and is tightly coupled with termination of transcription in defining the 3′-end of genes. Large numbers of human and mouse genes harbor alternative polyadenylation sites [poly(A) sites] that lead to mRNA variants containing different 3′-untranslated regions (UTRs) and/or encoding distinct protein sequences. Here, we examined the conservation and divergence of different types of alternative poly(A) sites across human, mouse, rat and chicken. We found that the 3′-most poly(A) sites tend to be more conserved than upstream ones, whereas poly(A) sites located upstream of the 3′-most exon, also termed intronic poly(A) sites, tend to be much less conserved. Genes with longer evolutionary history are more likely to have alternative polyadenylation, suggesting gain of poly(A) sites through evolution. We also found that nonconserved poly(A) sites are associated with transposable elements (TEs) to a much greater extent than conserved ones, albeit less frequently utilized. Different classes of TEs have different characteristics in their association with poly(A) sites via exaptation of TE sequences into polyadenylation elements. Our results establish a conservation pattern for alternative poly(A) sites in several vertebrate species, and indicate that the 3′-end of genes can be dynamically modified by TEs through evolution.
Affiliation(s)
- Ju Youn Lee
- Graduate School of Biomedical Sciences and Department of Biochemistry and Molecular Biology, New Jersey Medical School, University of Medicine and Dentistry of New Jersey, Newark, NJ 07103, USA
6
Rayko E, Jabbari K, Bernardi G. The evolution of introns in human duplicated genes. Gene 2006; 365:41-7. [PMID: 16356663 DOI: 10.1016/j.gene.2005.09.038] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 04/22/2005] [Revised: 07/07/2005] [Accepted: 09/07/2005] [Indexed: 11/17/2022]
Abstract
In previous work [Jabbari, K., Rayko, E., Bernardi, G., 2003. The major shifts of human duplicated genes. Gene 317, 203-208], we investigated the fate of ancient duplicated genes after the compositional transitions that occurred between the genomes of cold- and warm-blooded vertebrates. We found that the majority of duplicated copies were transposed to the "ancestral genome core", the gene-dense genome compartment that underwent a GC enrichment at the compositional transitions. Here, we studied the consequences of the events just outlined on the introns of duplicated genes. We found that, while intron number was highly conserved, total intron size (the sum of intron sizes within any given gene) was smaller in the GC-rich copies compared to the GC-poor copies, especially in dispersed copies (i.e., copies located on different chromosomes or chromosome arms). GC-rich copies also showed higher densities of CpG islands and Alus, whereas GC-poor copies were characterized by higher densities of LINEs. The features of the copies that underwent the compositional transition and became GC-richer are suggestive of, or related to, functional changes.
Affiliation(s)
- Edda Rayko
- Laboratoire de Génétique Moléculaire, Institut Jacques Monod, 2 Place Jussieu, F-75005 Paris, France.
7
Pavlícek A, Jabbari K, Paces J, Paces V, Hejnar JV, Bernardi G. Similar integration but different stability of Alus and LINEs in the human genome. Gene 2001; 276:39-45. [PMID: 11591470 DOI: 10.1016/s0378-1119(01)00645-x] [Citation(s) in RCA: 89] [Impact Index Per Article: 3.9] [Indexed: 10/18/2022]
Abstract
Alus and LINEs (LINE1) are widespread classes of repeats that are very unevenly distributed in the human genome. The majority of GC-poor LINEs reside in the GC-poor isochores whereas GC-rich Alus are mostly present in GC-rich isochores. The discovery that LINEs and Alus share similar target site duplication and a common AT-rich insertion site specificity raised the question as to why these two families of repeats show such a different distribution in the genome. This problem was investigated here by studying the isochore distributions of subfamilies of LINEs and Alus characterized by different degrees of divergence from the consensus sequences, and of Alus, LINEs and pseudogenes located on chromosomes 21 and 22. Young Alus are more frequent in the GC-poor part of the genome than old Alus. This suggests that the gradual accumulation of Alus in GC-rich isochores has occurred because of their higher stability in compositionally matching chromosomal regions. Densities of Alus and LINEs increase and decrease, respectively, with increasing GC levels, except for the telomeric regions of the analyzed chromosomes. In addition to LINEs, processed pseudogenes are also more frequent in GC-poor isochores. Finally, the present results on Alu and LINE stability/exclusion predict significant losses of Alu DNA from the GC-poor isochores during evolution, a phenomenon apparently due to negative selection against sequences that differ from the isochore composition.
Affiliation(s)
- A Pavlícek
- Institute of Molecular Genetics, Academy of Sciences of the Czech Republic, Flemingovo 2, CZ-16637, Prague, Czech Republic
8
Abstract
We characterized short interspersed elements (SINEs) of the CORE-suprafamily in egg-laying (monotremes), pouched (marsupials) and placental mammals. Five families of these repeats distinguished by the presence of distinct LINE-related 3'-segments shared a tRNA-like promoter and the central core region. The putative active elements were reconstructed from the alignment of genomic repeats representing molecular fossils of sequences that amplified in the past and since then underwent multiple mutations. Their mode of proliferation by retroposition was indicated by the presence of: (1) internal RNA PolIII promoter; (2) simple sequence repeated tail; (3) direct repeats; and (4) subfamilies recording the evolution of elements. The copy number of CORE-SINEs in placental genomes was estimated at about 300,000; they were highly divergent and apparently ceased to amplify before radiation of these lineages. On the other hand, among almost half a million fossil elements present in marsupials and monotremes, the youngest subfamilies could still be retropositionally active. CORE-SINEs terminate in sequence repeats of a few nucleotides similar to their 3'-segment LINE-homologues, CR1, L2 and Bov-B. These three LINE elements fall into clades distinct from that of L1 elements which, similar to their co-amplifying SINEs, end in a poly(A) tail. We propose a model in which new CORE-families, with distinct 3'-segments, are created at the RNA level due to template switching between LINE and CORE-RNA during reverse transcription. The proposed mechanism suggests that such an adaptation to the changing amplification machinery facilitated the survival and prosperity of CORE-elements over long evolutionary periods in different lineages.
Affiliation(s)
- N Gilbert
- Centre de recherche de l'Hôpital Sainte-Justine, Centre de cancérologie Charles Bruneau, Montréal, H3T 1C5, Canada
9
Abstract
The bulk of the human genome is ultimately derived from transposable elements. Observations in the past year lead to some new and surprising ideas on functions and consequences of these elements and their remnants in our genome. The many new examples of human genes derived from single transposon insertions highlight the large contribution of selfish DNA to genomic evolution.
Affiliation(s)
- A F Smit
- Axys Pharmaceuticals, Inc., La Jolla, 92037-1029, USA.