1
|
Shin W, Mun S, Han K. Human Endogenous Retrovirus-K (HML-2)-Related Genetic Variation: Human Genome Diversity and Disease. Genes (Basel) 2023; 14:2150. [PMID: 38136972 PMCID: PMC10742618 DOI: 10.3390/genes14122150] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 11/23/2023] [Accepted: 11/26/2023] [Indexed: 12/24/2023] Open
Abstract
Human endogenous retroviruses (HERVs) comprise a significant portion of the human genome, making up roughly 8%, a notable comparison to the 2-3% represented by coding sequences. Numerous studies have underscored the critical role and importance of HERVs, highlighting their diverse and extensive influence on the evolution of the human genome and establishing their complex correlation with various diseases. Among HERVs, the HERV-K (HML-2) subfamily has recently attracted significant attention, integrating into the human genome after the divergence between humans and chimpanzees. Its insertion in the human genome has received considerable attention due to its structural and functional characteristics and the time of insertion. Originating from ancient exogenous retroviruses, these elements succeeded in infecting germ cells, enabling vertical transmission and existing as proviruses within the genome. Remarkably, these sequences have retained the capacity to form complete viral sequences, exhibiting activity in transcription and translation. The HERV-K (HML-2) subfamily is the subject of active debate about its potential positive or negative effects on human genome evolution and various pathologies. This review summarizes the variation, regulation, and diseases in human genome evolution arising from the influence of HERV-K (HML-2).
Affiliation(s)
- Wonseok Shin
- NGS Clinical Laboratory, Division of Cancer Research, Dankook University Hospital, Cheonan 31116, Republic of Korea
- Smart Animal Bio Institute, Dankook University, Cheonan 31116, Republic of Korea
| | - Seyoung Mun
- Smart Animal Bio Institute, Dankook University, Cheonan 31116, Republic of Korea
- College of Science & Technology, Dankook University, Cheonan 31116, Republic of Korea
- Center for Bio-Medical Engineering Core Facility, Dankook University, Cheonan 31116, Republic of Korea
| | - Kyudong Han
- Smart Animal Bio Institute, Dankook University, Cheonan 31116, Republic of Korea
- Center for Bio-Medical Engineering Core Facility, Dankook University, Cheonan 31116, Republic of Korea
- Department of Microbiology, College of Science & Technology, Dankook University, Cheonan 31116, Republic of Korea
- Department of Bioconvergence Engineering, Dankook University, Yongin 16890, Republic of Korea
- R&D Center, HuNBiome Co., Ltd., Seoul 08507, Republic of Korea
| |
|
2
|
Cheng KCL, Frost JM, Sánchez-Luque FJ, García-Cañadas M, Taylor D, Yang WR, Irayanar B, Sampath S, Patani H, Agger K, Helin K, Ficz G, Burns KH, Ewing A, García-Pérez JL, Branco MR. Vitamin C activates young LINE-1 elements in mouse embryonic stem cells via H3K9me3 demethylation. Epigenetics Chromatin 2023; 16:39. [PMID: 37845773 PMCID: PMC10578016 DOI: 10.1186/s13072-023-00514-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2023] [Accepted: 10/06/2023] [Indexed: 10/18/2023] Open
Abstract
BACKGROUND Vitamin C (vitC) enhances the activity of 2-oxoglutarate-dependent dioxygenases, including TET enzymes, which catalyse DNA demethylation, and Jumonji-domain histone demethylases. The epigenetic remodelling promoted by vitC improves the efficiency of induced pluripotent stem cell derivation, and is required to attain a ground-state of pluripotency in embryonic stem cells (ESCs) that closely mimics the inner cell mass of the early blastocyst. However, genome-wide DNA and histone demethylation can lead to upregulation of transposable elements (TEs), and it is not known how vitC addition in culture media affects TE expression in pluripotent stem cells. RESULTS Here we show that vitC increases the expression of several TE families, including evolutionarily young LINE-1 (L1) elements, in mouse ESCs. We find that TET activity is dispensable for L1 upregulation, and that instead it occurs largely as a result of H3K9me3 loss mediated by KDM4A/C histone demethylases. Despite increased L1 levels, we did not detect increased somatic insertion rates in vitC-treated cells. Notably, treatment of human ESCs with vitC also increases L1 protein levels, albeit through a distinct, post-transcriptional mechanism. CONCLUSION VitC directly modulates the expression of mouse L1s and other TEs through epigenetic mechanisms, with potential for downstream effects related to the multiple emerging roles of L1s in cellular function.
Affiliation(s)
- Kevin C L Cheng
- Blizard Institute, Faculty of Medicine and Dentistry, QMUL, London, E1 2AT, UK
- Computational Biology Program, Ontario Institute for Cancer Research, Toronto, ON, Canada
| | - Jennifer M Frost
- Blizard Institute, Faculty of Medicine and Dentistry, QMUL, London, E1 2AT, UK
| | - Francisco J Sánchez-Luque
- Institute of Parasitology and Biomedicine "Lopez-Neyra" (IPBLN), Spanish National Research Council (CSIC), PTS Granada, Granada, Spain
| | - Marta García-Cañadas
- Pfizer-University of Granada-Andalusian Government Centre for Genomics and Oncological Research (GENYO), PTS Granada, Granada, Spain
| | - Darren Taylor
- Blizard Institute, Faculty of Medicine and Dentistry, QMUL, London, E1 2AT, UK
- MRC London Institute of Medical Sciences, London, W12 0NN, UK
| | - Wan R Yang
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA
| | - Branavy Irayanar
- Blizard Institute, Faculty of Medicine and Dentistry, QMUL, London, E1 2AT, UK
| | - Swetha Sampath
- Blizard Institute, Faculty of Medicine and Dentistry, QMUL, London, E1 2AT, UK
| | - Hemalvi Patani
- Barts Cancer Institute, Faculty of Medicine and Dentistry, QMUL, London, EC1M 6BQ, UK
| | - Karl Agger
- The Novo Nordisk Foundation Center for Stem Cell Biology (DanStem), University of Copenhagen, Copenhagen, Denmark
| | - Kristian Helin
- The Novo Nordisk Foundation Center for Stem Cell Biology (DanStem), University of Copenhagen, Copenhagen, Denmark
- The Institute of Cancer Research, London, UK
| | - Gabriella Ficz
- Barts Cancer Institute, Faculty of Medicine and Dentistry, QMUL, London, EC1M 6BQ, UK
| | - Kathleen H Burns
- Department of Oncologic Pathology, Dana-Farber Cancer Institute, Boston, MA, 02215, USA
| | - Adam Ewing
- Mater Research Institute, University of Queensland, Woolloongabba, QLD, 4102, Australia
| | - José L García-Pérez
- Pfizer-University of Granada-Andalusian Government Centre for Genomics and Oncological Research (GENYO), PTS Granada, Granada, Spain
| | - Miguel R Branco
- Blizard Institute, Faculty of Medicine and Dentistry, QMUL, London, E1 2AT, UK.
| |
|
3
|
Devine SE. Emerging Opportunities to Study Mobile Element Insertions and Their Source Elements in an Expanding Universe of Sequenced Human Genomes. Genes (Basel) 2023; 14:1923. [PMID: 37895272 PMCID: PMC10606232 DOI: 10.3390/genes14101923] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 09/02/2023] [Revised: 09/29/2023] [Accepted: 09/30/2023] [Indexed: 10/29/2023] Open
Abstract
Three mobile element classes, namely Alu, LINE-1 (L1), and SVA elements, remain actively mobile in human genomes and continue to produce new mobile element insertions (MEIs). Historically, MEIs have been discovered and studied using several methods, including: (1) Southern blots, (2) PCR (including PCR display), and (3) the detection of MEI copies from young subfamilies. We are now entering a new phase of MEI discovery where these methods are being replaced by whole genome sequencing and bioinformatics analysis to discover novel MEIs. We expect that the universe of sequenced human genomes will continue to expand rapidly over the next several years, both with short-read and long-read technologies. These resources will provide unprecedented opportunities to discover MEIs and study their impact on human traits and diseases. They also will allow the MEI community to discover and study the source elements that produce these new MEIs, which will facilitate our ability to study source element regulation in various tissue contexts and disease states. This, in turn, will allow us to better understand MEI mutagenesis in humans and the impact of this mutagenesis on human biology.
Affiliation(s)
- Scott E Devine
- Institute for Genome Sciences, Department of Medicine, and Greenebaum Comprehensive Cancer Center, University of Maryland School of Medicine, Baltimore, MD 21201, USA
| |
|
4
|
A study of transposable element-associated structural variations (TASVs) using a de novo-assembled Korean genome. Exp Mol Med 2021; 53:615-630. [PMID: 33833373 PMCID: PMC8102501 DOI: 10.1038/s12276-021-00586-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/09/2019] [Revised: 01/26/2021] [Accepted: 01/27/2021] [Indexed: 12/13/2022] Open
Abstract
Advances in next-generation sequencing (NGS) technology have made personal genome sequencing possible, and indeed, many individual human genomes have now been sequenced. Comparisons of these individual genomes have revealed substantial genomic differences between human populations as well as between individuals from closely related ethnic groups. Transposable elements (TEs) are known to be one of the major sources of these variations and act through various mechanisms, including de novo insertion, insertion-mediated deletion, and TE–TE recombination-mediated deletion. In this study, we carried out de novo whole-genome sequencing of one Korean individual (KPGP9) via multiple insert-size libraries. The de novo whole-genome assembly resulted in 31,305 scaffolds with a scaffold N50 size of 13.23 Mb. Furthermore, through computational data analysis and experimental verification, we revealed that 182 TE-associated structural variation (TASV) insertions and 89 TASV deletions contributed 64,232 bp in sequence gain and 82,772 bp in sequence loss, respectively, in the KPGP9 genome relative to the hg19 reference genome. We also verified structural differences associated with TASVs by comparative analysis with TASVs in recent genomes (AK1 and TCGA genomes) and reported their details. Here, we constructed a new Korean de novo whole-genome assembly and provide the first study, to our knowledge, focused on the identification of TASVs in an individual Korean genome. Our findings again highlight the role of TEs as a major driver of structural variations in human individual genomes. A novel strategy for genome analysis offers insights into the distribution and impact on genome variation of transposable elements, DNA sequences that can replicate and relocate themselves at different chromosomal regions. These sequences, also known as ‘jumping genes’, comprise up to 50% of the genome, but it has proven challenging to map them with existing techniques. 
Seyoung Mun of Dankook University, Cheonan, South Korea, and coworkers have developed a sequencing and computational analysis strategy that allowed them to accurately map transposable elements across the genome of a Korean individual. These data revealed hundreds of insertion and deletion events relative to an existing reference map of the genome, showing significant alterations in the chromosomal structure. The authors speculate that such widespread transposition events could potentially contribute to individual differences in gene expression and risk of disease.
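The scaffold N50 quoted in this abstract is a standard assembly statistic: the scaffold length at which the longest scaffolds, added in descending order, first cover at least half of the total assembly. A minimal sketch of that calculation is given below; the function name `n50` and the toy lengths are illustrative assumptions and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches at least half
    of the summed length of all scaffolds.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50 requires at least one scaffold with non-zero length")


# Toy scaffold lengths (hypothetical, not the KPGP9 assembly).
toy_lengths = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_lengths)
```

A high N50 relative to the number of scaffolds indicates that most of the assembled sequence sits in a few long scaffolds rather than many short ones.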
|
5
|
Loh JW, Ha H, Lin T, Sun N, Burns KH, Xing J. Integrated Mobile Element Scanning (ME-Scan) method for identifying multiple types of polymorphic mobile element insertions. Mob DNA 2020; 11:12. [PMID: 32110248 PMCID: PMC7035633 DOI: 10.1186/s13100-020-00207-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/28/2019] [Accepted: 02/14/2020] [Indexed: 01/29/2023] Open
Abstract
Background Mobile elements are ubiquitous components of mammalian genomes and constitute more than half of the human genome. Polymorphic mobile element insertions (pMEIs) are a major source of human genomic variation and are gaining research interest because of their involvement in gene expression regulation, genome integrity, and disease. Results Building on our previous Mobile Element Scanning (ME-Scan) protocols, we developed an integrated ME-Scan protocol to identify three major active families of human mobile elements, AluYb, L1HS, and SVA. This approach selectively amplifies insertion sites of currently active retrotransposons for Illumina sequencing. By pooling the libraries together, we can identify pMEIs from all three mobile element families in one sequencing run. To demonstrate the utility of the new ME-Scan protocol, we sequenced 12 human parent-offspring trios. Our results showed high sensitivity (> 90%) and accuracy (> 95%) of the protocol for identifying pMEIs in the human genome. In addition, we also tested the feasibility of identifying somatic insertions using the protocol. Conclusions The integrated ME-Scan protocol is a cost-effective way to identify novel pMEIs in the human genome. In addition, by developing the protocol to detect three mobile element families, we demonstrate the flexibility of the ME-Scan protocol. We present instructions for the library design, a sequencing protocol, and a computational pipeline for downstream analyses as a complete framework that will allow researchers to easily adapt the ME-Scan protocol to their own projects in other genomes.
Affiliation(s)
- Jui Wan Loh
- Department of Genetics, Rutgers, the State University of New Jersey, Piscataway, NJ 08854 USA
| | - Hongseok Ha
- Department of Genetics, Rutgers, the State University of New Jersey, Piscataway, NJ 08854 USA; Human Genetic Institute of New Jersey, Rutgers, the State University of New Jersey, Piscataway, NJ 08854 USA
| | - Timothy Lin
- Department of Genetics, Rutgers, the State University of New Jersey, Piscataway, NJ 08854 USA
| | - Nawei Sun
- Department of Genetics, Rutgers, the State University of New Jersey, Piscataway, NJ 08854 USA; Human Genetic Institute of New Jersey, Rutgers, the State University of New Jersey, Piscataway, NJ 08854 USA
| | - Kathleen H Burns
- Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
| | - Jinchuan Xing
- Department of Genetics, Rutgers, the State University of New Jersey, Piscataway, NJ 08854 USA; Human Genetic Institute of New Jersey, Rutgers, the State University of New Jersey, Piscataway, NJ 08854 USA
| |
|
6
|
|
7
|
Sanchez-Luque FJ, Kempen MJHC, Gerdes P, Vargas-Landin DB, Richardson SR, Troskie RL, Jesuadian JS, Cheetham SW, Carreira PE, Salvador-Palomeque C, García-Cañadas M, Muñoz-Lopez M, Sanchez L, Lundberg M, Macia A, Heras SR, Brennan PM, Lister R, Garcia-Perez JL, Ewing AD, Faulkner GJ. LINE-1 Evasion of Epigenetic Repression in Humans. Mol Cell 2019; 75:590-604.e12. [PMID: 31230816 DOI: 10.1016/j.molcel.2019.05.024] [Citation(s) in RCA: 92] [Impact Index Per Article: 18.4] [Received: 10/11/2018] [Revised: 04/08/2019] [Accepted: 05/15/2019] [Indexed: 02/07/2023]
Abstract
Epigenetic silencing defends against LINE-1 (L1) retrotransposition in mammalian cells. However, the mechanisms that repress young L1 families and how L1 escapes to cause somatic genome mosaicism in the brain remain unclear. Here we report that a conserved Yin Yang 1 (YY1) transcription factor binding site mediates L1 promoter DNA methylation in pluripotent and differentiated cells. By analyzing 24 hippocampal neurons with three distinct single-cell genomic approaches, we characterized and validated a somatic L1 insertion bearing a 3' transduction. The source (donor) L1 for this insertion was slightly 5' truncated, lacked the YY1 binding site, and was highly mobile when tested in vitro. Locus-specific bisulfite sequencing revealed that the donor L1 and other young L1s with mutated YY1 binding sites were hypomethylated in embryonic stem cells, during neurodifferentiation, and in liver and brain tissue. These results explain how L1 can evade repression and retrotranspose in the human body.
Affiliation(s)
- Francisco J Sanchez-Luque
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia; GENYO Centre for Genomics and Oncological Research, Pfizer University of Granada, Andalusian Regional Government, Avda Ilustración, 114, PTS Granada 18016, Spain.
| | - Marie-Jeanne H C Kempen
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia; MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine (IGMM), University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
| | - Patricia Gerdes
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - Dulce B Vargas-Landin
- Australian Research Council Centre of Excellence in Plant Energy Biology, School of Molecular Sciences, the University of Western Australia, Perth, WA 6009, Australia; Harry Perkins Institute of Medical Research, Perth, WA 6009, Australia
| | - Sandra R Richardson
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - Robin-Lee Troskie
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - J Samuel Jesuadian
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - Seth W Cheetham
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - Patricia E Carreira
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - Carmen Salvador-Palomeque
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - Marta García-Cañadas
- GENYO Centre for Genomics and Oncological Research, Pfizer University of Granada, Andalusian Regional Government, Avda Ilustración, 114, PTS Granada 18016, Spain
| | - Martin Muñoz-Lopez
- GENYO Centre for Genomics and Oncological Research, Pfizer University of Granada, Andalusian Regional Government, Avda Ilustración, 114, PTS Granada 18016, Spain
| | - Laura Sanchez
- GENYO Centre for Genomics and Oncological Research, Pfizer University of Granada, Andalusian Regional Government, Avda Ilustración, 114, PTS Granada 18016, Spain
| | - Mischa Lundberg
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - Angela Macia
- Department of Pediatrics/Rady Children's Hospital San Diego, School of Medicine, University of California, San Diego, La Jolla, CA, USA
| | - Sara R Heras
- GENYO Centre for Genomics and Oncological Research, Pfizer University of Granada, Andalusian Regional Government, Avda Ilustración, 114, PTS Granada 18016, Spain; Department of Biochemistry and Molecular Biology II, Faculty of Pharmacy, University of Granada, Campus Universitario de Cartuja, 18071 Granada, Spain
| | - Paul M Brennan
- Edinburgh Cancer Research Centre, Western General Hospital, Edinburgh, EH4 2XR, UK
| | - Ryan Lister
- Australian Research Council Centre of Excellence in Plant Energy Biology, School of Molecular Sciences, the University of Western Australia, Perth, WA 6009, Australia; Harry Perkins Institute of Medical Research, Perth, WA 6009, Australia
| | - Jose L Garcia-Perez
- GENYO Centre for Genomics and Oncological Research, Pfizer University of Granada, Andalusian Regional Government, Avda Ilustración, 114, PTS Granada 18016, Spain; MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine (IGMM), University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
| | - Adam D Ewing
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia
| | - Geoffrey J Faulkner
- Mater Research Institute, University of Queensland, TRI Building, Woolloongabba, QLD 4102, Australia; Queensland Brain Institute, University of Queensland, Brisbane, QLD 4072, Australia.
| |
|
8
|
Hron T, Fabryova H, Elleder D. Insight into the epigenetic landscape of a currently endogenizing gammaretrovirus in mule deer (Odocoileus hemionus). Genomics 2019; 112:886-896. [PMID: 31175981 DOI: 10.1016/j.ygeno.2019.06.003] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 02/12/2019] [Revised: 04/26/2019] [Accepted: 06/03/2019] [Indexed: 01/22/2023]
Abstract
Endogenous retroviruses (ERVs) constitute a significant part of vertebrate genomes. They originated from past retroviral infections and some of them retain transcriptional activity. The key mechanism avoiding uncontrolled ERV transcription is DNA methylation-mediated epigenetic silencing. Despite numerous studies describing the involvement of ERV activity in cellular processes, epigenetic regulation of ERVs is still poorly understood. We previously described a cervid endogenous retrovirus (CrERV) in the mule deer genome. This virus exhibits massive insertional polymorphism, suggesting recent activity. Here we employed NGS-based strategy to determine the methylation pattern of CrERV integrations in four mule deer. Besides the vast majority of methylated integrations, we identified a tiny fraction of demethylated proviral copies. These copies represent evolutionary older integrations located near gene promoters. In general, our work is a first attempt to characterize the epigenetic landscape of insertionally polymorphic ERV on a whole-genome scale and offers insight into its interactions with a host.
Affiliation(s)
- Tomas Hron
- Institute of Molecular Genetics, The Czech Academy of Sciences, Videnska 1083, Prague, 14220, Czech Republic; Faculty of Science, Charles University, Albertov 6, 128 43 Praha 2, Czech Republic.
| | - Helena Fabryova
- Institute of Molecular Genetics, The Czech Academy of Sciences, Videnska 1083, Prague, 14220, Czech Republic; Faculty of Science, Charles University, Albertov 6, 128 43 Praha 2, Czech Republic
| | - Daniel Elleder
- Institute of Molecular Genetics, The Czech Academy of Sciences, Videnska 1083, Prague, 14220, Czech Republic.
| |
|
9
|
Li W, Lin L, Malhotra R, Yang L, Acharya R, Poss M. A computational framework to assess genome-wide distribution of polymorphic human endogenous retrovirus-K in human populations. PLoS Comput Biol 2019; 15:e1006564. [PMID: 30921327 PMCID: PMC6456218 DOI: 10.1371/journal.pcbi.1006564] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 10/10/2018] [Revised: 04/09/2019] [Accepted: 03/05/2019] [Indexed: 12/11/2022] Open
Abstract
Human Endogenous Retrovirus type K (HERV-K) is the only HERV known to be insertionally polymorphic; not all individuals have a retrovirus at a specific genomic location. It is possible that HERV-Ks contribute to human disease because people differ in both number and genomic location of these retroviruses. Indeed viral transcripts, proteins, and antibody against HERV-K are detected in cancers, auto-immune, and neurodegenerative diseases. However, attempts to link a polymorphic HERV-K with any disease have been frustrated in part because population prevalence of HERV-K provirus at each polymorphic site is lacking and it is challenging to identify closely related elements such as HERV-K from short read sequence data. We present an integrated and computationally robust approach that uses whole genome short read data to determine the occupation status at all sites reported to contain a HERV-K provirus. Our method estimates the proportion of fixed length genomic sequence (k-mers) from whole genome sequence data matching a reference set of k-mers unique to each HERV-K locus and applies mixture model-based clustering of these values to account for low depth sequence data. Our analysis of 1000 Genomes Project Data (KGP) reveals numerous differences among the five KGP super-populations in the prevalence of individual and co-occurring HERV-K proviruses; we provide a visualization tool to easily depict the proportion of the KGP populations with any combination of polymorphic HERV-K provirus. Further, because HERV-K is insertionally polymorphic, the genome burden of known polymorphic HERV-K is variable in humans; this burden is lowest in East Asian (EAS) individuals. Our study identifies population-specific sequence variation for HERV-K proviruses at several loci. We expect these resources will advance research on HERV-K contributions to human diseases. 
Human Endogenous Retrovirus type K (HERV-K) is the youngest of retrovirus families in the human genome and is the only group of endogenous retroviruses that has polymorphic members; a locus containing a HERV-K can be occupied in one individual but empty in others. HERV-Ks could contribute to disease risk or pathogenesis but linking one of the known polymorphic HERV-K to a specific disease has been difficult. We develop an easy to use method that reveals the considerable variation existing among global populations in the prevalence of individual and co-occurring polymorphic HERV-K, and in the number of HERV-K that any individual has in their genome. Our study provides a reference of diversity for the currently known polymorphic HERV-K in global populations and tools needed to determine the profile of all known polymorphic HERV-K in the genome of any patient population.
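The per-locus statistic described in this abstract, the proportion of a locus's unique reference k-mers that are found in a sample's reads, can be sketched as follows. This is only an illustration under stated assumptions: the function names, the toy sequences, and the choice of k are hypothetical, reads are matched exactly on one strand only, and the mixture model-based clustering step that the authors apply to these proportions is not reproduced here.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def locus_kmer_proportion(reads, locus_kmers, k):
    """Fraction of a locus's reference k-mers seen in at least one read.

    reads       -- iterable of read sequences (strings)
    locus_kmers -- set of k-mers assumed to be unique to one locus
    k           -- k-mer length used to build locus_kmers
    """
    if not locus_kmers:
        return 0.0
    seen = set()
    for read in reads:
        # Keep only the read k-mers that belong to this locus.
        seen |= kmers(read, k) & locus_kmers
    return len(seen) / len(locus_kmers)


# Toy data (hypothetical): one short "locus" sequence and two reads.
K = 4
ref_kmers = kmers("ACGTACGGTC", K)          # 7 distinct 4-mers
reads = ["ACGTACG", "TTTTTTT"]              # first read covers 4 of them
proportion = locus_kmer_proportion(reads, ref_kmers, K)
```

In this toy case the proportion is 4/7. A value near 1 would suggest that the locus is occupied in the sample and a value near 0 that it is empty; with low-depth data the values are noisier, which is the situation the abstract's clustering step is meant to handle.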
Affiliation(s)
- Weiling Li
- The School of Electrical Engineering and Computer Science, The Pennsylvania State University, University Park, PA, United States of America
| | - Lin Lin
- Department of Statistics, The Pennsylvania State University, University Park, PA, United States of America
| | - Raunaq Malhotra
- The School of Electrical Engineering and Computer Science, The Pennsylvania State University, University Park, PA, United States of America
| | - Lei Yang
- Department of Biology, The Pennsylvania State University, University Park, PA, United States of America
| | - Raj Acharya
- The School of Electrical Engineering and Computer Science, The Pennsylvania State University, University Park, PA, United States of America
- School of Informatics, Computing and Engineering, Indiana University, Bloomington, IN, United States of America
| | - Mary Poss
- Department of Biology, The Pennsylvania State University, University Park, PA, United States of America
- Department of Veterinary and Biomedical Sciences, The Pennsylvania State University, University Park, PA, United States of America
| |
|
10
|
Dynamic Methylation of an L1 Transduction Family during Reprogramming and Neurodifferentiation. Mol Cell Biol 2019; 39:MCB.00499-18. [PMID: 30692270 PMCID: PMC6425141 DOI: 10.1128/mcb.00499-18] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Received: 10/24/2018] [Accepted: 01/11/2019] [Indexed: 01/28/2023] Open
Abstract
The retrotransposon LINE-1 (L1) is a significant source of endogenous mutagenesis in humans. In each individual genome, a few retrotransposition-competent L1s (RC-L1s) can generate new heritable L1 insertions in the early embryo, primordial germ line, and germ cells. L1 retrotransposition can also occur in the neuronal lineage and cause somatic mosaicism. Although DNA methylation mediates L1 promoter repression, the temporal pattern of methylation applied to individual RC-L1s during neurogenesis is unclear. Here, we identified a de novo L1 insertion in a human induced pluripotent stem cell (hiPSC) line via retrotransposon capture sequencing (RC-seq). The L1 insertion was full-length and carried 5' and 3' transductions. The corresponding donor RC-L1 was part of a large and recently active L1 transduction family and was highly mobile in a cultured-cell L1 retrotransposition reporter assay. Notably, we observed distinct and dynamic DNA methylation profiles for the de novo L1 and members of its extended transduction family during neuronal differentiation. These experiments reveal how a de novo L1 insertion in a pluripotent stem cell is rapidly recognized and repressed, albeit incompletely, by the host genome during neurodifferentiation, while retaining potential for further retrotransposition.
|
11
|
Steranka JP, Tang Z, Grivainis M, Huang CRL, Payer LM, Rego FOR, Miller TLA, Galante PAF, Ramaswami S, Heguy A, Fenyö D, Boeke JD, Burns KH. Transposon insertion profiling by sequencing (TIPseq) for mapping LINE-1 insertions in the human genome. Mob DNA 2019; 10:8. [PMID: 30899333 PMCID: PMC6407172 DOI: 10.1186/s13100-019-0148-5] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 10/23/2018] [Accepted: 01/14/2019] [Indexed: 12/14/2022] Open
Abstract
Background Transposable elements make up a significant portion of the human genome. Accurately locating these mobile DNAs is vital to understand their role as a source of structural variation and somatic mutation. To this end, laboratories have developed strategies to selectively amplify or otherwise enrich transposable element insertion sites in genomic DNA. Results Here we describe a technique, Transposon Insertion Profiling by sequencing (TIPseq), to map Long INterspersed Element 1 (LINE-1, L1) retrotransposon insertions in the human genome. This method uses vectorette PCR to amplify species-specific L1 (L1PA1) insertion sites followed by paired-end Illumina sequencing. In addition to providing a step-by-step molecular biology protocol, we offer users a guide to our pipeline for data analysis, TIPseqHunter. Our recent studies in pancreatic and ovarian cancer demonstrate the ability of TIPseq to identify invariant (fixed), polymorphic (inherited variants), as well as somatically-acquired L1 insertions that distinguish cancer genomes from a patient’s constitutional make-up. Conclusions TIPseq provides an approach for amplifying evolutionarily young, active transposable element insertion sites from genomic DNA. Our rationale and variations on this protocol may be useful to those mapping L1 and other mobile elements in complex genomes. Electronic supplementary material The online version of this article (10.1186/s13100-019-0148-5) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Jared P Steranka
- 1Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA.,2McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
| | - Zuojian Tang
- 3Department for Biochemistry and Molecular Pharmacology, NYU Langone Health, New York, NY 10016 USA.,4Institute for Systems Genetics, NYU Langone Health, New York, NY 10016 USA
| | - Mark Grivainis
- 3Department for Biochemistry and Molecular Pharmacology, NYU Langone Health, New York, NY 10016 USA.,4Institute for Systems Genetics, NYU Langone Health, New York, NY 10016 USA
| | - Cheng Ran Lisa Huang
- 2McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
| | - Lindsay M Payer
- 1Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
| | - Fernanda O R Rego
- 5Centro de Oncologia Molecular, Hospital Sírio-Libanês, São Paulo, Brazil
| | - Thiago Luiz Araujo Miller
- 5Centro de Oncologia Molecular, Hospital Sírio-Libanês, São Paulo, Brazil.,Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, São Paulo, Brazil
| | - Pedro A F Galante
- 5Centro de Oncologia Molecular, Hospital Sírio-Libanês, São Paulo, Brazil
| | - Sitharam Ramaswami
- 7Genome Technology Center, Division of Advanced Research Technologies, NYU Langone Health, New York, NY USA
| | - Adriana Heguy
- 7Genome Technology Center, Division of Advanced Research Technologies, NYU Langone Health, New York, NY USA
| | - David Fenyö
- 3Department for Biochemistry and Molecular Pharmacology, NYU Langone Health, New York, NY 10016 USA.,4Institute for Systems Genetics, NYU Langone Health, New York, NY 10016 USA
| | - Jef D Boeke
- 4Institute for Systems Genetics, NYU Langone Health, New York, NY 10016 USA
| | - Kathleen H Burns
- 1Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA.,2McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD 21205 USA
| |
|
12
|
Shin W, Mun S, Kim J, Lee W, Park DG, Choi S, Lee TY, Cha S, Han K. Novel Discovery of LINE-1 in a Korean Individual by a Target Enrichment Method. Mol Cells 2019; 42:87-95. [PMID: 30699287 PMCID: PMC6354063 DOI: 10.14348/molcells.2018.0351] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 08/17/2018] [Revised: 10/10/2018] [Accepted: 10/26/2018] [Indexed: 11/27/2022] Open
Abstract
Long interspersed element-1 (LINE-1 or L1) is an autonomous retrotransposon, which is capable of inserting into a new region of genome. Previous studies have reported that these elements lead to genomic variations and altered functions by affecting gene expression and genetic networks. Mounting evidence strongly indicates that genetic diseases or various cancers can occur as a result of retrotransposition events that involve L1s. Therefore, the development of methodologies to study the structural variations and interpersonal insertion polymorphisms by L1 element-associated changes in an individual genome is invaluable. In this study, we applied a systematic approach to identify human-specific L1s (i.e., L1Hs) through the bioinformatics analysis of high-throughput next-generation sequencing data. We identified 525 candidates that could be inferred to carry non-reference L1Hs in a Korean individual genome (KPGP9). Among them, we randomly selected 40 candidates and validated that approximately 92.5% of non-reference L1Hs were inserted into a KPGP9 genome. In addition, unlike conventional methods, our relatively simple and expedited approach was highly reproducible in confirming the L1 insertions. Taken together, our findings strongly support that the identification of non-reference L1Hs by our novel target enrichment method demonstrates its future application to genomic variation studies on the risk of cancer and genetic disorders.
Affiliation(s)
- Wonseok Shin
- Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 31116, Korea
| | - Seyoung Mun
- Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 31116, Korea
| | - Junse Kim
- Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 31116, Korea
| | - Wooseok Lee
- Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 31116, Korea
| | - Dong-Guk Park
- Department of Surgery, Dankook University College of Medicine, Cheonan 31116, Korea
| | - Seungkyu Choi
- Department of Pathology, Dankook University College of Medicine, Cheonan 31116, Korea
| | - Tae Yoon Lee
- Department of Technology Education and Department of Biomedical Engineering, Chungnam National University, Daejeon 34134, Korea
| | - Seunghee Cha
- Department of Oral and Maxillofacial Diagnostic Sciences, University of Florida College of Dentistry, Gainesville, FL 32610, USA
| | - Kyudong Han
- Department of Nanobiomedical Science & BK21 PLUS NBM Global Research Center for Regenerative Medicine, Dankook University, Cheonan 31116, Korea
| |
|
13
|
|
14
|
Wallace AD, Wendt GA, Barcellos LF, de Smith AJ, Walsh KM, Metayer C, Costello JF, Wiemels JL, Francis SS. To ERV Is Human: A Phenotype-Wide Scan Linking Polymorphic Human Endogenous Retrovirus-K Insertions to Complex Phenotypes. Front Genet 2018; 9:298. [PMID: 30154825 PMCID: PMC6102640 DOI: 10.3389/fgene.2018.00298] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Received: 04/27/2018] [Accepted: 07/16/2018] [Indexed: 12/13/2022] Open
Abstract
Approximately 8% of the human genome is comprised of endogenous retroviral insertions (ERVs) originating from historic retroviral integration into germ cells. The function of ERVs as regulators of gene expression is well established. Less well studied are insertional polymorphisms of ERVs and their contribution to the heritability of complex phenotypes. The most recent integration of ERV, HERV-K, is expressed in a range of complex human conditions from cancer to neurologic diseases. Using an in-house computational pipeline and whole-genome sequencing data from the diverse 1,000 Genomes Phase 3 population (n = 2,504), we identified 46 polymorphic HERV-K insertions that are tagged by adjacent single nucleotide polymorphisms (SNPs). To test the potential role of polymorphic HERV-K in the heritability of complex diseases, existing databases were queried for enrichment of established relationships between the HERV-K insertion-associated SNPs (hiSNPs), and tissue specific gene expression and disease phenotypes. Overall, hiSNPs for the 46 polymorphic HERV-K sites were statistically enriched (p < 1.0E-16) for eQTLs across 44 human tissues. Fifteen of the 46 HERV-K insertions had hiSNPs annotated in the EMBL-EBI GWAS Catalog and cumulatively associated with >100 phenotypes. Experimental factor ontology enrichment analysis suggests that polymorphic HERV-K specifically contribute to neurologic and immunologic disease phenotypes, including traits related to intracranial volume (FDR 2.00E-09), Parkinson's disease (FDR 1.80E-09), and autoimmune diseases (FDR 1.80E-09). These results provide strong candidates for context-specific study of polymorphic HERV-K insertions in disease-related traits, serving as a roadmap for future studies of the heritability of complex disease.
Affiliation(s)
- Amelia D Wallace
- Division of Epidemiology, School of Public Health, University of California, Berkeley, Berkeley, CA, United States
| | - George A Wendt
- Division of Epidemiology, School of Community Health Sciences, University of Nevada, Reno, NV, United States
| | - Lisa F Barcellos
- Division of Epidemiology, School of Public Health, University of California, Berkeley, Berkeley, CA, United States
| | - Adam J de Smith
- Department of Epidemiology and Biostatistics, Helen Diller Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, United States
| | - Kyle M Walsh
- Department of Neurosurgery, Duke University, Durham, NC, United States
| | - Catherine Metayer
- Division of Epidemiology, School of Public Health, University of California, Berkeley, Berkeley, CA, United States
| | - Joseph F Costello
- Department of Neurosurgery, Helen Diller Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, United States
| | - Joseph L Wiemels
- Department of Epidemiology and Biostatistics, Helen Diller Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, United States.,Department of Neurosurgery, Helen Diller Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, United States
| | - Stephen S Francis
- Division of Epidemiology, School of Community Health Sciences, University of Nevada, Reno, NV, United States.,Department of Epidemiology and Biostatistics, Helen Diller Comprehensive Cancer Center, University of California, San Francisco, San Francisco, CA, United States
| |
|
15
|
Wang L, Jordan IK. Transposable element activity, genome regulation and human health. Curr Opin Genet Dev 2018; 49:25-33. [PMID: 29505964 DOI: 10.1016/j.gde.2018.02.006] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 10/11/2017] [Revised: 01/30/2018] [Accepted: 02/13/2018] [Indexed: 12/21/2022]
Abstract
A convergence of novel genome analysis technologies is enabling population genomic studies of human transposable elements (TEs). Population surveys of human genome sequences have uncovered thousands of individual TE insertions that segregate as common genetic variants, i.e. TE polymorphisms. These recent TE insertions provide an important source of naturally occurring human genetic variation. Investigators are beginning to leverage population genomic data sets to execute genome-scale association studies for assessing the phenotypic impact of human TE polymorphisms. For example, the expression quantitative trait loci (eQTL) analytical paradigm has recently been used to uncover hundreds of associations between human TE insertion variants and gene expression levels. These include population-specific gene regulatory effects as well as coordinated changes to gene regulatory networks. In addition, analyses of linkage disequilibrium patterns with previously characterized genome-wide association study (GWAS) trait variants have uncovered TE insertion polymorphisms that are likely causal variants for a variety of common complex diseases. Gene regulatory mechanisms that underlie specific disease phenotypes have been proposed for a number of these trait associated TE polymorphisms. These new population genomic approaches hold great promise for understanding how ongoing TE activity contributes to functionally relevant genetic variation within and between human populations.
Affiliation(s)
- Lu Wang
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA; PanAmerican Bioinformatics Institute, Cali, Colombia
| | - I King Jordan
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA; PanAmerican Bioinformatics Institute, Cali, Colombia.
| |
|
16
|
Kvikstad EM, Piazza P, Taylor JC, Lunter G. A high throughput screen for active human transposable elements. BMC Genomics 2018; 19:115. [PMID: 29390960 PMCID: PMC5796560 DOI: 10.1186/s12864-018-4485-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 09/25/2017] [Accepted: 01/16/2018] [Indexed: 11/30/2022] Open
Abstract
Background: Transposable elements (TEs) are mobile genetic sequences that randomly propagate within their host’s genome. This mobility has the potential to affect gene transcription and cause disease. However, TEs are technically challenging to identify, which complicates efforts to assess the impact of TE insertions on disease. Here we present a targeted sequencing protocol and computational pipeline to identify polymorphic and novel TE insertions using next-generation sequencing: TE-NGS. The method simultaneously targets the three subfamilies that are responsible for the majority of recent TE activity (L1HS, AluYa5/8, and AluYb8/9) thereby obviating the need for multiple experiments and reducing the amount of input material required. Results: Here we describe the laboratory protocol and detection algorithm, and a benchmark experiment for the reference genome NA12878. We demonstrate a substantial enrichment for on-target fragments, and high sensitivity and precision to both reference and NA12878-specific insertions. We report 17 previously unreported loci for this individual which are supported by orthogonal long-read evidence, and we identify 1470 polymorphic and novel TEs in 12 additional samples that were previously undocumented in databases of insertion polymorphisms. Conclusions: We anticipate that future applications of TE-NGS alongside exome sequencing of patients with sporadic disease will reduce the number of unresolved cases, and improve estimates of the contribution of TEs to human genetic disease.
Affiliation(s)
- Erika M Kvikstad
- Wellcome Trust Centre for Human Genetics, Oxford, UK.,National Institute for Health Research Comprehensive Biomedical Research Centre, Oxford, UK
| | - Paolo Piazza
- Wellcome Trust Centre for Human Genetics, Oxford, UK.,Department of Medicine, Imperial College London, London, UK
| | - Jenny C Taylor
- Wellcome Trust Centre for Human Genetics, Oxford, UK.,National Institute for Health Research Comprehensive Biomedical Research Centre, Oxford, UK
| | - Gerton Lunter
- Wellcome Trust Centre for Human Genetics, Oxford, UK
| |
|
17
|
Zhang S, Kelleher ES. Targeted identification of TE insertions in a Drosophila genome through hemi-specific PCR. Mob DNA 2017; 8:10. [PMID: 28775768 PMCID: PMC5534036 DOI: 10.1186/s13100-017-0092-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 04/14/2017] [Accepted: 07/10/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Transposable elements (TEs) are major components of eukaryotic genomes and drivers of genome evolution, producing intraspecific polymorphism and interspecific differences through mobilization and non-homologous recombination. TE insertion sites are often highly variable within species, creating a need for targeted genome re-sequencing (TGS) methods to identify TE insertion sites. METHODS We present a hemi-specific PCR approach for TGS of P-elements in Drosophila genomes on the Illumina platform. We also present a computational framework for identifying new insertions from TGS reads. Finally, we describe a new method for estimating the frequency of TE insertions from WGS data, which is based on precise insertion sites provided by TGS annotations. RESULTS By comparing our results to TE annotations based on whole genome re-sequencing (WGS) data for the same Drosophila melanogaster strain, we demonstrate that TGS is powerful for identifying true insertions, even in repeat-rich heterochromatic regions. We also demonstrate that TGS offers enhanced annotation of precise insertion sites, which facilitates estimation of TE insertion frequency. CONCLUSIONS TGS by hemi-specific PCR is a powerful approach for identifying TE insertions of particular TE families in species with a high-quality reference genome, at greatly reduced cost as compared to WGS. It may therefore be ideal for population genomic studies of particular TE families. Additionally, TGS and WGS can be used as complementary approaches, with TGS annotations identifying more annotated insertions with greater precision for a target TE family, and WGS data allowing for estimates of TE insertion frequencies, and a broader picture of the location of non-target TEs across the genome.
Affiliation(s)
- Shuo Zhang
- Department of Biology and Biochemistry, University of Houston, 3455 Cullen Blvd. Suite 342, Houston, TX 77204 USA
| | - Erin S. Kelleher
- Department of Biology and Biochemistry, University of Houston, 3455 Cullen Blvd. Suite 342, Houston, TX 77204 USA
| |
|
18
|
Feusier J, Witherspoon DJ, Scott Watkins W, Goubert C, Sasani TA, Jorde LB. Discovery of rare, diagnostic AluYb8/9 elements in diverse human populations. Mob DNA 2017; 8:9. [PMID: 28770012 PMCID: PMC5531096 DOI: 10.1186/s13100-017-0093-0] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 03/10/2017] [Accepted: 07/17/2017] [Indexed: 01/22/2023] Open
Abstract
BACKGROUND Polymorphic human Alu elements are excellent tools for assessing population structure, and new retrotransposition events can contribute to disease. Next-generation sequencing has greatly increased the potential to discover Alu elements in human populations, and various sequencing and bioinformatics methods have been designed to tackle the problem of detecting these highly repetitive elements. However, current techniques for Alu discovery may miss rare, polymorphic Alu elements. Combining multiple discovery approaches may provide a better profile of the polymorphic Alu mobilome. AluYb8/9 elements have been a focus of our recent studies as they are young subfamilies (~2.3 million years old) that contribute ~30% of recent polymorphic Alu retrotransposition events. Here, we update our ME-Scan methods for detecting Alu elements and apply these methods to discover new insertions in a large set of individuals with diverse ancestral backgrounds. RESULTS We identified 5,288 putative Alu insertion events, including several hundred novel AluYb8/9 elements from 213 individuals from 18 diverse human populations. Hundreds of these loci were specific to continental populations, and 23 non-reference population-specific loci were validated by PCR. We provide high-quality sequence information for 68 rare AluYb8/9 elements, of which 11 have hallmarks of an active source element. Our subfamily distribution of rare AluYb8/9 elements is consistent with previous datasets, and may be representative of rare loci. We also find that while ME-Scan and low-coverage, whole-genome sequencing (WGS) detect different Alu elements in 41 1000 Genomes individuals, the two methods yield similar population structure results. CONCLUSION Current in-silico methods for Alu discovery may miss rare, polymorphic Alu elements. Therefore, using multiple techniques can provide a more accurate profile of Alu elements in individuals and populations. 
We improved our false-negative rate as an indicator of sample quality for future ME-Scan experiments. In conclusion, we demonstrate that ME-Scan is a good supplement for next-generation sequencing methods and is well-suited for population-level analyses.
Affiliation(s)
- Julie Feusier
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT USA
| | - David J. Witherspoon
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT USA
| | - W. Scott Watkins
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT USA
| | - Clément Goubert
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT USA
| | - Thomas A. Sasani
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT USA
| | - Lynn B. Jorde
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT USA
| |
|
19
|
Affiliation(s)
- Kathleen H Burns
- Johns Hopkins University School of Medicine, Baltimore, Maryland 21205, USA
| |
|
20
|
Goubert C, Henri H, Minard G, Valiente Moro C, Mavingui P, Vieira C, Boulesteix M. High-throughput sequencing of transposable element insertions suggests adaptive evolution of the invasive Asian tiger mosquito towards temperate environments. Mol Ecol 2017; 26:3968-3981. [DOI: 10.1111/mec.14184] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.9] [Received: 05/26/2016] [Revised: 05/06/2017] [Accepted: 05/08/2017] [Indexed: 12/21/2022]
Affiliation(s)
- Clement Goubert
- Université de Lyon; Lyon France
- Université Lyon 1; Villeurbanne France
- Laboratoire de Biometrie et Biologie Evolutive; UMR CNRS 5558; Villeurbanne France
- Department of Human Genetics; University of Utah; Salt Lake City UT USA
| | - Helene Henri
- Université de Lyon; Lyon France
- Université Lyon 1; Villeurbanne France
- Laboratoire de Biometrie et Biologie Evolutive; UMR CNRS 5558; Villeurbanne France
| | - Guillaume Minard
- Université de Lyon; Lyon France
- Université Lyon 1; Villeurbanne France
- Ecologie Microbienne; UMR CNRS 5557; UMR INRA 1418; Villeurbanne France
- Department of Biosciences; Metapopulation Research Center; University of Helsinki; Helsinki Finland
| | - Claire Valiente Moro
- Université de Lyon; Lyon France
- Université Lyon 1; Villeurbanne France
- Ecologie Microbienne; UMR CNRS 5557; UMR INRA 1418; Villeurbanne France
| | - Patrick Mavingui
- Université de Lyon; Lyon France
- Université Lyon 1; Villeurbanne France
- Ecologie Microbienne; UMR CNRS 5557; UMR INRA 1418; Villeurbanne France
- UMR PIMIT; INSERM 1187, CNRS 9192, IRD 249, Plateforme Technologique CYROI; Universite de La Reunion; Sainte-Clotilde Reunion
| | - Cristina Vieira
- Université de Lyon; Lyon France
- Université Lyon 1; Villeurbanne France
- Laboratoire de Biometrie et Biologie Evolutive; UMR CNRS 5558; Villeurbanne France
| | - Matthieu Boulesteix
- Université de Lyon; Lyon France
- Université Lyon 1; Villeurbanne France
- Laboratoire de Biometrie et Biologie Evolutive; UMR CNRS 5558; Villeurbanne France
| |
|
21
|
Transposable elements in cancer. Nat Rev Cancer 2017. [PMID: 28642606 DOI: 10.1038/nrc.2017.35] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2022]
Abstract
Transposable elements give rise to interspersed repeats, sequences that comprise most of our genomes. These mobile DNAs have been historically underappreciated - both because they have been presumed to be unimportant, and because their high copy number and variability pose unique technical challenges. Neither impediment now seems steadfast. Interest in the human mobilome has never been greater, and methods enabling its study are maturing at a fast pace. This Review describes the activity of transposable elements in human cancers, particularly long interspersed element-1 (LINE-1). LINE-1 sequences are self-propagating, protein-coding retrotransposons, and their activity results in somatically acquired insertions in cancer genomes. Altered expression of transposable elements and animation of genomic LINE-1 sequences appear to be hallmarks of cancer, and can be responsible for driving mutations in tumorigenesis.
|
22
|
Structural variants caused by Alu insertions are associated with risks for many human diseases. Proc Natl Acad Sci U S A 2017; 114:E3984-E3992. [PMID: 28465436 DOI: 10.1073/pnas.1704117114] [Citation(s) in RCA: 80] [Impact Index Per Article: 11.4] [Indexed: 01/19/2023] Open
Abstract
Interspersed repeat sequences comprise much of our DNA, although their functional effects are poorly understood. The most commonly occurring repeat is the Alu short interspersed element. New Alu insertions occur in human populations, and have been responsible for several instances of genetic disease. In this study, we sought to determine if there are instances of polymorphic Alu insertion variants that function in a common variant, common disease paradigm. We cataloged 809 polymorphic Alu elements mapping to 1,159 loci implicated in disease risk by genome-wide association study (GWAS) (P < 10⁻⁸). We found that Alu insertion variants occur disproportionately at GWAS loci (P = 0.013). Moreover, we identified 44 of these Alu elements in linkage disequilibrium (r² > 0.7) with the trait-associated SNP. This figure represents a >20-fold increase in the number of polymorphic Alu elements associated with human phenotypes. This work provides a broader perspective on how structural variants in repetitive DNAs may contribute to human disease.
|
23
|
Kryatova MS, Steranka JP, Burns KH, Payer LM. Insertion and deletion polymorphisms of the ancient AluS family in the human genome. Mob DNA 2017; 8:6. [PMID: 28450901 PMCID: PMC5402677 DOI: 10.1186/s13100-017-0089-9] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 01/18/2017] [Accepted: 04/04/2017] [Indexed: 01/09/2023] Open
Abstract
Background Polymorphic Alu elements account for 17% of structural variants in the human genome. The majority of these belong to the youngest AluY subfamilies, and most structural variant discovery efforts have focused on identifying Alu polymorphisms from these currently retrotranspositionally active subfamilies. In this report we analyze polymorphisms from the evolutionarily older AluS subfamily, whose peak activity was tens of millions of years ago. We annotate the AluS polymorphisms, assess their likely mechanism of origin, and evaluate their contribution to structural variation in the human genome. Results Of 52 previously reported polymorphic AluS elements ascertained for this study, 48 were confirmed to belong to the AluS subfamily using high stringency subfamily classification criteria. Of these, the majority (77%, 37/48) appear to be deletion polymorphisms. Two polymorphic AluS elements (4%) have features of non-classical Alu insertions and one polymorphic AluS element (2%) likely inserted by a mechanism involving internal priming. Seven AluS polymorphisms (15%) appear to have arisen by the classical target-primed reverse transcription (TPRT) retrotransposition mechanism. These seven TPRT products are 3′ intact with 3′ poly-A tails, and are flanked by target site duplications; L1 ORF2p endonuclease cleavage sites were also observed, providing additional evidence that these are L1 ORF2p endonuclease-mediated TPRT insertions. Further sequence analysis showed strong conservation of both the RNA polymerase III promoter and SRP9/14 binding sites, important for mediating transcription and interaction with retrotransposition machinery, respectively. This conservation of functional features implies that some of these are fairly recent insertions since they have not diverged significantly from their respective retrotranspositionally competent source elements. 
Conclusions Of the polymorphic AluS elements evaluated in this report, 15% (7/48) have features consistent with TPRT-mediated insertion, thus suggesting that some AluS elements have been more active recently than previously thought, or that fixation of AluS insertion alleles remains incomplete. These data expand the potential significance of polymorphic AluS elements in contributing to structural variation in the human genome. Future discovery efforts focusing on polymorphic AluS elements are likely to identify more such polymorphisms, and approaches tailored to identify deletion alleles may be warranted.
Affiliation(s)
- Maria S Kryatova
- Department of Pathology, Johns Hopkins University School of Medicine, Miller Research Building (MRB) Room 447, 733 North Broadway, Baltimore, MD 21205 USA.,McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Miller Research Building (MRB) Room 447, 733 North Broadway, Baltimore, MD 21205 USA
| | - Jared P Steranka
- Department of Pathology, Johns Hopkins University School of Medicine, Miller Research Building (MRB) Room 447, 733 North Broadway, Baltimore, MD 21205 USA.,McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Miller Research Building (MRB) Room 447, 733 North Broadway, Baltimore, MD 21205 USA
| | - Kathleen H Burns
- Department of Pathology, Johns Hopkins University School of Medicine, Miller Research Building (MRB) Room 447, 733 North Broadway, Baltimore, MD 21205 USA.,McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Miller Research Building (MRB) Room 447, 733 North Broadway, Baltimore, MD 21205 USA
| | - Lindsay M Payer
- Department of Pathology, Johns Hopkins University School of Medicine, Miller Research Building (MRB) Room 447, 733 North Broadway, Baltimore, MD 21205 USA
| |
|
24
|
Human transposon insertion profiling: Analysis, visualization and identification of somatic LINE-1 insertions in ovarian cancer. Proc Natl Acad Sci U S A 2017; 114:E733-E740. [PMID: 28096347 DOI: 10.1073/pnas.1619797114] [Citation(s) in RCA: 64] [Impact Index Per Article: 9.1] [Indexed: 11/18/2022] Open
Abstract
Mammalian genomes are replete with interspersed repeats reflecting the activity of transposable elements. These mobile DNAs are self-propagating, and their continued transposition is a source of both heritable structural variation as well as somatic mutation in human genomes. Tailored approaches to map these sequences are useful to identify insertion alleles. Here, we describe in detail a strategy to amplify and sequence long interspersed element-1 (LINE-1, L1) retrotransposon insertions selectively in the human genome, transposon insertion profiling by next-generation sequencing (TIPseq). We also report the development of a machine-learning-based computational pipeline, TIPseqHunter, to identify insertion sites with high precision and reliability. We demonstrate the utility of this approach to detect somatic retrotransposition events in high-grade ovarian serous carcinoma.
|
25
|
Rishishwar L, Wang L, Clayton EA, Mariño-Ramírez L, McDonald JF, Jordan IK. Population and clinical genetics of human transposable elements in the (post) genomic era. Mob Genet Elements 2017; 7:1-20. [PMID: 28228978 PMCID: PMC5305044 DOI: 10.1080/2159256x.2017.1280116] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 10/18/2016] [Revised: 01/03/2017] [Accepted: 01/04/2017] [Indexed: 10/26/2022] Open
Abstract
Recent technological developments-in genomics, bioinformatics and high-throughput experimental techniques-are providing opportunities to study ongoing human transposable element (TE) activity at an unprecedented level of detail. It is now possible to characterize genome-wide collections of TE insertion sites for multiple human individuals, within and between populations, and for a variety of tissue types. Comparison of TE insertion site profiles between individuals captures the germline activity of TEs and reveals insertion site variants that segregate as polymorphisms among human populations, whereas comparison among tissue types ascertains somatic TE activity that generates cellular heterogeneity. In this review, we provide an overview of these new technologies and explore their implications for population and clinical genetic studies of human TEs. We cover both recent published results on human TE insertion activity as well as the prospects for future TE studies related to human evolution and health.
Affiliation(s)
- Lavanya Rishishwar
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA; PanAmerican Bioinformatics Institute, Cali, Colombia; Applied Bioinformatics Laboratory, Atlanta, GA, USA
| | - Lu Wang
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA; PanAmerican Bioinformatics Institute, Cali, Colombia
| | - Evan A Clayton
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA; Ovarian Cancer Institute, Atlanta, GA, USA
| | - Leonardo Mariño-Ramírez
- PanAmerican Bioinformatics Institute, Cali, Colombia; National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - John F McDonald
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA; Ovarian Cancer Institute, Atlanta, GA, USA
| | - I King Jordan
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA; PanAmerican Bioinformatics Institute, Cali, Colombia; Applied Bioinformatics Laboratory, Atlanta, GA, USA
|
26
|
Muñoz-Lopez M, Vilar-Astasio R, Tristan-Ramos P, Lopez-Ruiz C, Garcia-Pérez JL. Study of Transposable Elements and Their Genomic Impact. Methods Mol Biol 2016; 1400:1-19. [PMID: 26895043 DOI: 10.1007/978-1-4939-3372-3_1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Indexed: 12/18/2022]
Abstract
Transposable elements (TEs) have been considered traditionally as junk DNA, i.e., DNA sequences that despite representing a high proportion of genomes had no evident cellular functions. However, over the last decades, it has become undeniable that not only TE-derived DNA sequences have (and had) a fundamental role during genome evolution, but also TEs have important implications in the origin and evolution of many genomic disorders. This concise review provides a brief overview of the different types of TEs that can be found in genomes, as well as a list of techniques and methods used to study their impact and mobilization. Some of these techniques will be covered in detail in this Method Book.
Affiliation(s)
- Martin Muñoz-Lopez
- Department of Human DNA Variability, Pfizer/University of Granada and Andalusian Regional Government Center for Genomics and Oncology (GENYO), Avda Ilustracion 114, PTS Granada, 18016, Granada, Spain.
| | - Raquel Vilar-Astasio
- Department of Human DNA Variability, Pfizer/University of Granada and Andalusian Regional Government Center for Genomics and Oncology (GENYO), Avda Ilustracion 114, PTS Granada, 18016, Granada, Spain
| | - Pablo Tristan-Ramos
- Department of Human DNA Variability, Pfizer/University of Granada and Andalusian Regional Government Center for Genomics and Oncology (GENYO), Avda Ilustracion 114, PTS Granada, 18016, Granada, Spain
| | - Cesar Lopez-Ruiz
- Department of Human DNA Variability, Pfizer/University of Granada and Andalusian Regional Government Center for Genomics and Oncology (GENYO), Avda Ilustracion 114, PTS Granada, 18016, Granada, Spain
| | - Jose L Garcia-Pérez
- Genyo (Center for Genomics and Oncological Research), Pfizer/Universidad de Granada/Junta de Andalucia, PTS Granada, Spain; Institute of Genetics and Molecular Medicine (IGMM), University of Edinburgh, Edinburgh, UK
|
27
|
Tan S, Cardoso-Moreira M, Shi W, Zhang D, Huang J, Mao Y, Jia H, Zhang Y, Chen C, Shao Y, Leng L, Liu Z, Huang X, Long M, Zhang YE. LTR-mediated retroposition as a mechanism of RNA-based duplication in metazoans. Genome Res 2016; 26:1663-1675. [PMID: 27934698 PMCID: PMC5131818 DOI: 10.1101/gr.204925.116] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Received: 01/29/2016] [Accepted: 10/18/2016] [Indexed: 01/09/2023]
Abstract
In a broad range of taxa, genes can duplicate through an RNA intermediate in a process mediated by retrotransposons (retroposition). In mammals, L1 retrotransposons drive retroposition, but the elements responsible for retroposition in other animals have yet to be identified. Here, we examined young retrocopies from various animals that still retain the sequence features indicative of the underlying retroposition mechanism. In Drosophila melanogaster, we identified and de novo assembled 15 polymorphic retrocopies and found that all retroposed loci are chimeras of internal retrocopies flanked by discontinuous LTR retrotransposons. At the fusion points between the mRNAs and the LTR retrotransposons, we identified shared short similar sequences that suggest the involvement of microsimilarity-dependent template switches. By expanding our approach to mosquito, zebrafish, chicken, and mammals, we identified in all these species recently originated retrocopies with a similar chimeric structure and shared microsimilarities at the fusion points. We also identified several retrocopies that combine the sequences of two or more parental genes, demonstrating LTR-retroposition as a novel mechanism of exon shuffling. Finally, we found that LTR-mediated retrocopies are immediately cotranscribed with their flanking LTR retrotransposons. Transcriptional profiling coupled with sequence analyses revealed that the sense-strand transcription of the retrocopies often leads to the origination of in-frame proteins relative to the parental genes. Overall, our data show that LTR-mediated retroposition is highly conserved across a wide range of animal taxa; combined with previous work from plants and yeast, it represents an ancient and ongoing mechanism continuously shaping gene content evolution in eukaryotes.
Affiliation(s)
- Shengjun Tan
- Key Laboratory of Zoological Systematics and Evolution and State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
| | | | - Wenwen Shi
- Key Laboratory of Zoological Systematics and Evolution and State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
| | - Dan Zhang
- Key Laboratory of Zoological Systematics and Evolution and State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Jiawei Huang
- Key Laboratory of Zoological Systematics and Evolution and State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yanan Mao
- Key Laboratory of Zoological Systematics and Evolution and State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
| | - Hangxing Jia
- Key Laboratory of Zoological Systematics and Evolution and State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yaqiong Zhang
- Key Laboratory of Zoological Systematics and Evolution and State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
| | - Chunyan Chen
- Key Laboratory of Zoological Systematics and Evolution and State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yi Shao
- Key Laboratory of Zoological Systematics and Evolution and State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
| | - Liang Leng
- Key Laboratory of Zoological Systematics and Evolution and State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
| | - Zhonghua Liu
- State Key Laboratory of Molecular Developmental Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
| | - Xun Huang
- State Key Laboratory of Molecular Developmental Biology, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing 100101, China
| | - Manyuan Long
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois 60637, USA
| | - Yong E Zhang
- Key Laboratory of Zoological Systematics and Evolution and State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing 100101, China
- University of Chinese Academy of Sciences, Beijing 100049, China
|
28
|
Zampella JG, Rodić N, Yang WR, Huang CRL, Welch J, Gnanakkan VP, Cornish TC, Boeke JD, Burns KH. A map of mobile DNA insertions in the NCI-60 human cancer cell panel. Mob DNA 2016; 7:20. [PMID: 27807467 PMCID: PMC5087121 DOI: 10.1186/s13100-016-0078-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 04/20/2016] [Accepted: 10/21/2016] [Indexed: 11/13/2022] Open
Abstract
Background: The National Cancer Institute-60 (NCI-60) cell lines are among the most widely used models of human cancer. They provide a platform to integrate DNA sequence information, epigenetic data, RNA and protein expression, and pharmacologic susceptibilities in studies of cancer cell biology. Genome-wide studies of the complete panel have included exome sequencing, karyotyping, and copy number analyses but have not targeted repetitive sequences. Interspersed repeats derived from mobile DNAs are a significant source of heritable genetic variation, and insertions of active elements can occur somatically in malignancy. Method: We used Transposon Insertion Profiling by microarray (TIP-chip) to map Long INterspersed Element-1 (LINE-1, L1) and Alu Short INterspersed Element (SINE) insertions in cancer genes in NCI-60 cells. We focused this discovery effort on annotated Cancer Gene Index loci. Results: We catalogued a total of 749 and 2,100 loci corresponding to candidate LINE-1 and Alu insertion sites, respectively. As expected, these numbers encompass previously known insertions, polymorphisms shared in unrelated tumor cell lines, as well as unique, potentially tumor-specific insertions. We also conducted association analyses relating individual insertions to a variety of cellular phenotypes. Conclusions: These data provide a resource for investigators with interests in specific cancer gene loci or mobile element insertion effects more broadly. Our data underscore that significant genetic variation in cancer genomes is owed to LINE-1 and Alu retrotransposons. Our findings also indicate that as large numbers of cancer genomes become available, it will be possible to associate individual transposable element insertion variants with molecular and phenotypic features of these malignancies. Electronic supplementary material: The online version of this article (doi:10.1186/s13100-016-0078-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- John G Zampella
- Department of Dermatology, Johns Hopkins University School of Medicine, 733 North Broadway, Miller Research Building Room 469, Baltimore, MD 21205 USA
| | - Nemanja Rodić
- Department of Pathology, Johns Hopkins University School of Medicine, 733 North Broadway, Miller Research Building Room 469, Baltimore, MD 21205 USA
| | - Wan Rou Yang
- Department of Pathology, Johns Hopkins University School of Medicine, 733 North Broadway, Miller Research Building Room 469, Baltimore, MD 21205 USA
| | - Cheng Ran Lisa Huang
- McKusick-Nathans Institute of Genetic Medicine, 733 North Broadway, Miller Research Building Room 469, Baltimore, MD 21205 USA
| | - Jane Welch
- McKusick-Nathans Institute of Genetic Medicine, 733 North Broadway, Miller Research Building Room 469, Baltimore, MD 21205 USA
| | - Veena P Gnanakkan
- McKusick-Nathans Institute of Genetic Medicine, 733 North Broadway, Miller Research Building Room 469, Baltimore, MD 21205 USA
| | - Toby C Cornish
- Department of Pathology, Johns Hopkins University School of Medicine, 733 North Broadway, Miller Research Building Room 469, Baltimore, MD 21205 USA
| | - Jef D Boeke
- McKusick-Nathans Institute of Genetic Medicine, 733 North Broadway, Miller Research Building Room 469, Baltimore, MD 21205 USA ; High Throughput (HiT) Biology Center, 733 North Broadway, Miller Research Building Room 469, Baltimore, MD 21205 USA ; Present address: Institute for Systems Genetics, NYU Langone University School of Medicine, New York, NY 10016 USA
| | - Kathleen H Burns
- Department of Pathology, Johns Hopkins University School of Medicine, 733 North Broadway, Miller Research Building Room 469, Baltimore, MD 21205 USA ; McKusick-Nathans Institute of Genetic Medicine, 733 North Broadway, Miller Research Building Room 469, Baltimore, MD 21205 USA ; High Throughput (HiT) Biology Center, 733 North Broadway, Miller Research Building Room 469, Baltimore, MD 21205 USA ; The Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 733 North Broadway, Miller Research Building Room 469, Baltimore, MD 21205 USA
|
29
|
Erwin JA, Paquola ACM, Singer T, Gallina I, Novotny M, Quayle C, Bedrosian TA, Alves FIA, Butcher CR, Herdy JR, Sarkar A, Lasken RS, Muotri AR, Gage FH. L1-associated genomic regions are deleted in somatic cells of the healthy human brain. Nat Neurosci 2016; 19:1583-1591. [PMID: 27618310 DOI: 10.1038/nn.4388] [Citation(s) in RCA: 125] [Impact Index Per Article: 15.6] [Received: 11/02/2015] [Accepted: 08/09/2016] [Indexed: 02/08/2023]
Abstract
The healthy human brain is a mosaic of varied genomes. Long interspersed element-1 (LINE-1 or L1) retrotransposition is known to create mosaicism by inserting L1 sequences into new locations of somatic cell genomes. Using a machine learning-based, single-cell sequencing approach, we discovered that somatic L1-associated variants (SLAVs) are composed of two classes: L1 retrotransposition insertions and retrotransposition-independent L1-associated variants. We demonstrate that a subset of SLAVs comprises somatic deletions generated by L1 endonuclease cutting activity. Retrotransposition-independent rearrangements in inherited L1s resulted in the deletion of proximal genomic regions. These rearrangements were resolved by microhomology-mediated repair, which suggests that L1-associated genomic regions are hotspots for somatic copy number variants in the brain and therefore a heritable genetic contributor to somatic mosaicism. We demonstrate that SLAVs are present in crucial neural genes, such as DLG2 (also called PSD93), and affect 44-63% of cells in the healthy brain.
Affiliation(s)
- Jennifer A Erwin
- The Salk Institute for Biological Studies, La Jolla, California, USA
| | - Apuã C M Paquola
- The Salk Institute for Biological Studies, La Jolla, California, USA; Department of Cellular & Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, California, USA
| | - Tatjana Singer
- The Salk Institute for Biological Studies, La Jolla, California, USA
| | - Iryna Gallina
- The Salk Institute for Biological Studies, La Jolla, California, USA
| | - Mark Novotny
- J. Craig Venter Institute, La Jolla, California, USA
| | - Carolina Quayle
- The Salk Institute for Biological Studies, La Jolla, California, USA
| | - Tracy A Bedrosian
- The Salk Institute for Biological Studies, La Jolla, California, USA
| | - Francisco I A Alves
- University of São Paulo, Departamento de Microbiologia, Instituto de Ciências Biomédicas, São Paulo, Brazil
| | | | - Joseph R Herdy
- The Salk Institute for Biological Studies, La Jolla, California, USA
| | - Anindita Sarkar
- The Salk Institute for Biological Studies, La Jolla, California, USA
| | | | - Alysson R Muotri
- Department of Cellular & Molecular Medicine, University of California San Diego, School of Medicine, La Jolla, California, USA; Department of Pediatrics, Rady Children's Hospital, San Diego, California, USA
| | - Fred H Gage
- The Salk Institute for Biological Studies, La Jolla, California, USA
|
30
|
Ha H, Loh JW, Xing J. Identification of polymorphic SVA retrotransposons using a mobile element scanning method for SVA (ME-Scan-SVA). Mob DNA 2016; 7:15. [PMID: 27478512 PMCID: PMC4967303 DOI: 10.1186/s13100-016-0072-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 04/09/2016] [Accepted: 07/21/2016] [Indexed: 12/28/2022] Open
Abstract
Background: Mobile element insertions are a major source of human genomic variation. SVA (SINE-R/VNTR/Alu) is the youngest retrotransposon family in the human genome and a number of diseases are known to be caused by SVA insertions. However, inter-individual genomic variations generated by SVA insertions and their impacts have not been studied extensively due to the difficulty in identifying polymorphic SVA insertions. Results: To systematically identify SVA insertions at the population level and assess their genomic impact, we developed a mobile element scanning (ME-Scan) protocol we called ME-Scan-SVA. Using a nested SVA-specific PCR enrichment method, ME-Scan-SVA selectively amplifies the 5′ end of SVA elements and their flanking genomic regions. To demonstrate the utility of the protocol, we constructed and sequenced a ME-Scan-SVA library of 21 individuals and analyzed the data using a new analysis pipeline designed for the protocol. Overall, the method achieved high SVA-specificity and >90% of the sequenced reads are from SVA insertions. The method also had high sensitivity (>90%) for fixed SVA insertions that contain the SVA-specific primer-binding sites in the reference genome. Using candidate locus selection criteria that are expected to have a 90% sensitivity, we identified 151 and 29 novel polymorphic SVA candidates under relaxed and stringent cutoffs, respectively (average 12 and 2 per individual). For six polymorphic SVAs that we were able to validate by PCR, the average individual genotype accuracy is 92%, demonstrating a high accuracy of the computational genotype calling pipeline. Conclusions: The new approach allows identifying novel SVA insertions using high-throughput sequencing. It is cost-effective and can be applied to large-scale population studies. It can also be applied to detecting potentially active SVA elements and somatic SVA retrotransposition events in different tissues or developmental stages.
Electronic supplementary material: The online version of this article (doi:10.1186/s13100-016-0072-x) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Hongseok Ha
- Department of Genetics, The State University of New Jersey, Piscataway, 08854 NJ USA ; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, 08854 NJ USA
| | - Jui Wan Loh
- Department of Genetics, The State University of New Jersey, Piscataway, 08854 NJ USA
| | - Jinchuan Xing
- Department of Genetics, The State University of New Jersey, Piscataway, 08854 NJ USA ; Human Genetic Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, 08854 NJ USA
|
31
|
Nazaryan-Petersen L, Bertelsen B, Bak M, Jønson L, Tommerup N, Hancks DC, Tümer Z. Germline Chromothripsis Driven by L1-Mediated Retrotransposition and Alu/Alu Homologous Recombination. Hum Mutat 2016; 37:385-95. [PMID: 26929209 DOI: 10.1002/humu.22953] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 11/10/2015] [Accepted: 01/03/2016] [Indexed: 12/20/2022]
Abstract
Chromothripsis (CTH) is a phenomenon where multiple localized double-stranded DNA breaks result in complex genomic rearrangements. Although the DNA-repair mechanisms involved in CTH have been described, the mechanisms driving the localized "shattering" process remain unclear. High-throughput sequence analysis of a familial germline CTH revealed an inserted SVAE retrotransposon associated with a 110-kb deletion displaying hallmarks of L1-mediated retrotransposition. Our analysis suggests that the SVAE insertion did not occur prior to or after, but concurrent with the CTH event. We also observed L1-endonuclease potential target sites in other breakpoints. In addition, we found four Alu elements flanking the 110-kb deletion and associated with an inversion. We suggest that chromatin looping mediated by homologous Alu elements may have brought distal DNA regions into close proximity facilitating DNA cleavage by catalytically active L1-endonuclease. Our data provide the first evidence that active and inactive human retrotransposons can serve as endogenous mutagens driving CTH in the germline.
Affiliation(s)
- Lusine Nazaryan-Petersen
- Applied Human Molecular Genetics, Kennedy Center, Department of Clinical Genetics, Copenhagen University Hospital, Rigshospitalet, Glostrup, 2600, Denmark; Department of Cellular and Molecular Medicine (ICMM), Faculty of Health Science, University of Copenhagen, Copenhagen, N. 2200, Denmark
| | - Birgitte Bertelsen
- Applied Human Molecular Genetics, Kennedy Center, Department of Clinical Genetics, Copenhagen University Hospital, Rigshospitalet, Glostrup, 2600, Denmark
| | - Mads Bak
- Department of Cellular and Molecular Medicine, Faculty of Health Science, University of Copenhagen, Copenhagen, N. 2200, Denmark
| | - Lars Jønson
- Center for Genomic Medicine, Copenhagen University Hospital, Rigshospitalet, Copenhagen, O. 2100, Denmark
| | - Niels Tommerup
- Department of Cellular and Molecular Medicine, Faculty of Health Science, University of Copenhagen, Copenhagen, N. 2200, Denmark
| | - Dustin C Hancks
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah, 84112
| | - Zeynep Tümer
- Applied Human Molecular Genetics, Kennedy Center, Department of Clinical Genetics, Copenhagen University Hospital, Rigshospitalet, Glostrup, 2600, Denmark
|
32
|
Abstract
Mammalian genomes harbor autonomous retrotransposons coding for the proteins required for their own mobilization, and nonautonomous retrotransposons, such as the human SVA element, which are transcribed but do not have any coding capacity. Mobilization of nonautonomous retrotransposons depends on the recruitment of the protein machinery encoded by autonomous retrotransposons. Here, we summarize the experimental details of SVA trans-mobilization assays which address multiple questions regarding the biology of both nonautonomous SVA elements and autonomous LINE-1 (L1) retrotransposons. The assay evaluates if and to what extent a noncoding SVA element is mobilized in trans by the L1-encoded protein machinery, the structural organization of the resulting marked de novo insertions, if they mimic endogenous SVA insertions and what the roles of individual domains of the nonautonomous retrotransposon for SVA mobilization are. Furthermore, the highly sensitive trans-mobilization assay can be used to verify the presence of otherwise barely detectable endogenously expressed functional L1 proteins via their marked SVA trans-mobilizing activity.
Affiliation(s)
- Anja Bock
- Division of Medical Biotechnology, Paul-Ehrlich-Institut, Paul-Ehrlich-Strasse 51-59, 63225, Langen, Germany
| | - Gerald G Schumann
- Division of Medical Biotechnology, Paul-Ehrlich-Institut, Paul-Ehrlich-Strasse 51-59, 63225, Langen, Germany.
|
33
|
Wildschutte JH, Baron A, Diroff NM, Kidd JM. Discovery and characterization of Alu repeat sequences via precise local read assembly. Nucleic Acids Res 2015; 43:10292-307. [PMID: 26503250 PMCID: PMC4666360 DOI: 10.1093/nar/gkv1089] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 05/18/2015] [Accepted: 10/08/2015] [Indexed: 12/03/2022] Open
Abstract
Alu insertions have contributed to >11% of the human genome and ∼30–35 Alu subfamilies remain actively mobile, yet the characterization of polymorphic Alu insertions from short-read data remains a challenge. We build on existing computational methods to combine Alu detection and de novo assembly of WGS data as a means to reconstruct the full sequence of insertion events from Illumina paired end reads. Comparison with published calls obtained using PacBio long-reads indicates a false discovery rate below 5%, at the cost of reduced sensitivity due to the colocation of reference and non-reference repeats. We generate a highly accurate call set of 1614 completely assembled Alu variants from 53 samples from the Human Genome Diversity Project (HGDP) panel. We utilize the reconstructed alternative insertion haplotypes to genotype 1010 fully assembled insertions, obtaining >99% agreement with genotypes obtained by PCR. In our assembled sequences, we find evidence of premature insertion mechanisms and observe 5′ truncation in 16% of AluYa5 and AluYb8 insertions. The sites of truncation coincide with stem-loop structures and SRP9/14 binding sites in the Alu RNA, implicating L1 ORF2p pausing in the generation of 5′ truncations. Additionally, we identified variable AluJ and AluS elements that likely arose due to non-retrotransposition mechanisms.
Affiliation(s)
- Julia H Wildschutte
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
| | - Alayna Baron
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
| | - Nicolette M Diroff
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
| | - Jeffrey M Kidd
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, MI 48109, USA; Department of Computational Medicine and Bioinformatics, University of Michigan Medical School, Ann Arbor, MI 48109, USA
|
34
|
Reinert K, Langmead B, Weese D, Evers DJ. Alignment of Next-Generation Sequencing Reads. Annu Rev Genomics Hum Genet 2015; 16:133-51. [DOI: 10.1146/annurev-genom-090413-025358] [Citation(s) in RCA: 82] [Impact Index Per Article: 9.1] [Indexed: 11/09/2022]
Affiliation(s)
- Knut Reinert
- Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
| | - Ben Langmead
- Department of Computer Science and Center for Computational Biology, Johns Hopkins University, Baltimore, Maryland 21218
| | - David Weese
- Department of Mathematics and Computer Science, Freie Universität Berlin, 14195 Berlin, Germany
|
35
|
Miousse IR, Chalbot MCG, Lumen A, Ferguson A, Kavouras IG, Koturbash I. Response of transposable elements to environmental stressors. Mutation Research. Reviews in Mutation Research 2015; 765:19-39. [PMID: 26281766 PMCID: PMC4544780 DOI: 10.1016/j.mrrev.2015.05.003] [Citation(s) in RCA: 93] [Impact Index Per Article: 10.3] [Received: 02/26/2015] [Revised: 05/27/2015] [Accepted: 05/28/2015] [Indexed: 12/21/2022]
Abstract
Transposable elements (TEs) comprise a group of repetitive sequences that bring positive, negative, as well as neutral effects to the host organism. Earlier considered as "junk DNA," TEs are now well-accepted driving forces of evolution and critical regulators of the expression of genetic information. Their activity is regulated by epigenetic mechanisms, including methylation of DNA and histone modifications. The loss of epigenetic control over TEs, exhibited as loss of DNA methylation and decondensation of the chromatin structure, may result in TE reactivation and initiation of insertional mutagenesis (retrotransposition), and has been reported in numerous human diseases, including cancer. Accumulating evidence suggests that these alterations are not the simple consequences of the disease, but often may drive the pathogenesis, as they can be detected early during disease development. Knowledge derived from in vitro, in vivo, and epidemiological studies clearly demonstrates that exposure to ubiquitous environmental stressors, many of which are carcinogens or suspected carcinogens, is capable of causing alterations in methylation and expression of TEs and of initiating retrotransposition events. Evidence summarized in this review suggests that TEs are sensitive endpoints for detection of effects caused by such environmental stressors as ionizing radiation (terrestrial, space, and UV-radiation), air pollution (including particulate matter [PM]-derived and gaseous), persistent organic pollutants, and metals. Furthermore, the significance of these effects is characterized by their early appearance, persistence and presence in both target organs and peripheral blood. Altogether, these findings suggest that TEs may potentially be introduced into safety and risk assessment and serve as biomarkers of exposure to environmental stressors.
Furthermore, TEs also show significant potential to become invaluable surrogate biomarkers in clinic and possible targets for therapeutic modalities for disease treatment and prevention.
Affiliation(s)
- Isabelle R Miousse
- Department of Environmental and Occupational Health, Fay W. Boozman College of Public Health, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA.
- Marie-Cecile G Chalbot
- Department of Environmental and Occupational Health, Fay W. Boozman College of Public Health, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA.
- Annie Lumen
- Division of Biochemical Toxicology, National Center for Toxicological Research, US Food and Drug Administration, Jefferson, AR 72079, USA.
- Alesia Ferguson
- Department of Environmental and Occupational Health, Fay W. Boozman College of Public Health, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA.
- Ilias G Kavouras
- Department of Environmental and Occupational Health, Fay W. Boozman College of Public Health, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA.
- Igor Koturbash
- Department of Environmental and Occupational Health, Fay W. Boozman College of Public Health, University of Arkansas for Medical Sciences, Little Rock, AR 72205, USA.
|
36
|
Platt RN, Zhang Y, Witherspoon DJ, Xing J, Suh A, Keith MS, Jorde LB, Stevens RD, Ray DA. Targeted Capture of Phylogenetically Informative Ves SINE Insertions in Genus Myotis. Genome Biol Evol 2015; 7:1664-75. [PMID: 26014613 PMCID: PMC4494050 DOI: 10.1093/gbe/evv099] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 01/05/2023] Open
Abstract
Identification of retrotransposon insertions in nonmodel taxa can be technically challenging and costly. This has inhibited progress in understanding retrotransposon insertion dynamics outside of a few well-studied species. To address this problem, we have extended a retrotransposon-based capture and sequence method (ME-Scan [mobile element scanning]) to identify insertions belonging to the Ves family of short interspersed elements (SINEs) across seven species of the bat genus Myotis. We identified between 120,000 and 143,000 SINE insertions in six taxa lacking a draft genome by comparing to the M. lucifugus reference genome. On average, each Ves insertion was sequenced to 129.6× coverage. When mapped back to the M. lucifugus reference genome, all insertions were confidently assigned within a 10-bp window. Polymorphic Ves insertions were identified in each taxon based on their mapped locations. Using cross-species comparisons and the identified insertion positions, a presence–absence matrix was created for approximately 796,000 insertions. Dollo parsimony analysis of more than 85,000 phylogenetically informative insertions recovered strongly supported, monophyletic clades that correspond with the biogeography of each taxon. This phylogeny is similar to previously published mitochondrial phylogenies, with the exception of the placement of M. vivesi. These results support the utility of our variation on ME-Scan to identify polymorphic retrotransposon insertions in taxa without a reference genome and for large-scale retrotransposon-based phylogenetics.
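The presence–absence matrix described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the taxon labels, coordinates, the fixed 10-bp binning, and the rule used here for an "informative" insertion (present in at least two taxa but not in all) are assumptions made only for illustration.

```python
from typing import Dict, List, Set, Tuple

Site = Tuple[str, int]  # (scaffold name, mapped position); hypothetical representation


def bin_position(pos: int, window: int = 10) -> int:
    """Collapse a mapped coordinate into a fixed-width window index."""
    return pos // window


def presence_absence(calls: Dict[str, Set[Site]], window: int = 10):
    """Return (sorted loci, matrix); matrix[taxon][i] is 1 if the taxon carries locus i."""
    binned = {
        taxon: {(chrom, bin_position(pos, window)) for chrom, pos in sites}
        for taxon, sites in calls.items()
    }
    loci = sorted(set().union(*binned.values()))
    matrix = {
        taxon: [1 if locus in sites else 0 for locus in loci]
        for taxon, sites in binned.items()
    }
    return loci, matrix


def informative_columns(matrix: Dict[str, List[int]]) -> List[int]:
    """Indices of loci present in at least two taxa but not in every taxon (assumed rule)."""
    rows = list(matrix.values())
    n_taxa = len(rows)
    n_loci = len(rows[0]) if rows else 0
    keep = []
    for i in range(n_loci):
        present = sum(row[i] for row in rows)
        if 2 <= present < n_taxa:
            keep.append(i)
    return keep


# Hypothetical insertion calls for three placeholder taxa.
calls = {
    "taxon_A": {("scaf1", 103), ("scaf1", 512), ("scaf2", 40)},
    "taxon_B": {("scaf1", 107), ("scaf2", 44)},
    "taxon_C": {("scaf1", 101), ("scaf1", 512), ("scaf2", 47), ("scaf3", 900)},
}
loci, matrix = presence_absence(calls)
shared = informative_columns(matrix)
```

Fixed-width binning is the simplest way to merge nearby calls, but two positions that straddle a bin boundary end up in different loci; a real analysis would more likely cluster positions by distance.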
Affiliation(s)
- Roy N Platt
- Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University; Department of Biological Sciences, Texas Tech University
- Yuhua Zhang
- Bionomics Research & Technology Center, Environmental and Occupational Health Science Institute, Rutgers, The State University of New Jersey
- Jinchuan Xing
- Department of Genetics, Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey
- Alexander Suh
- Department of Evolutionary Biology, Uppsala University, Sweden
- Megan S Keith
- Department of Biological Sciences, Texas Tech University
- Lynn B Jorde
- Department of Human Genetics, University of Utah Health Sciences Center
- Richard D Stevens
- Department of Natural Resources Management and the Museum of Texas Tech University
- David A Ray
- Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University; Department of Biological Sciences, Texas Tech University
|
37
|
Ray DA, Pagan HJ, Platt RN, Kroll AR, Schaack S, Stevens RD. Differential SINE evolution in vesper and non-vesper bats. Mob DNA 2015; 6:10. [PMID: 25991928 PMCID: PMC4436864 DOI: 10.1186/s13100-015-0038-4] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 03/05/2015] [Accepted: 04/15/2015] [Indexed: 12/31/2022] Open
Abstract
Background: Short interspersed elements (SINEs) have a powerful influence on genome evolution and can be useful markers for phylogenetic inference and population genetic analyses. In this study, we examined survey sequence and whole genome data to determine the evolutionary dynamics of Ves SINEs in the genomes of 11 bats, nine from Vespertilionidae. Results: We identified 41 subfamilies of Ves and linked several to specific lineages. We also revealed substantial differences among lineages, including the observation that Ves accumulation and Ves subfamily diversity are significantly higher in vesper as opposed to non-vesper bats. This is especially interesting when one considers the increased transposable element diversity of vesper bats in general. Conclusions: Our data suggest that survey sequencing and genome mining are valuable tools to investigate SINE evolution among related lineages and can provide substantial information about the ability of SINEs to proliferate in diverse genomes. This method would also be a useful first step in determining which subfamilies would be the best to target when developing SINEs as markers for phylogenetic and population genetic analyses.
Affiliation(s)
- David A Ray
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409 USA
- Heidi JT Pagan
- Harbor Branch Oceanographic Institute, Florida Atlantic University, Fort Pierce, FL USA
- Roy N Platt
- Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409 USA
- Ashley R Kroll
- Department of Biology, Reed College, Portland, OR 97202 USA
- Sarah Schaack
- Department of Biology, Reed College, Portland, OR 97202 USA
- Richard D Stevens
- Department of Natural Resources Management and the Museum, Texas Tech University, Lubbock, TX 79409 USA
|
38
|
Library Construction for High-Throughput Mobile Element Identification and Genotyping. Methods Mol Biol 2015; 1589:1-15. [PMID: 26025622 DOI: 10.1007/7651_2015_265] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 02/03/2023]
Abstract
Mobile genetic elements are discrete DNA elements that can move around and copy themselves in a genome. As a ubiquitous component of the genome, mobile elements contribute to both genetic and epigenetic variation. Therefore, it is important to determine the genome-wide distribution of mobile elements. Here we present a targeted high-throughput sequencing protocol called Mobile Element Scanning (ME-Scan) for genome-wide mobile element detection. We will describe oligonucleotide design, sequencing library construction, and computational analysis for the ME-Scan protocol.
|
39
|
Computational and Statistical Analyses of Insertional Polymorphic Endogenous Retroviruses in a Non-Model Organism. COMPUTATION 2014. [DOI: 10.3390/computation2040221] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 12/29/2022]
|
40
|
Bergman CM. A proposal for the reference-based annotation of de novo transposable element insertions. Mob Genet Elements 2012; 2:51-54. [PMID: 22754753 PMCID: PMC3383450 DOI: 10.4161/mge.19479] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 11/21/2022] Open
Abstract
Understanding the causes and consequences of transposable element (TE) activity in the genomic era requires sophisticated bioinformatics approaches to accurately identify individual insertion sites. Next-generation sequencing technology now makes it possible to rapidly identify new TE insertions using resequencing data, opening up new possibilities to study the nature of TE-induced mutation and the target site preferences of different TE families. While the identification of new TE insertion sites is seemingly a simple task, the mechanisms of transposition present unique challenges for the annotation of de novo transposable element insertions mapped to a reference genome. Here I discuss these challenges and propose a framework for the annotation of de novo TE insertions that accommodates known mechanisms of TE insertion and established coordinate systems for genome annotation.
Affiliation(s)
- Casey M Bergman
- Faculty of Life Sciences; University of Manchester; Manchester, UK
|
41
|
Wu J, Lee WP, Ward A, Walker JA, Konkel MK, Batzer MA, Marth GT. Tangram: a comprehensive toolbox for mobile element insertion detection. BMC Genomics 2014; 15:795. [PMID: 25228379 PMCID: PMC4180832 DOI: 10.1186/1471-2164-15-795] [Citation(s) in RCA: 47] [Impact Index Per Article: 4.7] [Received: 02/28/2014] [Accepted: 09/03/2014] [Indexed: 11/10/2022] Open
Abstract
Background: Mobile elements (MEs) constitute greater than 50% of the human genome as a result of repeated insertion events during human genome evolution. Although most of these elements are now fixed in the population, some MEs, including ALU, L1, SVA and HERV-K elements, are still actively duplicating. Mobile element insertions (MEIs) have been associated with human genetic disorders, including Crohn’s disease, hemophilia, and various types of cancer, motivating the need for accurate MEI detection methods. To comprehensively identify and accurately characterize these variants in whole genome next-generation sequencing (NGS) data, a computationally efficient detection and genotyping method is required. Current computational tools are unable to call MEI polymorphisms with sufficiently high sensitivity and specificity, or call individual genotypes with sufficiently high accuracy. Results: Here we report Tangram, a computationally efficient MEI detection program that integrates read-pair (RP) and split-read (SR) mapping signals to detect MEI events. By utilizing SR mapping in its primary detection module, a feature unique to this software, Tangram is able to pinpoint MEI breakpoints with single-nucleotide precision. To understand the role of MEI events in disease, it is essential to produce accurate individual genotypes in clinical samples. Tangram is able to determine sample genotypes with very high accuracy. Using simulations and experimental datasets, we demonstrate that Tangram has superior sensitivity, specificity, breakpoint resolution and genotyping accuracy, when compared to other, recently developed MEI detection methods. Conclusions: Tangram serves as the primary MEI detection tool in the 1000 Genomes Project, and is implemented as a highly portable, memory-efficient, easy-to-use C++ computer program, built under an open-source development model.
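The general idea of combining read-pair (RP) and split-read (SR) evidence can be sketched in a few lines of Python. This is an illustration only and does not reproduce Tangram's algorithm: the function name, the `slack` and `min_rp` parameters, and the input representation (lists of genomic positions) are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MeiCall:
    chrom: str
    start: int
    end: int
    precise: bool      # True when a split-read breakpoint was found
    rp_support: int    # number of supporting read pairs
    sr_support: int    # number of split reads at the reported breakpoint


def call_breakpoint(
    chrom: str,
    rp_anchor_ends: List[int],     # ends of uniquely mapped mates whose partner hits an ME
    sr_clip_positions: List[int],  # positions where split reads are clipped
    slack: int = 50,               # hypothetical padding around the RP cluster
    min_rp: int = 2,               # hypothetical minimum RP support
) -> Optional[MeiCall]:
    """Use RP evidence to bound a candidate region, then refine it with SR evidence."""
    if len(rp_anchor_ends) < min_rp:
        return None
    lo = min(rp_anchor_ends) - slack
    hi = max(rp_anchor_ends) + slack
    in_window = [p for p in sr_clip_positions if lo <= p <= hi]
    if in_window:
        # The most frequent clip position gives a single-base breakpoint.
        pos, n = Counter(in_window).most_common(1)[0]
        return MeiCall(chrom, pos, pos, True, len(rp_anchor_ends), n)
    # Without split reads, only the imprecise RP-derived interval is reported.
    return MeiCall(chrom, lo, hi, False, len(rp_anchor_ends), 0)


precise_call = call_breakpoint("chr1", [1000, 1040, 1075], [1052, 1052, 1051, 3000])
imprecise_call = call_breakpoint("chr1", [1000, 1040, 1075], [])
```

The sketch shows why split reads matter for resolution: read pairs alone can only bound an interval, whereas clipped reads mark the exact junction.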
Affiliation(s)
- Gabor T Marth
- Department of Human Genetics and USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, Utah, USA.
|
42
|
Mobile DNA elements in the generation of diversity and complexity in the brain. Nat Rev Neurosci 2014; 15:497-506. [PMID: 25005482 DOI: 10.1038/nrn3730] [Citation(s) in RCA: 180] [Impact Index Per Article: 18.0] [Indexed: 02/08/2023]
Abstract
Mobile elements are DNA sequences that can change their position (retrotranspose) within the genome. Although its biological function is largely unappreciated, DNA derived from mobile elements comprises nearly half of the human genome. It has long been thought that neuronal genomes are invariable; however, recent studies have demonstrated that mobile elements actively retrotranspose during neurogenesis, thereby creating genomic diversity between neurons. In addition, mounting data demonstrate that mobile elements are misregulated in certain neurological disorders, including Rett syndrome and schizophrenia.
|
43
|
Helman E, Lawrence MS, Stewart C, Sougnez C, Getz G, Meyerson M. Somatic retrotransposition in human cancer revealed by whole-genome and exome sequencing. Genome Res 2014; 24:1053-63. [PMID: 24823667 PMCID: PMC4079962 DOI: 10.1101/gr.163659.113] [Citation(s) in RCA: 161] [Impact Index Per Article: 16.1] [Received: 07/17/2013] [Accepted: 04/03/2014] [Indexed: 01/27/2023]
Abstract
Retrotransposons constitute a major source of genetic variation, and somatic retrotransposon insertions have been reported in cancer. Here, we applied TranspoSeq, a computational framework that identifies retrotransposon insertions from sequencing data, to whole genomes from 200 tumor/normal pairs across 11 tumor types as part of The Cancer Genome Atlas (TCGA) Pan-Cancer Project. In addition to novel germline polymorphisms, we find 810 somatic retrotransposon insertions primarily in lung squamous, head and neck, colorectal, and endometrial carcinomas. Many somatic retrotransposon insertions occur in known cancer genes. We find that high somatic retrotransposition rates in tumors are associated with high rates of genomic rearrangement and somatic mutation. Finally, we developed TranspoSeq-Exome to interrogate an additional 767 tumor samples with hybrid-capture exome data and discovered 35 novel somatic retrotransposon insertions into exonic regions, including an insertion into an exon of the PTEN tumor suppressor gene. The results of this large-scale, comprehensive analysis of retrotransposon movement across tumor types suggest that somatic retrotransposon insertions may represent an important class of structural variation in cancer.
Affiliation(s)
- Elena Helman
- Harvard-MIT Division of Health Sciences & Technology, Cambridge, Massachusetts 02139, USA
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Chip Stewart
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Carrie Sougnez
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Gad Getz
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Massachusetts General Hospital, Boston, Massachusetts 02114, USA
- Matthew Meyerson
- Harvard-MIT Division of Health Sciences & Technology, Cambridge, Massachusetts 02139, USA
- Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
- Center for Cancer Genome Discovery and Department of Medical Oncology, Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA
- Department of Pathology, Brigham & Women’s Hospital and Harvard Medical School, Boston, Massachusetts 02115, USA
|
44
|
Guffanti G, Gaudi S, Fallon JH, Sobell J, Potkin SG, Pato C, Macciardi F. Transposable elements and psychiatric disorders. Am J Med Genet B Neuropsychiatr Genet 2014; 165B:201-16. [PMID: 24585726 DOI: 10.1002/ajmg.b.32225] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Received: 07/03/2013] [Accepted: 01/21/2014] [Indexed: 12/15/2022]
Abstract
Transposable Elements (TEs) or transposons are low-complexity elements (e.g., LINEs, SINEs, SVAs, and HERVs) that make up to two-thirds of the human genome. There is mounting evidence that TEs play an essential role in genomic architecture and regulation related to both normal function and disease states. Recently, the identification of active TEs in several different human brain regions suggests that TEs play a role in normal brain development and adult physiology and quite possibly in psychiatric disorders. TEs have been implicated in hemophilia, neurofibromatosis, and cancer. With the advent of next-generation whole-genome sequencing approaches, our understanding of the relationship between TEs and psychiatric disorders will greatly improve. We will review the biology of TEs and early evidence for TE involvement in psychiatric disorders.
Affiliation(s)
- Guia Guffanti
- Department of Psychiatry, Columbia University, New York, New York
|
45
|
Abyzov A, Iskow R, Gokcumen O, Radke DW, Balasubramanian S, Pei B, Habegger L, Lee C, Gerstein M. Analysis of variable retroduplications in human populations suggests coupling of retrotransposition to cell division. Genome Res 2013; 23:2042-52. [PMID: 24026178 PMCID: PMC3847774 DOI: 10.1101/gr.154625.113] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.5] [Indexed: 11/24/2022]
Abstract
In primates and other animals, reverse transcription of mRNA followed by genomic integration creates retroduplications. Expressed retroduplications are either “retrogenes” coding for functioning proteins, or expressed “processed pseudogenes,” which can function as noncoding RNAs. To date, little is known about the variation in retroduplications in terms of their presence or absence across individuals in the human population. We have developed new methodologies that allow us to identify “novel” retroduplications (i.e., those not present in the reference genome), to find their insertion points, and to genotype them. Using these methods, we catalogued and analyzed 174 retroduplication variants in almost one thousand humans, which were sequenced as part of Phase 1 of The 1000 Genomes Project Consortium. The accuracy of our data set was corroborated by (1) multiple lines of sequencing evidence for retroduplication (e.g., depth of coverage in exons vs. introns), (2) experimental validation, and (3) the fact that we can reconstruct a correct phylogenetic tree of human subpopulations based solely on retroduplications. We also show that parent genes of retroduplication variants tend to be expressed at the M-to-G1 transition in the cell cycle and that M-to-G1 expressed genes have more copies of fixed retroduplications than genes expressed at other times. These findings suggest that cell division is coupled to retrotransposition and, perhaps, is even a requirement for it.
|
46
|
Bonchev G, Parisod C. Transposable elements and microevolutionary changes in natural populations. Mol Ecol Resour 2013; 13:765-75. [DOI: 10.1111/1755-0998.12133] [Citation(s) in RCA: 54] [Impact Index Per Article: 4.9] [Received: 04/10/2013] [Revised: 05/31/2013] [Accepted: 06/04/2013] [Indexed: 11/27/2022]
Affiliation(s)
- Georgi Bonchev
- Laboratory of evolutionary botany, Institute of biology, University of Neuchâtel, Rue Emile-Argand 11, CH-2000 Neuchâtel, Switzerland
- Institute of plant physiology and genetics, Bulgarian academy of sciences, G. Bonchev Street, Bldg 24, Sofia 1113, Bulgaria
- Christian Parisod
- Laboratory of evolutionary botany, Institute of biology, University of Neuchâtel, Rue Emile-Argand 11, CH-2000 Neuchâtel, Switzerland
|
47
|
The use of RelocaTE and unassembled short reads to produce high-resolution snapshots of transposable element generated diversity in rice. G3-GENES GENOMES GENETICS 2013; 3:949-57. [PMID: 23576519 PMCID: PMC3689806 DOI: 10.1534/g3.112.005348] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.2] [Indexed: 01/18/2023]
Abstract
Transposable elements (TEs) are dynamic components of genomes that often vary in copy number among members of the same species. With the advent of next-generation sequencing, TE insertion-site polymorphism can be examined at an unprecedented level of detail when combined with easy-to-use bioinformatics software. Here we report a new tool, RelocaTE, that rapidly identifies specific TE insertions that are either polymorphic or shared between a reference and unassembled next-generation sequencing reads. Furthermore, a novel companion tool, CharacTErizer, exploits the depth of coverage to classify genotypes of nonreference insertions as homozygous, heterozygous or, when analyzing an active TE family, as rare somatic insertion or excision events. It does this by comparing the numbers of RelocaTE aligned reads to reads that map to the same genomic position without the TE. Although RelocaTE and CharacTErizer can be used for any TE, they were developed to analyze the very active mPing element, which is undergoing massive amplification in specific strains of Oryza sativa (rice). Three individuals of one of these strains, A123, were resequenced and analyzed for mPing insertion site polymorphisms. The majority of mPing insertions found (~97%) are not present in the reference, and two siblings from a self-cross of this strain were found to share only ~90% of their insertions. Private insertions are primarily heterozygous but include both homozygous and predicted somatic insertions. The reliability of the predicted genotypes was validated by polymerase chain reaction.
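The read-count comparison described for CharacTErizer can be illustrated with a short Python sketch. It is not the tool's actual implementation: the function name, the category labels, and all thresholds (`het_low`, `hom_min`, `min_depth`) are hypothetical values chosen only to show the shape of the calculation.

```python
def classify_insertion(
    te_reads: int,          # reads supporting the TE insertion at this site
    empty_reads: int,       # reads spanning the same site without the TE
    het_low: float = 0.2,   # hypothetical lower bound for a heterozygous call
    hom_min: float = 0.9,   # hypothetical lower bound for a homozygous call
    min_depth: int = 5,     # hypothetical minimum combined depth
) -> str:
    """Classify a nonreference insertion from the fraction of TE-supporting reads."""
    total = te_reads + empty_reads
    if total < min_depth:
        return "insufficient_coverage"
    frac = te_reads / total
    if frac >= hom_min:
        return "homozygous"
    if frac >= het_low:
        return "heterozygous"
    if te_reads > 0:
        # A few TE-supporting reads among many empty-site reads.
        return "rare_somatic_candidate"
    return "absent"


examples = {
    "site_1": classify_insertion(30, 0),
    "site_2": classify_insertion(14, 16),
    "site_3": classify_insertion(1, 40),
}
```

A production genotyper would more likely use a likelihood model over read counts rather than fixed cut-offs; the fixed thresholds here only make the comparison explicit.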
|
48
|
Witherspoon DJ, Zhang Y, Xing J, Watkins WS, Ha H, Batzer MA, Jorde LB. Mobile element scanning (ME-Scan) identifies thousands of novel Alu insertions in diverse human populations. Genome Res 2013; 23:1170-81. [PMID: 23599355 PMCID: PMC3698510 DOI: 10.1101/gr.148973.112] [Citation(s) in RCA: 58] [Impact Index Per Article: 5.3] [Indexed: 01/14/2023]
Abstract
Alu retrotransposons are the most numerous and active mobile elements in humans, causing genetic disease and creating genomic diversity. Mobile element scanning (ME-Scan) enables comprehensive and affordable identification of mobile element insertions (MEI) using targeted high-throughput sequencing of multiplexed MEI junction libraries. In a single experiment, ME-Scan identifies nearly all AluYb8 and AluYb9 elements, with high sensitivity for both rare and common insertions, in 169 individuals of diverse ancestry. ME-Scan detects heterozygous insertions in single individuals with 91% sensitivity. Insertion presence or absence states determined by ME-Scan are 95% concordant with those determined by locus-specific PCR assays. By sampling diverse populations from Africa, South Asia, and Europe, we are able to identify 5799 Alu insertions, including 2524 novel ones, some of which occur in exons. Sub-Saharan populations and a Pygmy group in particular carry numerous intermediate-frequency Alu insertions that are absent in non-African groups. There is a significant dearth of exon-interrupting insertions among common Alu polymorphisms, but the density of singleton Alu insertions is constant across exonic and nonexonic regions. In one case, a validated novel singleton Alu interrupts a protein-coding exon of FAM187B. This implies that exonic Alu insertions are generally deleterious and thus eliminated by natural selection, but not so quickly that they cannot be observed as extremely rare variants.
Affiliation(s)
- David J Witherspoon
- Department of Human Genetics, Eccles Institute of Human Genetics, University of Utah, Salt Lake City, Utah 84112, USA.
|
49
|
Babatz TD, Burns KH. Functional impact of the human mobilome. Curr Opin Genet Dev 2013; 23:264-70. [PMID: 23523050 DOI: 10.1016/j.gde.2013.02.007] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 11/20/2012] [Revised: 02/07/2013] [Accepted: 02/14/2013] [Indexed: 02/02/2023]
Abstract
The human genome is replete with interspersed repetitive sequences derived from the propagation of mobile DNA elements. Three families of human retrotransposons remain active today: LINE1, Alu, and SVA elements. Since 1988, de novo insertions at previously recognized disease loci have been shown to generate highly penetrant alleles in Mendelian disorders. Only recently has the extent of germline-transmitted retrotransposon insertion polymorphism (RIP) in human populations been fully realized. Also exciting are recent studies of somatic retrotransposition in human tissues and reports of tumor-specific insertions, suggesting roles in tissue heterogeneity and tumorigenesis. Here we discuss mobile elements in human disease with an emphasis on exciting developments from the last several years.
Affiliation(s)
- Timothy D Babatz
- McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, Baltimore, MD, United States
|
50
|
Xing J, Witherspoon DJ, Jorde LB. Mobile element biology: new possibilities with high-throughput sequencing. Trends Genet 2013; 29:280-9. [PMID: 23312846 DOI: 10.1016/j.tig.2012.12.002] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 10/05/2012] [Revised: 11/20/2012] [Accepted: 12/11/2012] [Indexed: 12/29/2022]
Abstract
Mobile elements comprise more than half of the human genome, but until recently their large-scale detection was time consuming and challenging. With the development of new high-throughput sequencing (HTS) technologies, the complete spectrum of mobile element variation in humans can now be identified and analyzed. Thousands of new mobile element insertions (MEIs) have been discovered, yielding new insights into mobile element biology, evolution, and genomic variation. Here, we review several high-throughput methods, with an emphasis on techniques that specifically target MEIs in humans. We highlight recent applications of these methods in evolutionary studies and in the analysis of somatic alterations in human normal and tumor tissues.
Affiliation(s)
- Jinchuan Xing
- Department of Genetics, Human Genetics Institute of New Jersey, Rutgers, State University of New Jersey, Piscataway, NJ 08854, USA
|