1
McKerrow W, Tang Z, Steranka JP, Payer LM, Boeke JD, Keefe D, Fenyö D, Burns KH, Liu C. Human transposon insertion profiling by sequencing (TIPseq) to map LINE-1 insertions in single cells. Philos Trans R Soc Lond B Biol Sci 2020; 375:20190335. [PMID: 32075555 PMCID: PMC7061987 DOI: 10.1098/rstb.2019.0335] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]
Abstract
Long interspersed element-1 (LINE-1, L1) sequences, which comprise about 17% of the human genome, are the product of one of the most active types of mobile DNAs in modern humans. LINE-1 insertion alleles can cause inherited and de novo genetic diseases, and LINE-1-encoded proteins are highly expressed in some cancers. Genome-wide LINE-1 mapping in single cells could be useful for defining somatic and germline retrotransposition rates, and for enabling studies to characterize tumour heterogeneity, relate insertions to transcriptional and epigenetic effects at the cellular level, or describe cellular phylogenies in development. Our laboratories have reported a genome-wide LINE-1 insertion site mapping method for bulk DNA, named transposon insertion profiling by sequencing (TIPseq). There have been significant barriers to applying LINE-1 mapping to single cells, owing to chimeric artefacts and features of repetitive sequences. Here, we optimize a modified TIPseq protocol and show its utility for LINE-1 mapping in single lymphoblastoid cells. Results from single-cell TIPseq experiments compare well to known LINE-1 insertions found by whole-genome sequencing and TIPseq on bulk DNA. Among the several approaches we tested, whole-genome amplification by multiple displacement amplification followed by restriction enzyme digestion, vectorette ligation and LINE-1-targeted PCR had the best assay performance. This article is part of a discussion meeting issue 'Crossroads between transposons and gene regulation'.
Affiliation(s)
- Wilson McKerrow
  - Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, USA
- Zuojian Tang
  - Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, USA
- Jared P Steranka
  - Department of Pathology, Johns Hopkins University School of Medicine, 733 N Broadway, Baltimore, MD 21205, USA
- Lindsay M Payer
  - Department of Pathology, Johns Hopkins University School of Medicine, 733 N Broadway, Baltimore, MD 21205, USA
- Jef D Boeke
  - Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, USA
- David Keefe
  - Department of Obstetrics and Gynecology, New York University Langone School of Medicine, 462 First Avenue, New York, NY 10016, USA
  - Department of Cell Biology, New York University Langone School of Medicine, 462 First Avenue, New York, NY 10016, USA
- David Fenyö
  - Institute for Systems Genetics and Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, USA
- Kathleen H Burns
  - Department of Pathology, Johns Hopkins University School of Medicine, 733 N Broadway, Baltimore, MD 21205, USA
  - McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University School of Medicine, 733 N Broadway, Baltimore, MD 21205, USA
  - High Throughput (HiT) Biology Center, Johns Hopkins University School of Medicine, 733 N Broadway, Baltimore, MD 21205, USA
  - Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University School of Medicine, 401 N Broadway, Baltimore, MD 21231, USA
- Chunhong Liu
  - Department of Pathology, Johns Hopkins University School of Medicine, 733 N Broadway, Baltimore, MD 21205, USA
2
Walters-Conte KB, Johnson DLE, Johnson WE, O’Brien SJ, Pecon-Slattery J. The dynamic proliferation of CanSINEs mirrors the complex evolution of Feliforms. BMC Evol Biol 2014; 14:137. [PMID: 24947429 PMCID: PMC4084570 DOI: 10.1186/1471-2148-14-137] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 02/02/2014] [Accepted: 06/11/2014] [Indexed: 01/22/2023]
Abstract
BACKGROUND Repetitive short interspersed elements (SINEs) are retrotransposons ubiquitous in mammalian genomes and are highly informative markers to identify species and phylogenetic associations. Of these, SINEs unique to the order Carnivora (CanSINEs) yield novel insights on genome evolution in domestic dogs and cats, but less is known about their role in related carnivores. In particular, genome-wide assessment of CanSINE evolution has yet to be completed across the Feliformia (cat-like) suborder of Carnivora. Within Feliformia, the cat family Felidae is composed of 37 species and numerous subspecies organized into eight monophyletic lineages that likely arose 10 million years ago. Using the Felidae family as a reference phylogeny, along with representative taxa from other families of Feliformia, the origin, proliferation and evolution of CanSINEs within the suborder were assessed.
RESULTS We identified 93 novel intergenic CanSINE loci in Feliformia. Sequence analyses separated Feliform CanSINEs into two subfamilies, each characterized by distinct RNA polymerase binding motifs and phylogenetic associations. Subfamily I CanSINEs arose early within Feliformia but are no longer under active proliferation. Subfamily II loci are more recent, exclusive to Felidae and show evidence for adaptation to extant RNA polymerase activity. Further, presence/absence distributions of CanSINE loci are largely congruent with taxonomic expectations within Feliformia and the less resolved nodes in the Felidae reference phylogeny present equally ambiguous CanSINE data. SINEs are thought to be nearly impervious to excision from the genome. However, we observed a nearly complete excision of a CanSINE locus in puma (Puma concolor). In addition, we found that CanSINE proliferation in Felidae frequently targeted existing CanSINE loci for insertion sites, resulting in tandem arrays.
CONCLUSIONS We demonstrate the existence of at least two SINE families within the Feliformia suborder, one of which is actively involved in insertional mutagenesis. We find SINEs are powerful markers of speciation and conclude that the few inconsistencies with expected patterns of speciation likely represent incomplete lineage sorting, species hybridization and SINE-mediated genome rearrangement.
Affiliation(s)
- Kathryn B Walters-Conte
  - Department of Biology, American University, 101 Hurst Hall 4440 Massachusetts Ave, Washington, DC 20016, USA
- Diana LE Johnson
  - Department of Biological Sciences, The George Washington University, 2036 G St, Washington, DC 20009, USA
- Warren E Johnson
  - Smithsonian Conservation Biology Institute, National Zoological Park, Front Royal, VA 22630, USA
- Stephen J O’Brien
  - Dobzhansky Center for Genome Bioinformatics, St. Petersburg State University, 41 A, Sredniy Avenue, St. Petersburg 199034, Russia
- Jill Pecon-Slattery
  - Smithsonian Conservation Biology Institute, National Zoological Park, Front Royal, VA 22630, USA
3
Kamath PL, Elleder D, Bao L, Cross PC, Powell JH, Poss M. The population history of endogenous retroviruses in mule deer (Odocoileus hemionus). J Hered 2013; 105:173-87. [PMID: 24336966 DOI: 10.1093/jhered/est088] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022]
Abstract
Mobile elements are powerful agents of genomic evolution and can be exceptionally informative markers for investigating species and population-level evolutionary history. While several studies have utilized retrotransposon-based insertional polymorphisms to resolve phylogenies, few population studies exist outside of humans. Endogenous retroviruses are LTR-retrotransposons derived from retroviruses that have become stably integrated in the host genome during past infections and transmitted vertically to subsequent generations. They offer valuable insight into host-virus co-evolution and a unique perspective on host evolutionary history because they integrate into the genome at a discrete point in time. We examined the evolutionary history of a cervid endogenous gammaretrovirus (CrERVγ) in mule deer (Odocoileus hemionus). We sequenced 14 CrERV proviruses (CrERV-in1 to -in14), and examined the prevalence and distribution of 13 proviruses in 262 deer among 15 populations from Montana, Wyoming, and Utah. CrERV absence in white-tailed deer (O. virginianus), identical 5' and 3' long terminal repeat (LTR) sequences, insertional polymorphism, and CrERV divergence time estimates indicated that most endogenization events occurred within the last 200,000 years. Population structure inferred from CrERVs (F_ST = 0.008) and microsatellites (θ = 0.01) was low, but significant, with Utah, northwestern Montana, and a Helena herd being particularly differentiated. Clustering analyses indicated regional structuring, and non-contiguous clustering could often be explained by known translocations. Cluster ensemble results indicated spatial localization of viruses, specifically in deer from northeastern and western Montana. This study demonstrates the utility of endogenous retroviruses to elucidate and provide novel insight into both ERV evolutionary history and the history of contemporary host populations.
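The abstract reports a low fixation index for CrERV insertions without saying how it was estimated. As a rough illustration only, the sketch below computes a simple heterozygosity-based fixation index for one presence/absence locus from per-population insertion frequencies. The function name `fst_biallelic`, the equal weighting of populations, and the example frequencies are assumptions made here; the authors' actual estimator may differ.

```python
def fst_biallelic(insertion_freqs):
    """Heterozygosity-based fixation index for one presence/absence locus.

    insertion_freqs: insertion allele frequency in each population
    (hypothetical inputs; every population is weighted equally).
    """
    n = len(insertion_freqs)
    p_bar = sum(insertion_freqs) / n
    # Expected heterozygosity if all populations were pooled.
    h_total = 2 * p_bar * (1 - p_bar)
    # Mean expected heterozygosity within populations.
    h_within = sum(2 * p * (1 - p) for p in insertion_freqs) / n
    if h_total == 0:
        # Locus is fixed or absent everywhere: no differentiation to measure.
        return 0.0
    return (h_total - h_within) / h_total


# Illustrative frequencies only, not data from the study.
print(fst_biallelic([0.40, 0.60]))
```

Identical frequencies in every population give a value of zero, and values near zero, like the one reported, indicate that most variation is shared among populations.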
Affiliation(s)
- Pauline L Kamath
  - US Geological Survey, Northern Rocky Mountain Science Center, Bozeman, MT 59715
4
Li J, Akagi K, Hu Y, Trivett AL, Hlynialuk CJ, Swing DA, Volfovsky N, Morgan TC, Golubeva Y, Stephens RM, Smith DE, Symer DE. Mouse endogenous retroviruses can trigger premature transcriptional termination at a distance. Genome Res 2012; 22:870-84. [PMID: 22367191 PMCID: PMC3337433 DOI: 10.1101/gr.130740.111] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Received: 08/16/2011] [Accepted: 02/09/2012] [Indexed: 01/15/2023]
Abstract
Endogenous retrotransposons have caused extensive genomic variation within mammalian species, but the functional implications of such mobilization are mostly unknown. We mapped thousands of endogenous retrovirus (ERV) germline integrants in highly divergent, previously unsequenced mouse lineages, facilitating a comparison of gene expression in the presence or absence of local insertions. Polymorphic ERVs occur relatively infrequently in gene introns and are particularly depleted from genes involved in embryogenesis or that are highly expressed in embryonic stem cells. Their genomic distribution implies ongoing negative selection due to deleterious effects on gene expression and function. A polymorphic, intronic ERV at Slc15a2 triggers up to 49-fold increases in premature transcriptional termination and up to 39-fold reductions in full-length transcripts in adult mouse tissues, thereby disrupting protein expression and functional activity. Prematurely truncated transcripts also occur at Polr1a, Spon1, and up to ∼5% of other genes when intronic ERV polymorphisms are present. Analysis of expression quantitative trait loci (eQTLs) in recombinant BxD mouse strains demonstrated very strong genetic associations between the polymorphic ERV in cis and disrupted transcript levels. Premature polyadenylation is triggered at genomic distances up to >12.5 kb upstream of the ERV, both in cis and between alleles. The parent of origin of the ERV is associated with variable expression of nonterminated transcripts and differential DNA methylation at its 5'-long terminal repeat. This study defines an unexpectedly strong functional impact of ERVs in disrupting gene transcription at a distance and demonstrates that ongoing retrotransposition can contribute significantly to natural phenotypic diversity.
Affiliation(s)
- Jingfeng Li
  - Human Cancer Genetics Program and Department of Molecular Virology, Immunology and Medical Genetics, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, USA
- Keiko Akagi
  - Human Cancer Genetics Program and Department of Molecular Virology, Immunology and Medical Genetics, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, USA
- Yongjun Hu
  - Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, Michigan 48109, USA
- Christopher J.W. Hlynialuk
  - Human Cancer Genetics Program and Department of Molecular Virology, Immunology and Medical Genetics, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, USA
- Deborah A. Swing
  - Mouse Cancer Genetics Program, National Cancer Institute, Frederick, Maryland 21702, USA
- Natalia Volfovsky
  - Advanced Biomedical Computing Center, Information Systems Program and
- Tamara C. Morgan
  - Histotechnology Laboratory, SAIC-Frederick, Inc., National Cancer Institute, Frederick, Maryland 21702, USA
- Yelena Golubeva
  - Histotechnology Laboratory, SAIC-Frederick, Inc., National Cancer Institute, Frederick, Maryland 21702, USA
- David E. Smith
  - Department of Pharmaceutical Sciences, University of Michigan, Ann Arbor, Michigan 48109, USA
- David E. Symer
  - Human Cancer Genetics Program and Department of Molecular Virology, Immunology and Medical Genetics, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, USA
  - Department of Internal Medicine and Department of Biomedical Informatics, The Ohio State University Comprehensive Cancer Center, Columbus, Ohio 43210, USA
5
Sequence periodic pattern of HERV LTRs: A matrix simulation algorithm. J Biosci 2012; 37:19-24. [DOI: 10.1007/s12038-012-9182-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/14/2022]
6
Ray DA, Batzer MA. Reading TE leaves: new approaches to the identification of transposable element insertions. Genome Res 2011; 21:813-20. [PMID: 21632748 PMCID: PMC3106314 DOI: 10.1101/gr.110528.110] [Citation(s) in RCA: 49] [Impact Index Per Article: 3.8] [Indexed: 12/28/2022]
Abstract
Transposable elements (TEs) are a tremendous source of genome instability and genetic variation. Of particular interest to investigators of human biology and human evolution are retrotransposon insertions that are recent and/or polymorphic in the human population. As a consequence, the ability to assay large numbers of polymorphic TEs in a given genome is valuable. Five recent manuscripts each propose methods to scan whole human genomes to identify, map, and, in some cases, genotype polymorphic retrotransposon insertions in multiple human genomes simultaneously. These technologies promise to revolutionize our ability to analyze human genomes for TE-based variation important to studies of human variability and human disease. Furthermore, the approaches hold promise for researchers interested in nonhuman genomic variability. Herein, we explore the methods reported in the manuscripts and discuss their applications to aspects of human biology and the biology of other organisms.
Affiliation(s)
- David A. Ray
  - Department of Biochemistry and Molecular Biology, Mississippi State University, Mississippi State, Mississippi 39762, USA
- Mark A. Batzer
  - Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana 70803, USA
7
Roos C, Zinner D, Kubatko LS, Schwarz C, Yang M, Meyer D, Nash SD, Xing J, Batzer MA, Brameier M, Leendertz FH, Ziegler T, Perwitasari-Farajallah D, Nadler T, Walter L, Osterholz M. Nuclear versus mitochondrial DNA: evidence for hybridization in colobine monkeys. BMC Evol Biol 2011; 11:77. [PMID: 21435245 PMCID: PMC3068967 DOI: 10.1186/1471-2148-11-77] [Citation(s) in RCA: 101] [Impact Index Per Article: 7.8] [Received: 10/18/2010] [Accepted: 03/24/2011] [Indexed: 12/22/2022]
Abstract
BACKGROUND Colobine monkeys constitute a diverse group of primates with major radiations in Africa and Asia. However, phylogenetic relationships among genera are under debate, and recent molecular studies with incomplete taxon-sampling revealed discordant gene trees. To solve the evolutionary history of colobine genera and to determine causes for possible gene tree incongruences, we combined presence/absence analysis of mobile elements with autosomal, X chromosomal, Y chromosomal and mitochondrial sequence data from all recognized colobine genera.
RESULTS Gene tree topologies and divergence age estimates derived from different markers were similar, but differed in placing Piliocolobus/Procolobus and langur genera among colobines. Although insufficient data, homoplasy and incomplete lineage sorting might all have contributed to the discordance among gene trees, hybridization is favored as the main cause of the observed discordance. We propose that African colobines are paraphyletic, but might later have experienced female introgression from Piliocolobus/Procolobus into Colobus. In the late Miocene, colobines invaded Eurasia and diversified into several lineages. Among Asian colobines, Semnopithecus diverged first, indicating langur paraphyly. However, unidirectional gene flow from Semnopithecus into Trachypithecus via male introgression followed by nuclear swamping might have occurred until the earliest Pleistocene.
CONCLUSIONS Overall, our study provides the most comprehensive view on colobine evolution to date and emphasizes that analyses of various molecular markers, such as mobile elements and sequence data from multiple loci, are crucial to better understand evolutionary relationships and to trace hybridization events. Our results also suggest that sex-specific dispersal patterns, promoted by a respective social organization of the species involved, can result in different hybridization scenarios.
Affiliation(s)
- Christian Roos
  - Primate Genetics Laboratory, German Primate Center, Göttingen, Germany
8
Affiliation(s)
- Miriam K Konkel
  - Department of Biological Sciences, Louisiana State University, 202 Life Sciences Bldg., Baton Rouge, LA 70803, USA
- Jerilyn A Walker
  - Department of Biological Sciences, Louisiana State University, 202 Life Sciences Bldg., Baton Rouge, LA 70803, USA
- Mark A Batzer
  - Department of Biological Sciences, Louisiana State University, 202 Life Sciences Bldg., Baton Rouge, LA 70803, USA
9
Raaum RL, Wang AB, Al-Meeri AM, Mulligan CJ. Efficient population assignment and outlier detection in human populations using biallelic markers chosen by principal component-based rankings. Biotechniques 2010; 48:449-54. [PMID: 20569219 DOI: 10.2144/000113426] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/23/2022]
Abstract
Whole-genome studies of genetic variation are now performed routinely and have accelerated the identification of disease-associated allelic variants, positive selection, recombination, and structural variation. However, these studies are sensitive to the presence of outlier data from individuals of different ancestry than the rest of the sample. Currently, the most common method of excluding outlier individuals is to collect a population sample and exclude outliers after genome-wide data have been collected. Here we show that a small collection of 20-27 polymorphic Alu insertions, selected using a principal component-based method with genetic ancestry estimates, may be used to easily assign Africans, East Asians, and Europeans to their population of origin. In addition, we show that samples from a geographically and genetically intermediate population (in our study, samples from India) can be identified within the original sample of Africans, East Asians, and Europeans. Finally, we show that outlier individuals from neighboring geographic regions (in our study, Yemen and sub-Saharan Africa) can be identified. These results will be of value in preselection of samples for more in-depth analysis as well as customized identification of maximally informative polymorphic markers for regional studies.
Affiliation(s)
- Ryan L Raaum
  - Department of Anthropology, Lehman College, The City University of New York, The Bronx, NY, USA
10
Thompson ML, Gauna AE, Williams ML, Ray DA. Multiple chicken repeat 1 lineages in the genomes of oestroid flies. Gene 2009; 448:40-5. [PMID: 19716865 DOI: 10.1016/j.gene.2009.08.010] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 06/04/2009] [Revised: 08/03/2009] [Accepted: 08/14/2009] [Indexed: 11/24/2022]
Abstract
Retrotransposons including CR1 (chicken repeat 1) elements are important factors in genome evolution. They also mobilize in a genome in a way that makes them useful for phylogenetic analysis and species identification. This study was designed to identify lineages of CR1 elements in the genomes of forensically important oestroid flies and to further characterize one family, Sbul.CR1B. CR1 fragments from several taxa were amplified, cloned, sequenced and analyzed to identify different lineages of elements. A variety of retrotransposon families were recovered that exhibit similarity to known retrotransposon families. A number of these lineages may have given rise to taxon-specific subfamilies that have been recently active in oestroid fly genomes. One element from Sarcophaga bullata was analyzed in detail to reconstruct a partial Open Reading Frame containing both the reverse transcriptase (RT) and endonuclease (EN) domains. These domains were used to identify conserved amino acid regions in the recovered consensus via comparison to known non-LTR retrotransposons. Phylogenetic analysis of the RT domain revealed the recovered ORF in S. bullata compares favorably with previously documented CR1-like elements. This work will serve as the basis for additional analyses targeted at developing a simple, efficient marker system for the identification of forensically important carrion flies.
11
Marchani EE, Xing J, Witherspoon DJ, Jorde LB, Rogers AR. Estimating the age of retrotransposon subfamilies using maximum likelihood. Genomics 2009; 94:78-82. [PMID: 19379804 DOI: 10.1016/j.ygeno.2009.04.002] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 02/28/2009] [Revised: 04/10/2009] [Accepted: 04/11/2009] [Indexed: 11/29/2022]
Abstract
We present a maximum likelihood model to estimate the age of retrotransposon subfamilies. This method is designed around a master gene model which assumes a constant retrotransposition rate. The statistical properties of this model and an ad hoc estimation procedure are compared using two simulated data sets. We also test whether each estimation procedure is robust to violation of the master gene model. According to our results, both estimation procedures are accurate under the master gene model. While both methods tend to overestimate ages under the intermediate model, the maximum likelihood estimate is significantly less inflated than the ad hoc estimate. We estimate the ages of two subfamilies of human-specific LINE-1 insertions using both estimation procedures. By calculating confidence intervals around the maximum likelihood estimate, our model can both provide an estimate of retrotransposon subfamily age and describe the range of subfamily ages consistent with the data.
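The abstract does not give the likelihood itself, so the following is only a minimal sketch of what a maximum likelihood age estimate under a master gene model could look like. It assumes, as simplifications made here, that copy ages are uniform between zero and the subfamily age (constant retrotransposition rate) and that mutations accumulate in each copy as a Poisson process. The names `log_likelihood`, `ml_subfamily_age`, `mutation_rate` and `copy_length`, and the grid search, are hypothetical and are not taken from the paper.

```python
import math


def log_likelihood(mutation_counts, x):
    """Log-likelihood of per-copy mutation counts.

    x is the expected number of mutations in a copy as old as the subfamily
    (mutation rate * copy length * subfamily age); it must be positive.
    """
    total = 0.0
    for k in mutation_counts:
        # Upper Poisson tail P(N > k) for N ~ Poisson(x), summed term by term,
        # starting from the j = k + 1 term computed in log space.
        term = math.exp(-x + (k + 1) * math.log(x) - math.lgamma(k + 2))
        tail, j = 0.0, k + 1
        while term > 0.0:
            tail += term
            j += 1
            term *= x / j
            if term < tail * 1e-17:
                break
        if tail <= 0.0:
            return float("-inf")
        # Averaging the Poisson pmf over uniform ages gives P(k) = P(N > k) / x.
        total += math.log(tail / x)
    return total


def ml_subfamily_age(mutation_counts, mutation_rate, copy_length, candidate_ages):
    """Grid-search the candidate age that maximizes the log-likelihood."""
    per_unit_time = mutation_rate * copy_length
    return max(candidate_ages,
               key=lambda age: log_likelihood(mutation_counts, per_unit_time * age))
```

A grid search keeps the sketch dependency-free; a numerical optimizer and likelihood-based confidence intervals, as the abstract mentions, would be natural refinements.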
Affiliation(s)
- Elizabeth E Marchani
  - Division of Medical Genetics, University of Washington, BOX 357720, Seattle, WA 98195, USA
12
Xing J, Witherspoon DJ, Ray DA, Batzer MA, Jorde LB. Mobile DNA elements in primate and human evolution. Am J Phys Anthropol 2008; Suppl 45:2-19. [PMID: 18046749 DOI: 10.1002/ajpa.20722] [Citation(s) in RCA: 106] [Impact Index Per Article: 6.6] [Indexed: 01/24/2023]
Abstract
Roughly 50% of the primate genome consists of mobile, repetitive DNA sequences such as Alu and LINE1 elements. The causes and evolutionary consequences of mobile element insertion, which have received considerable attention during the past decade, are reviewed in this article. Because of their unique mutational mechanisms, these elements are highly useful for answering phylogenetic questions. We demonstrate how they have been used to help resolve a number of questions in primate phylogeny, including the human-chimpanzee-gorilla trichotomy and New World primate phylogeny. Alu and LINE1 element insertion polymorphisms have also been analyzed in human populations to test hypotheses about human evolution and population affinities and to address forensic issues. Finally, these elements have had impacts on the genome itself. We review how they have influenced fundamental ongoing processes like nonhomologous recombination, genomic deletion, and X chromosome inactivation.
Affiliation(s)
- Jinchuan Xing
  - Department of Human Genetics, University of Utah Health Sciences Center, Salt Lake City, UT 84112, USA
13
Wang J, Song L, Grover D, Azrak S, Batzer MA, Liang P. dbRIP: a highly integrated database of retrotransposon insertion polymorphisms in humans. Hum Mutat 2006; 27:323-9. [PMID: 16511833 PMCID: PMC1855216 DOI: 10.1002/humu.20307] [Citation(s) in RCA: 148] [Impact Index Per Article: 8.2] [Indexed: 11/12/2022]
Abstract
Retrotransposons constitute over 40% of the human genome and play important roles in the evolution of the genome. Since certain types of retrotransposons, particularly members of the Alu, L1, and SVA families, are still active, their recent and ongoing propagation generates a unique and important class of human genomic diversity/polymorphism (for the presence and absence of an insertion) with some elements known to cause genetic diseases. So far, over 2,300, 500, and 80 Alu, L1, and SVA insertions, respectively, have been reported to be polymorphic and many more are yet to be discovered. We present here the Database of Retrotransposon Insertion Polymorphisms (dbRIP; http://falcon.roswellpark.org:9090), a highly integrated and interactive database of human retrotransposon insertion polymorphisms (RIPs). dbRIP currently contains a nonredundant list of 1,625, 407, and 63 polymorphic Alu, L1, and SVA elements, respectively, or a total of 2,095 RIPs. In dbRIP, we deploy the utilities and annotated data of the genome browser developed at the University of California at Santa Cruz (UCSC) for user-friendly queries and integrative browsing of RIPs along with all other genome annotation information. Users can query the database by a variety of means and have access to the detailed information related to a RIP, including detailed insertion sequences and genotype data. dbRIP represents the first database providing comprehensive, integrative, and interactive compilation of RIP data, and it will be a useful resource for researchers working in the area of human genetics.
Affiliation(s)
- Jianxin Wang
  - Department of Cancer Genetics, Roswell Park Cancer Institute, Buffalo, New York
- Lei Song
  - Department of Cancer Genetics, Roswell Park Cancer Institute, Buffalo, New York
- Deepak Grover
  - Department of Biological Sciences, Biological Computation and Visualization Center, Center for BioModular Multi-scale Systems, Louisiana State University, Baton Rouge, Louisiana
- Sami Azrak
  - Department of Cancer Genetics, Roswell Park Cancer Institute, Buffalo, New York
- Mark A. Batzer
  - Department of Biological Sciences, Biological Computation and Visualization Center, Center for BioModular Multi-scale Systems, Louisiana State University, Baton Rouge, Louisiana
- Ping Liang
  - Department of Cancer Genetics, Roswell Park Cancer Institute, Buffalo, New York
  - Correspondence to: Dr. Ping Liang, Department of Cancer Genetics, Roswell Park Cancer Institute, Elm & Carlton Streets, Buffalo, NY 14263
14
Salem AH, Batzer MA. Distribution of the HIV resistance CCR5-Delta32 allele among Egyptians and Syrians. Mutat Res 2006; 616:175-80. [PMID: 17166523 DOI: 10.1016/j.mrfmmm.2006.11.024] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 11/25/2022]
Abstract
A mutant allele of the beta-chemokine receptor gene CCR5 bearing a 32-basepair (bp) deletion that prevents cell invasion by the primary transmitting strain of HIV-1 has recently been characterized. Individuals homozygous for the mutation are resistant to infection, even after repeated high-risk exposure, but this resistance appears not absolute, as isolated cases of HIV-positive deletion homozygotes are emerging. The consequence of the heterozygous state is not clear, but it may delay the progression to AIDS in infected individuals. In order to evaluate the frequency distribution of CCR5-Delta32 polymorphism among Egyptians, a total of 200 individuals (154 from Ismailia and 46 from Sinai) were tested. Only two heterozygous individuals from Ismailia carried the CCR5-Delta32 allele (0.6%), and no homozygous (Delta32/Delta32) individuals were detected among the tested samples. The presence of the CCR5-Delta32 allele among Egyptians may be attributed to the admixture with people of European descent. Thus we conclude that the protective deletion CCR5-Delta32 is largely absent in the Egyptian population.
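The reported allele frequency follows from simple allele counting: each individual carries two alleles, a heterozygote contributes one copy of the deletion and a homozygote two. The sketch below works this through with the counts given in the abstract. The function name `allele_frequency` is hypothetical, and the abstract does not state which denominator underlies the reported 0.6%, so both the Ismailia-only and the pooled calculation are shown.

```python
def allele_frequency(heterozygotes, homozygotes, individuals):
    """Frequency of an allele from genotype counts in diploid individuals."""
    allele_copies = heterozygotes + 2 * homozygotes
    return allele_copies / (2 * individuals)


# Two heterozygotes, no homozygotes, as stated in the abstract.
ismailia_only = allele_frequency(2, 0, 154)  # Ismailia samples only
pooled = allele_frequency(2, 0, 200)         # Ismailia and Sinai combined

print(f"Ismailia only: {ismailia_only:.2%}")
print(f"Pooled:        {pooled:.2%}")
```

Counting against the 154 Ismailia samples gives a value close to the reported figure, which suggests, but does not confirm, that this was the denominator used.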
Affiliation(s)
- Abdel-Halim Salem
  - Department of Biological Sciences, Biological Computation and Visualization Center, Center for Bio-Modular Multiscale Systems, Louisiana State University, 202 Life Sciences Building, Baton Rouge, LA 70803, USA
15
Abstract
Mobile elements are commonly referred to as selfish repetitive DNA sequences. However, mobile elements represent a unique and underutilized group of molecular markers. Several of their characteristics make them ideally suited for use as tools in forensic genomic applications: they are essentially homoplasy-free characters, they are identical by descent, the ancestral state of any insertion is known to be the absence of the element, and many mobile element insertions are lineage specific. In this review, we provide an overview of mobile element biology and describe the application of certain mobile elements, especially the SINEs and other retrotransposons, to forensic genomics. These tools include quantitative species-specific DNA detection, analysis of complex biomaterials, and the inference of geographic origin of human DNA samples.
Affiliation(s)
- David A Ray
- Department of Biological Sciences, Biological Computation and Visualization Center, Louisiana State University, 202 Life Sciences Building, Baton Rouge, LA 70803, USA
|
16
|
Abstract
Mobile elements represent a unique and under-utilized set of tools for molecular ecologists. They are essentially homoplasy-free characters with the ability to be genotyped in a simple and efficient manner. Interpretation of the data generated using mobile elements can be simple compared to other genetic markers. They exist in a wide variety of taxa and are useful over a wide selection of temporal ranges within those taxa. Furthermore, their mode of evolution instills them with another advantage over other types of multilocus genotype data: the ability to determine loci applicable to a range of time spans in the history of a taxon. In this review, I discuss the application of mobile element markers, especially short interspersed elements (SINEs), to phylogenetic and population data, with an emphasis on potential applications to molecular ecology.
Affiliation(s)
- David A Ray
- Department of Biology, West Virginia University, 53 Campus Dr, Morgantown, WV 26506, USA.
|
17
|
Konkel MK, Wang J, Liang P, Batzer MA. Identification and characterization of novel polymorphic LINE-1 insertions through comparison of two human genome sequence assemblies. Gene 2006; 390:28-38. [PMID: 17034961 DOI: 10.1016/j.gene.2006.07.040] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.7] [Received: 06/09/2006] [Revised: 07/18/2006] [Accepted: 07/26/2006] [Indexed: 11/29/2022]
Abstract
Mobile elements represent a relatively new class of markers for the study of human evolution. Long interspersed elements (LINEs) belong to a group of retrotransposons comprising approximately 21% of the human genome. Young LINE-1 (L1) elements that have integrated recently into the human genome can be polymorphic for insertion presence/absence in different human populations at particular chromosomal locations. To identify putative novel L1 insertion polymorphisms, we computationally compared two draft assemblies of the whole human genome (Public and Celera Human Genome assemblies). We identified a total of 148 potential polymorphic L1 insertion loci, among which 73 were candidates for novel polymorphic loci. Based on additional analyses we selected 34 loci for further experimental studies. PCR-based assays and DNA sequence analysis were performed for these 34 loci in 80 unrelated individuals from four diverse human populations: African-American, Asian, Caucasian, and South American. All but two of the selected loci were confirmed as polymorphic in our human population panel. Approximately 47% of the analyzed loci integrated into other repetitive elements, most commonly older L1s. One of the insertions was accompanied by a BC200 sequence. Collectively, these mobile elements represent a valuable source of genomic polymorphism for the study of human population genetics. Our results also suggest that the exhaustive identification of L1 insertion polymorphisms is far from complete, and new whole genome sequences are valuable sources for finding novel retrotransposon insertion polymorphisms.
Affiliation(s)
- Miriam K Konkel
- Department of Biological Sciences, Biological Computation and Visualization Center, Center for BioModular Multi-Scale Systems, Louisiana State University, 202 Life Sciences Building, Baton Rouge, LA 70803, USA
|
18
|
Herke SW, Xing J, Ray DA, Zimmerman JW, Cordaux R, Batzer MA. A SINE-based dichotomous key for primate identification. Gene 2006; 390:39-51. [PMID: 17056208 DOI: 10.1016/j.gene.2006.08.015] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 06/28/2006] [Revised: 08/01/2006] [Accepted: 08/02/2006] [Indexed: 11/22/2022]
Abstract
For DNA samples or 'divorced' tissues, identifying the organism from which they were taken generally requires some type of analytical method. The ideal approach would be robust even in the hands of a novice, requiring minimal equipment, time, and effort. Genotyping SINEs (Short INterspersed Elements) is such an approach as it requires only PCR-related equipment, and the analysis consists solely of interpreting fragment sizes in agarose gels. Modern primate genomes are known to contain lineage-specific insertions of Alu elements (a primate-specific SINE); thus, to demonstrate the utility of this approach, we used members of the Alu family to identify DNA samples from evolutionarily divergent primate species. For each node of a combined phylogenetic tree (56 species; n=8 [Hominids]; 11 [New World monkeys]; 21 [Old World monkeys]; 2 [Tarsiformes]; and, 14 [Strepsirrhines]), we tested loci (>400 in total) from prior phylogenetic studies as well as newly identified elements for their ability to amplify in all 56 species. Ultimately, 195 loci were selected for inclusion in this Alu-based key for primate identification. This dichotomous SINE-based key is best used through hierarchical amplification, with the starting point determined by the level of initial uncertainty regarding sample origin. With newly emerging genome databases, finding informative retrotransposon insertions is becoming much more rapid; thus, the general principle of using SINEs to identify organisms is broadly applicable.
Affiliation(s)
- Scott W Herke
- Department of Biological Sciences, Biological Computation and Visualization Center, Center for Bio-Modular Microsystems, Louisiana State University, 202 Life Sciences Building, Baton Rouge, LA 70803, United States
|
19
|
Kass DH, Jamison N, Mayberry MM, Tecle E. Identification of a unique Alu-based polymorphism and its use in human population studies. Gene 2006; 390:146-52. [PMID: 17010537 DOI: 10.1016/j.gene.2006.07.035] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 06/19/2006] [Revised: 07/03/2006] [Accepted: 07/04/2006] [Indexed: 10/24/2022]
Abstract
Alu elements represent a family of short interspersed DNA elements (SINEs) found in primate genomes. These are members of a group of transposable elements that integrate into the genome by the process of retrotransposition. Recent integrations of Alu elements within the human genome have generated presence/absence variants useful as DNA markers in human population studies as well as in forensic and paternity analyses. Besides its ease of use, this type of marker is unique because the absence of the Alu represents the ancestral form. We have identified an Alu-based polymorphism that consists of four alleles for which we can predict the evolutionary order. Additionally, we have developed a simple PCR plus restriction endonuclease assay to readily distinguish the four alleles. We have thus far analyzed DNA from a small set of samples comprising ten different ethnic groups. The three populations of African descent exhibited a relatively low frequency of the absence allele in contrast to the other populations, and were the only populations in which all four alleles were identified. One presence allele was found in neither the European Caucasian nor the South American populations that were sampled, whereas a different presence allele was not observed among the sampled Asian populations. Additionally, the four-allele system identified variations among populations not observed by simply scoring as presence/absence variants. Therefore, extending beyond the two-allele dimorphic Alu system further elucidates population variations. These features make this marker a unique tool for both global and regional analyses of human populations.
Affiliation(s)
- David H Kass
- Department of Biology, Eastern Michigan University, Ypsilanti, MI 48197, United States.
|
20
|
Cordaux R, Hedges DJ, Herke SW, Batzer MA. Estimating the retrotransposition rate of human Alu elements. Gene 2006; 373:134-7. [PMID: 16522357 DOI: 10.1016/j.gene.2006.01.019] [Citation(s) in RCA: 89] [Impact Index Per Article: 4.9] [Received: 11/29/2005] [Revised: 01/18/2006] [Accepted: 01/21/2006] [Indexed: 10/24/2022]
Abstract
Mobile elements such as Alu repeats have substantially altered the architecture of the human genome, and de novo mobile element insertions sometimes cause genetic disorders. Previous estimates for the retrotransposition rate (RR) of Alu elements in humans of one new insertion every approximately 100-125 births were developed prior to the sequencing of the human and chimpanzee genomes. Here, we used two independent methods (based on the new genomic data and on disease-causing de novo Alu insertions) to generate refined Alu RR estimates in humans. Both methods consistently yielded RR on the order of one new Alu insertion every approximately 20 births, despite the fact that the evolutionary-based method represents an average RR over the past approximately 6 million years while the mutation-based method better reflects the current-day RR. These results suggest that Alu elements retrotranspose at a faster rate in humans than previously thought, and support the potential of Alu elements as mutagenic factors in the human genome.
Affiliation(s)
- Richard Cordaux
- Department of Biological Sciences, Biological Computation and Visualization Center, Louisiana State University, 202 Life Sciences Building, Baton Rouge, LA 70803, USA
|
21
|
Wang J, Song L, Gonder MK, Azrak S, Ray DA, Batzer MA, Tishkoff SA, Liang P. Whole genome computational comparative genomics: A fruitful approach for ascertaining Alu insertion polymorphisms. Gene 2006; 365:11-20. [PMID: 16376498 PMCID: PMC1847407 DOI: 10.1016/j.gene.2005.09.031] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.2] [Received: 04/28/2005] [Revised: 06/20/2005] [Accepted: 09/07/2005] [Indexed: 10/25/2022]
Abstract
Alu elements are the most active and predominant type of short interspersed elements (SINEs) in the human genome. Recently inserted polymorphic (for presence/absence) Alu elements contribute to genome diversity among different human populations, and they are useful genetic markers for population genetic studies. The objective of this study is to identify polymorphic Alu insertions through an in silico comparative genomics approach and to analyze their distribution pattern throughout the human genome. By computationally comparing the public and Celera sequence assemblies of the human genome, we identified a total of 800 polymorphic Alu elements. We used polymerase chain reaction-based assays to screen a randomly selected set of 16 of these 800 Alu insertion polymorphisms using a human diversity panel to demonstrate the efficiency of our approach. Based on sequence analysis of the 800 Alu polymorphisms, we report three new Alu subfamilies, Ya3, Ya4b, and Yb11, with Yb11 being the smallest known Alu subfamily. Analysis of retrotransposition activity revealed Yb11, Ya8, Ya5, Yb9, and Yb8 as the most active Alu subfamilies and the maintenance of a very low level of retrotransposition activity or recent gene conversion events involving S subfamilies. The 800 polymorphic Alu insertions are characterized by the presence of target site duplications (TSDs) and longer than average polyA-tail length. Their pre-integration sites largely follow an extended "NT-AARA" motif. Among chromosomes, the density of Alu insertion polymorphisms is positively correlated with the Alu-site availability and is inversely correlated with the densities of older Alu elements and genes.
Affiliation(s)
- Jianxin Wang
- Department of Cancer Genetics, Roswell Park Cancer Institute, Elm and Carlton Streets, Buffalo, NY 14263, USA
- Lei Song
- Department of Cancer Genetics, Roswell Park Cancer Institute, Elm and Carlton Streets, Buffalo, NY 14263, USA
- Sami Azrak
- Department of Cancer Genetics, Roswell Park Cancer Institute, Elm and Carlton Streets, Buffalo, NY 14263, USA
- David A. Ray
- Department of Biological Sciences, Biological Computational and Visualization Center, Center for BioModular Multi-scale Systems, Louisiana State University, Baton Rouge, LA 70803, USA
- Mark A. Batzer
- Department of Biological Sciences, Biological Computational and Visualization Center, Center for BioModular Multi-scale Systems, Louisiana State University, Baton Rouge, LA 70803, USA
- Sarah A. Tishkoff
- Department of Biology, University of Maryland, College Park, MD 20742, USA
- Ping Liang
- Department of Cancer Genetics, Roswell Park Cancer Institute, Elm and Carlton Streets, Buffalo, NY 14263, USA
- * Corresponding author. Tel.: +1 716 845 1556; fax: +1 716 845 1692. E-mail address: (P. Liang)
|
22
|
Wang W, Kirkness EF. Short interspersed elements (SINEs) are a major source of canine genomic diversity. Genome Res 2005; 15:1798-808. [PMID: 16339378 PMCID: PMC1356118 DOI: 10.1101/gr.3765505] [Citation(s) in RCA: 94] [Impact Index Per Article: 4.9] [Received: 01/31/2005] [Accepted: 08/03/2005] [Indexed: 01/12/2023]
Abstract
SINEs are retrotransposons that have enjoyed remarkable reproductive success during the course of mammalian evolution, and have played a major role in shaping mammalian genomes. Previously, an analysis of survey-sequence data from an individual dog (a poodle) indicated that canine genomes harbor a high frequency of alleles that differ only by the absence or presence of a SINEC_Cf repeat. Comparison of this survey-sequence data with a draft genome sequence of a distinct dog (a boxer) has confirmed this prediction, and revealed the chromosomal coordinates for >10,000 loci that are bimorphic for SINEC_Cf insertions. Analysis of SINE insertion sites from the genomes of nine additional dogs indicates that 3%-5% are absent from either the poodle or boxer genome sequences--suggesting that an additional 10,000 bimorphic loci could be readily identified in the general dog population. We describe a methodology that can be used to identify these loci, and could be adapted to exploit these bimorphic loci for genotyping purposes. Approximately half of all annotated canine genes contain SINEC_Cf repeats, and these elements are occasionally transcribed. When transcribed in the antisense orientation, they provide splice acceptor sites that can result in incorporation of novel exons. The high frequency of bimorphic SINE insertions in the dog population is predicted to provide numerous examples of allele-specific transcription patterns that will be valuable for the study of differential gene expression among multiple dog breeds.
Affiliation(s)
- Wei Wang
- The Institute for Genomic Research, Rockville, Maryland 20850, USA
|
23
|
Abstract
Background: Alu elements are Short INterspersed Elements (SINEs) in primate genomes that have proven useful as markers for studying genome evolution, population biology and phylogenetics. Most of these applications, however, have been limited to humans and their nearest relatives, chimpanzees. In an effort to expand our understanding of Alu sequence evolution and to increase the applicability of these markers to non-human primate biology, we have analyzed available Alu sequences for loci specific to platyrrhine (New World) primates. Results: Branching patterns along an Alu sequence phylogeny indicate three major classes of platyrrhine-specific Alu sequences. Sequence comparisons further reveal at least three New World monkey-specific subfamilies: AluTa7, AluTa10, and AluTa15. Two of these subfamilies appear to be derived from a gene conversion event that has produced a recently active fusion of AluSc- and AluSp-type elements. This is a novel mode of origin for new Alu subfamilies. Conclusion: The use of Alu elements as genetic markers in studies of genome evolution, phylogenetics, and population biology has been very productive when applied to humans. The characterization of these three new Alu subfamilies not only increases our understanding of Alu sequence evolution in primates, but also opens the door to the application of these genetic markers outside the hominid lineage.
Affiliation(s)
- David A Ray
- Department of Biological Sciences, Biological Computation and Visualization Center, Center for Bio-Modular Multiscale Systems, Louisiana State University, Baton Rouge, LA, 70803, USA
- Department of Biology, West Virginia University, Morgantown, WV, 26506, USA
- Mark A Batzer
- Department of Biological Sciences, Biological Computation and Visualization Center, Center for Bio-Modular Multiscale Systems, Louisiana State University, Baton Rouge, LA, 70803, USA
|
24
|
Ray DA, Walker JA, Hall A, Llewellyn B, Ballantyne J, Christian AT, Turteltaub K, Batzer MA. Inference of human geographic origins using Alu insertion polymorphisms. Forensic Sci Int 2005; 153:117-24. [PMID: 16139099 DOI: 10.1016/j.forsciint.2004.10.017] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.6] [Received: 05/18/2004] [Revised: 10/26/2004] [Accepted: 10/28/2004] [Indexed: 01/29/2023]
Abstract
The inference of an individual's geographic ancestry or origin can be critical in narrowing the field of potential suspects in a criminal investigation. Most current technologies rely on single nucleotide polymorphism (SNP) genotypes to accomplish this task. However, SNPs can introduce homoplasy into an analysis since they can be identical-by-state. We introduce the use of insertion polymorphisms based on short interspersed elements (SINEs) as a potential alternative to SNPs. SINE polymorphisms are identical-by-descent, essentially homoplasy-free, and inexpensive to genotype using a variety of approaches. Herein, we present results of a blind study using 100 Alu insertion polymorphisms to infer the geographic ancestry of 18 unknown individuals from a variety of geographic locations. Using a Structure analysis of the Alu insertion polymorphism-based genotypes, we were able to correctly infer the geographic affiliation of all 18 unknown human individuals with high levels of confidence. This technique to infer the geographic affiliation of unknown human DNA samples will be a useful tool in forensic genomics.
Affiliation(s)
- David A Ray
- Department of Biological Sciences, Biological Computation and Visualization Center, Louisiana State University, 202 Life Sciences Building, Baton Rouge, LA 70803, USA
|
25
|
Ho HJ, Ray DA, Salem AH, Myers JS, Batzer MA. Straightening out the LINEs: LINE-1 orthologous loci. Genomics 2005; 85:201-7. [PMID: 15676278 DOI: 10.1016/j.ygeno.2004.10.016] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 08/23/2004] [Accepted: 10/29/2004] [Indexed: 11/19/2022]
Abstract
The L1Hs preTa subfamily of long interspersed elements (LINEs) originated after the divergence of human and chimpanzee and is therefore found only in the human genome. Thirty-three of the 254 L1Hs preTa elements are polymorphic for the absence/presence of the insertion, making them useful markers for studying human population genetics. The problem of homoplasy, however, can diminish the value of LINEs as phylogenetic and population genetic markers. We examined anomalous orthologous sites in a range of nonhuman primates. Only two cases of other mobile elements inserting near the preintegration sites of L1Hs preTa elements were observed: an AluY insertion in Chlorocebus and an L1PA8 insertion in Aotus. Sequence analysis showed that both elements were clearly distinguishable from their human counterparts. We conclude that L1 elements can continue to be regarded as essentially homoplasy-free genetic characters.
Affiliation(s)
- Huei Jin Ho
- Department of Biological Sciences, Biological Computation and Visualization Center, Louisiana State University, 202 Life Sciences Building, Baton Rouge, LA 70803, USA
|
26
|
Han K, Sen SK, Wang J, Callinan PA, Lee J, Cordaux R, Liang P, Batzer MA. Genomic rearrangements by LINE-1 insertion-mediated deletion in the human and chimpanzee lineages. Nucleic Acids Res 2005; 33:4040-52. [PMID: 16034026 PMCID: PMC1179734 DOI: 10.1093/nar/gki718] [Citation(s) in RCA: 106] [Impact Index Per Article: 5.6] [Indexed: 11/14/2022] Open
Abstract
Long INterspersed Elements (LINE-1s or L1s) are abundant non-LTR retrotransposons in mammalian genomes that are capable of insertional mutagenesis. They have been associated with target site deletions upon insertion in cell culture studies of retrotransposition. Here, we report 50 deletion events in the human and chimpanzee genomes directly linked to the insertion of L1 elements, resulting in the loss of approximately 18 kb of sequence from the human genome and approximately 15 kb from the chimpanzee genome. Our data suggest that during the primate radiation, L1 insertions may have deleted up to 7.5 Mb of target genomic sequences. While the results of our in vivo analysis differ from those of previous cell culture assays of L1 insertion-mediated deletions in terms of the size and rate of sequence deletion, evolutionary factors can reconcile the differences. We report a pattern of genomic deletion sizes similar to those created during the retrotransposition of Alu elements. Our study provides support for the existence of different mechanisms for small and large L1-mediated deletions, and we present a model for the correlation of L1 element size and the corresponding deletion size. In addition, we show that internal rearrangements can modify L1 structure during retrotransposition events associated with large deletions.
Affiliation(s)
- Jianxin Wang
- Department of Cancer Genetics, Roswell Park Cancer Institute, Elm and Carlton Streets, Buffalo, NY 14263, USA
- Ping Liang
- Department of Cancer Genetics, Roswell Park Cancer Institute, Elm and Carlton Streets, Buffalo, NY 14263, USA
- Mark A. Batzer
- To whom correspondence should be addressed. Tel: +1 225 578 7102; Fax: +1 225 578 7113;
|
27
|
Chen JM, Stenson PD, Cooper DN, Férec C. A systematic analysis of LINE-1 endonuclease-dependent retrotranspositional events causing human genetic disease. Hum Genet 2005; 117:411-27. [PMID: 15983781 DOI: 10.1007/s00439-005-1321-0] [Citation(s) in RCA: 155] [Impact Index Per Article: 8.2] [Received: 02/15/2005] [Accepted: 04/04/2005] [Indexed: 10/25/2022]
Abstract
Diverse long interspersed element-1 (LINE-1 or L1)-dependent mutational mechanisms have been extensively studied with respect to L1 and Alu elements engineered for retrotransposition in cultured cells and/or in genome-wide analyses. To what extent the in vitro studies can be held to accurately reflect in vivo events in the human genome, however, remains to be clarified. We have attempted to address this question by means of a systematic analysis of recent L1-mediated retrotranspositional events that have caused human genetic disease, with a view to providing a more complete picture of how L1-mediated retrotransposition impacts upon the architecture of the human genome. A total of 48 such mutations were identified, including those described as L1-mediated retrotransposons, as well as insertions reported to contain a poly(A) tail: 26 were L1 trans-driven Alu insertions, 15 were direct L1 insertions, four were L1 trans-driven SVA insertions, and three were associated with simple poly(A) insertions. The systematic study of these lesions, when combined with previous in vitro and genome-wide analyses, has strengthened several important conclusions regarding L1-mediated retrotransposition in humans: (a) approximately 25% of L1 insertions are associated with the 3' transduction of adjacent genomic sequences, (b) approximately 25% of the new L1 inserts are full-length, (c) poly(A) tail length correlates inversely with the age of the element, and (d) the length of target site duplication in vivo is rarely longer than 20 bp. Our analysis also suggests that some 10% of L1-mediated retrotranspositional events are associated with significant genomic deletions in humans. Finally, the identification of independent retrotranspositional events that have integrated at the same genomic locations provides new insight into the L1-mediated insertional process in humans.
Affiliation(s)
- Jian-Min Chen
- INSERM U613-Génétique Moléculaire et Génétique Epidémiologique, Etablissement Français du Sang-Bretagne, Université de Bretagne Occidentale, Centre Hospitalier Universitaire, Brest, 29220, France.
|
28
|
Abstract
Background: Alu elements are short (~300 bp) interspersed elements that amplify in primate genomes through a process termed retroposition. The expansion of these elements has had a significant impact on the structure and function of primate genomes. Approximately 10% of the mass of the human genome is comprised of Alu elements, making them the most abundant short interspersed element (SINE) in our genome. The majority of Alu amplification occurred early in primate evolution, and the current rate of Alu retroposition is at least 100-fold slower than the peak of amplification that occurred 30–50 million years ago. Alu elements are therefore a rich source of inter- and intra-species primate genomic variation. Results: A total of 153 Alu elements from the Ye subfamily were extracted from the draft sequence of the human genome. Analysis of these elements resulted in the discovery of two new Alu subfamilies, Ye4 and Ye6, complementing the previously described Ye5 subfamily. DNA sequence analysis of each of the Alu Ye subfamilies yielded average age estimates of ~14, ~13 and ~9.5 million years old for the Alu Ye4, Ye5 and Ye6 subfamilies, respectively. In addition, 120 Alu Ye4, Ye5 and Ye6 loci were screened using polymerase chain reaction (PCR) assays to determine their phylogenetic origin and levels of human genomic diversity. Conclusion: The Alu Ye lineage appears to have started amplifying relatively early in primate evolution and continued propagating at a low level, as many of its members are found in a variety of hominoid (humans, greater and lesser ape) genomes. Detailed sequence analysis of several Alu pre-integration sites indicated that multiple types of events had occurred, including gene conversions, near-parallel independent insertions of different Alu elements and Alu-mediated genomic deletions. A potential hotspot for Alu insertion in the Fer1L3 gene on chromosome 10 was also identified.
|