1
Ahmad SF, Singchat W, Jehangir M, Suntronpong A, Panthum T, Malaivijitnond S, Srikulnath K. Dark Matter of Primate Genomes: Satellite DNA Repeats and Their Evolutionary Dynamics. Cells 2020; 9:E2714. [PMID: 33352976 PMCID: PMC7767330 DOI: 10.3390/cells9122714] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 10/27/2020] [Revised: 12/15/2020] [Accepted: 12/16/2020] [Indexed: 12/12/2022] [Open Access]
Abstract
A substantial portion of the primate genome is composed of non-coding regions, so-called "dark matter", which includes an abundance of tandemly repeated sequences called satellite DNA. Collectively known as the satellitome, this genomic component offers exciting evolutionary insights into aspects of primate genome biology that raise new questions and challenge existing paradigms. A complete human reference genome was recently reported with telomere-to-telomere human X chromosome assembly that resolved hundreds of dark regions, encompassing a 3.1 Mb centromeric satellite array that had not been identified previously. With the recent exponential increase in the availability of primate genomes, and the development of modern genomic and bioinformatics tools, extensive growth in our knowledge concerning the structure, function, and evolution of satellite elements is expected. The current state of knowledge on this topic is summarized, highlighting various types of primate-specific satellite repeats to compare their proportions across diverse lineages. Inter- and intraspecific variation of satellite repeats in the primate genome are reviewed. The functional significance of these sequences is discussed by describing how the transcriptional activity of satellite repeats can affect gene expression during different cellular processes. Sex-linked satellites are outlined, together with their respective genomic organization. Mechanisms are proposed whereby satellite repeats might have emerged as novel sequences during different evolutionary phases. Finally, the main challenges that hinder the detection of satellite DNA are outlined and an overview of the latest methodologies to address technological limitations is presented.
Affiliation(s)
- Syed Farhan Ahmad
  - Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
  - Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, Bangkok 10900, Thailand
- Worapong Singchat
  - Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
  - Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, Bangkok 10900, Thailand
- Maryam Jehangir
  - Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
  - Department of Structural and Functional Biology, Institute of Bioscience at Botucatu, São Paulo State University (UNESP), Botucatu, São Paulo 18618-689, Brazil
- Aorarat Suntronpong
  - Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
  - Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, Bangkok 10900, Thailand
- Thitipong Panthum
  - Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
  - Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, Bangkok 10900, Thailand
- Suchinda Malaivijitnond
  - National Primate Research Center of Thailand, Chulalongkorn University, Saraburi 18110, Thailand
  - Department of Biology, Faculty of Science, Chulalongkorn University, Bangkok 10330, Thailand
- Kornsorn Srikulnath
  - Laboratory of Animal Cytogenetics and Comparative Genomics (ACCG), Department of Genetics, Faculty of Science, Kasetsart University, Bangkok 10900, Thailand
  - Special Research Unit for Wildlife Genomics (SRUWG), Department of Forest Biology, Faculty of Forestry, Kasetsart University, Bangkok 10900, Thailand
  - National Primate Research Center of Thailand, Chulalongkorn University, Saraburi 18110, Thailand
  - Center of Excellence on Agricultural Biotechnology (AG-BIO/PERDO-CHE), Bangkok 10900, Thailand
  - Omics Center for Agriculture, Bioresources, Food and Health, Kasetsart University (OmiKU), Bangkok 10900, Thailand

2
Kulski JK, Mawart A, Marie K, Tay GK, AlSafar HS. MHC class I polymorphic Alu insertion (POALIN) allele and haplotype frequencies in the Arabs of the United Arab Emirates and other world populations. Int J Immunogenet 2019; 46:247-262. [PMID: 31021060 DOI: 10.1111/iji.12426] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 05/29/2018] [Revised: 02/17/2019] [Accepted: 03/12/2019] [Indexed: 01/02/2023]
Abstract
Polymorphic Alu insertions (POALINs) are found throughout the human genome and have been used in various studies to infer geographic origin of human populations. The main aim of this study was to determine the allele and haplotype frequencies of five POALINs, AluHF, AluHG, AluHJ, AluTF and AluMICB, within the major histocompatibility complex (MHC) class I region of 95 UAE Arabs, and correlate their frequencies to those of the HLA-A, HLA-C and HLA-B class I allele lineages. Evolutionary relationships between the POALINs of the Arabs and those previously studied in populations of African, Asian and European descent were compared. At each of the five Alu loci (AluHF, AluHG, AluHJ, AluTF and AluMICB), Alu insertion was designated as Alu(locus)*02 and absence was Alu(locus)*01. The AluHG insertion (AluHG*02) had the highest frequency (0.332), followed by AluHF*02 (0.300), AluHJ*02 (0.263), AluMICB*02 (0.111) and AluTF*02 (0.058). Of the 270 Alu-HLA haplotypes pairs in the UAE Arabs, 110 had no Alu insertion, and 54 had an Alu insertion at >50% per haplotype. An Alu insertion >75% per haplotype was found between AluMICB*02 and HLA-B*14, HLA-B*22, HLA-B*44, HLA-B*55, HLA-B*57 and HLA-B*73, and with HLA-C*01 and HLA-C*18; AluHJ*02 with HLA-A*01, HLA-A*19, HLA-A*24 and HLA-A*32; AluHG*02 with HLA-A*02 and HLA-B*18; and AluHF*02 with HLA-A*10. The genotyped allele and haplotype frequencies of the MHC POALINs in UAE Arabs were compared with the results of 30 previously published Asian, European, American and African populations. Phylogenetic and multidimensional scaling (MDS) analysis of the relative MHC POALINs allele and haplotype frequencies revealed that the UAE Arabs have a similar lineage to Caucasians and the most distant genetic relationship to the Waorani native American population of Ecuador. The structure of both the phylogenetic tree and the MDS analysis supports the Out of Africa theory of human evolution. 
The nature of the clusters suggests the Arabian Middle East represents a crossroads from which human populations migrated towards Asia in the east and Europe to the north-west.
Affiliation(s)
- Jerzy K Kulski
  - Faculty of Health and Medical Sciences, UWA Medical School, The University of Western Australia, Crawley, Western Australia, Australia
- Aurelie Mawart
  - Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Kirsten Marie
  - Faculty of Health and Medical Sciences, UWA Medical School, The University of Western Australia, Crawley, Western Australia, Australia
- Guan K Tay
  - Faculty of Health and Medical Sciences, UWA Medical School, The University of Western Australia, Crawley, Western Australia, Australia
  - Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
  - Department of Biomedical Engineering, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
- Habiba S AlSafar
  - Center for Biotechnology, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates
  - Department of Biomedical Engineering, Khalifa University of Science and Technology, Abu Dhabi, United Arab Emirates

3
Saint-Leandre B, Clavereau I, Hua-Van A, Capy P. Transcriptional polymorphism of piRNA regulatory genes underlies the mariner activity in Drosophila simulans testes. Mol Ecol 2017; 26:3715-3731. [DOI: 10.1111/mec.14145] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 10/12/2015] [Accepted: 12/28/2016] [Indexed: 02/03/2023]
Affiliation(s)
- Bastien Saint-Leandre
  - Laboratoire Evolution, Génomes, Comportement, Ecologie CNRS; Univ. Paris-Sud, IRD; Université Paris-Saclay; Gif-sur-Yvette Cedex France
- Isabelle Clavereau
  - Laboratoire Evolution, Génomes, Comportement, Ecologie CNRS; Univ. Paris-Sud, IRD; Université Paris-Saclay; Gif-sur-Yvette Cedex France
- Aurelie Hua-Van
  - Laboratoire Evolution, Génomes, Comportement, Ecologie CNRS; Univ. Paris-Sud, IRD; Université Paris-Saclay; Gif-sur-Yvette Cedex France
- Pierre Capy
  - Laboratoire Evolution, Génomes, Comportement, Ecologie CNRS; Univ. Paris-Sud, IRD; Université Paris-Saclay; Gif-sur-Yvette Cedex France

4
Campos-Sánchez R, Cremona MA, Pini A, Chiaromonte F, Makova KD. Integration and Fixation Preferences of Human and Mouse Endogenous Retroviruses Uncovered with Functional Data Analysis. PLoS Comput Biol 2016; 12:e1004956. [PMID: 27309962 PMCID: PMC4911145 DOI: 10.1371/journal.pcbi.1004956] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 01/15/2016] [Accepted: 04/29/2016] [Indexed: 01/24/2023] [Open Access]
Abstract
Endogenous retroviruses (ERVs), the remnants of retroviral infections in the germ line, occupy ~8% and ~10% of the human and mouse genomes, respectively, and affect their structure, evolution, and function. Yet we still have a limited understanding of how the genomic landscape influences integration and fixation of ERVs. Here we conducted a genome-wide study of the most recently active ERVs in the human and mouse genome. We investigated 826 fixed and 1,065 in vitro HERV-Ks in human, and 1,624 fixed and 242 polymorphic ETns, as well as 3,964 fixed and 1,986 polymorphic IAPs, in mouse. We quantitated >40 human and mouse genomic features (e.g., non-B DNA structure, recombination rates, and histone modifications) in ±32 kb of these ERVs' integration sites and in control regions, and analyzed them using Functional Data Analysis (FDA) methodology. In one of the first applications of FDA in genomics, we identified genomic scales and locations at which these features display their influence, and how they work in concert, to provide signals essential for integration and fixation of ERVs. The investigation of ERVs of different evolutionary ages (young in vitro and polymorphic ERVs, older fixed ERVs) allowed us to disentangle integration vs. fixation preferences. As a result of these analyses, we built a comprehensive model explaining the uneven distribution of ERVs along the genome. We found that ERVs integrate in late-replicating AT-rich regions with abundant microsatellites, mirror repeats, and repressive histone marks. Regions favoring fixation are depleted of genes and evolutionarily conserved elements, and have low recombination rates, reflecting the effects of purifying selection and ectopic recombination removing ERVs from the genome. In addition to providing these biological insights, our study demonstrates the power of exploiting multiple scales and localization with FDA. These powerful techniques are expected to be applicable to many other genomic investigations.
Affiliation(s)
- Rebeca Campos-Sánchez
  - Genetics Graduate Program, The Huck Institutes of the Life Sciences, Penn State University, University Park, Pennsylvania, United States of America
- Marzia A. Cremona
  - MOX - Modeling and Scientific Computing, Department of Mathematics, Politecnico di Milano, Milano, Italy
  - Department of Statistics, Penn State University, University Park, Pennsylvania, United States of America
- Alessia Pini
  - MOX - Modeling and Scientific Computing, Department of Mathematics, Politecnico di Milano, Milano, Italy
- Francesca Chiaromonte
  - Department of Statistics, Penn State University, University Park, Pennsylvania, United States of America
  - Center for Medical Genomics, The Huck Institutes of the Life Sciences, Penn State University, University Park, Pennsylvania, United States of America
- Kateryna D. Makova
  - Center for Medical Genomics, The Huck Institutes of the Life Sciences, Penn State University, University Park, Pennsylvania, United States of America
  - Department of Biology, Penn State University, University Park, Pennsylvania, United States of America

5
Konkel MK, Walker JA, Hotard AB, Ranck MC, Fontenot CC, Storer J, Stewart C, Marth GT, Batzer MA. Sequence Analysis and Characterization of Active Human Alu Subfamilies Based on the 1000 Genomes Pilot Project. Genome Biol Evol 2015; 7:2608-22. [PMID: 26319576 PMCID: PMC4607524 DOI: 10.1093/gbe/evv167] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.3] [Accepted: 08/23/2015] [Indexed: 12/17/2022] [Open Access]
Abstract
The goal of the 1000 Genomes Consortium is to characterize human genome structural variation (SV), including forms of copy number variations such as deletions, duplications, and insertions. Mobile element insertions, particularly Alu elements, are major contributors to genomic SV among humans. During the pilot phase of the project we experimentally validated 645 (611 intergenic and 34 exon targeted) polymorphic "young" Alu insertion events, absent from the human reference genome. Here, we report high resolution sequencing of 343 (322 unique) recent Alu insertion events, along with their respective target site duplications, precise genomic breakpoint coordinates, subfamily assignment, percent divergence, and estimated A-rich tail lengths. All the sequenced Alu loci were derived from the AluY lineage with no evidence of retrotransposition activity involving older Alu families (e.g., AluJ and AluS). AluYa5 is currently the most active Alu subfamily in the human lineage, followed by AluYb8, and many others including three newly identified subfamilies we have termed AluYb7a3, AluYb8b1, and AluYa4a1. This report provides the structural details of 322 unique Alu variants from individual human genomes collectively adding about 100 kb of genomic variation. Many Alu subfamilies are currently active in human populations, including a surprising level of AluY retrotransposition. Human Alu subfamilies exhibit continuous evolution with potential drivers sprouting new Alu lineages.
Affiliation(s)
- Miriam K Konkel
  - Department of Biological Sciences, Louisiana State University
- Ashley B Hotard
  - Department of Biological Sciences, Louisiana State University
- Megan C Ranck
  - Department of Biological Sciences, Louisiana State University
- Jessica Storer
  - Department of Biological Sciences, Louisiana State University
  - Department of Molecular, Cellular and Developmental Biology, The Ohio State University
- Chip Stewart
  - Department of Biology, Boston College
  - Cancer Genome Computational Analysis, Cambridge, MA
- Gabor T Marth
  - Department of Biology, Boston College
  - Eccles Institute of Human Genetics, University of Utah
- Mark A Batzer
  - Department of Biological Sciences, Louisiana State University

6
Dios F, Barturen G, Lebrón R, Rueda A, Hackenberg M, Oliver JL. DNA clustering and genome complexity. Comput Biol Chem 2014; 53 Pt A:71-8. [PMID: 25182383 DOI: 10.1016/j.compbiolchem.2014.08.011] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Accepted: 07/11/2014] [Indexed: 01/08/2023]
Abstract
Early global measures of genome complexity (power spectra, the analysis of fluctuations in DNA walks or compositional segmentation) uncovered a high degree of complexity in eukaryotic genome sequences. The main evolutionary mechanisms leading to increases in genome complexity (i.e. gene duplication and transposon proliferation) can all potentially produce increases in DNA clustering. To quantify such clustering and provide a genome-wide description of the formed clusters, we developed GenomeCluster, an algorithm able to detect clusters of whatever genome element identified by chromosome coordinates. We obtained a detailed description of clusters for ten categories of human genome elements, including functional (genes, exons, introns), regulatory (CpG islands, TFBSs, enhancers), variant (SNPs) and repeat (Alus, LINE1) elements, as well as DNase hypersensitivity sites. For each category, we located their clusters in the human genome, then quantifying cluster length and composition, and estimated the clustering level as the proportion of clustered genome elements. In average, we found a 27% of elements in clusters, although a considerable variation occurs among different categories. Genes form the lowest number of clusters, but these are the longest ones, both in bp and the average number of components, while the shortest clusters are formed by SNPs. Functional and regulatory elements (genes, CpG islands, TFBSs, enhancers) show the highest clustering level, as compared to DNase sites, repeats (Alus, LINE1) or SNPs. Many of the genome elements we analyzed are known to be composed of clusters of low-level entities. In addition, we found here that the clusters generated by GenomeCluster can be in turn clustered into high-level super-clusters. 
The observation of 'clusters-within-clusters' parallels the 'domains within domains' phenomenon previously detected through global statistical methods in eukaryotic sequences, and reveals a complex human genome landscape dominated by hierarchical clustering.
Affiliation(s)
- Francisco Dios
  - Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, 18071 Granada, Spain
  - Lab. de Bioinformática, Inst. de Biotecnología, Centro de Investigación Biomédica, 18100 Granada, Spain
- Guillermo Barturen
  - Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, 18071 Granada, Spain
  - Lab. de Bioinformática, Inst. de Biotecnología, Centro de Investigación Biomédica, 18100 Granada, Spain
- Ricardo Lebrón
  - Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, 18071 Granada, Spain
  - Lab. de Bioinformática, Inst. de Biotecnología, Centro de Investigación Biomédica, 18100 Granada, Spain
- Antonio Rueda
  - Plataforma Andaluza de Genómica y Bioinformática (GBPA), Edificio INSUR, Calle Albert Einstein, 41092 Sevilla, Spain
- Michael Hackenberg
  - Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, 18071 Granada, Spain
  - Lab. de Bioinformática, Inst. de Biotecnología, Centro de Investigación Biomédica, 18100 Granada, Spain
- José L Oliver
  - Dpto. de Genética, Facultad de Ciencias, Universidad de Granada, 18071 Granada, Spain
  - Lab. de Bioinformática, Inst. de Biotecnología, Centro de Investigación Biomédica, 18100 Granada, Spain

7
Campos-Sánchez R, Kapusta A, Feschotte C, Chiaromonte F, Makova KD. Genomic landscape of human, bat, and ex vivo DNA transposon integrations. Mol Biol Evol 2014; 31:1816-32. [PMID: 24809961 DOI: 10.1093/molbev/msu138] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 02/07/2023] [Open Access]
Abstract
The integration and fixation preferences of DNA transposons, one of the major classes of eukaryotic transposable elements, have never been evaluated comprehensively on a genome-wide scale. Here, we present a detailed study of the distribution of DNA transposons in the human and bat genomes. We studied three groups of DNA transposons that integrated at different evolutionary times: 1) ancient (>40 My) and currently inactive human elements, 2) younger (<40 My) bat elements, and 3) ex vivo integrations of piggyBat and Sleeping Beauty elements in HeLa cells. Although the distribution of ex vivo elements reflected integration preferences, the distribution of human and (to a lesser extent) bat elements was also affected by selection. We used regression techniques (linear, negative binomial, and logistic regression models with multiple predictors) applied to 20-kb and 1-Mb windows to investigate how the genomic landscape in the vicinity of DNA transposons contributes to their integration and fixation. Our models indicate that genomic landscape explains 16-79% of variability in DNA transposon genome-wide distribution. Importantly, we not only confirmed previously identified predictors (e.g., DNA conformation and recombination hotspots) but also identified several novel predictors (e.g., signatures of double-strand breaks and telomere hexamer). Ex vivo integrations showed a bias toward actively transcribed regions. Older DNA transposons were located in genomic regions scarce in most conserved elements, likely reflecting purifying selection. Our study highlights how DNA transposons are integral to the evolution of bat and human genomes, and has implications for the development of DNA transposon assays for gene therapy and mutagenesis applications.
Affiliation(s)
- Rebeca Campos-Sánchez
  - Genetics Program, The Huck Institutes of the Life Sciences, Penn State University, University Park, PA
- Aurélie Kapusta
  - Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT
- Cédric Feschotte
  - Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT
- Francesca Chiaromonte
  - Center for Medical Genomics, The Huck Institutes of the Life Sciences, Penn State University, University Park, PA
  - Department of Statistics, Penn State University, University Park, PA
- Kateryna D Makova
  - Center for Medical Genomics, The Huck Institutes of the Life Sciences, Penn State University, University Park, PA
  - Department of Biology, Penn State University, University Park, PA

8
Villarreal LP, Witzany G. The DNA Habitat and its RNA Inhabitants: At the Dawn of RNA Sociology. Genomics Insights 2013; 6:1-12. [PMID: 26217106 PMCID: PMC4510605 DOI: 10.4137/gei.s11490] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.8] [Indexed: 02/06/2023]
Abstract
Most molecular biological concepts derive from physical chemical assumptions about the genetic code that are basically more than 40 years old. Additionally, systems biology, another quantitative approach, investigates the sum of interrelations to obtain a more holistic picture of nucleotide sequence order. Recent empirical data on genetic code compositions and rearrangements by mobile genetic elements and noncoding RNAs, together with results of virus research and their role in evolution, does not really fit into these concepts and compel a reexamination. In this review, we try to find an alternate hypothesis. It seems plausible now that if we look at the abundance of regulatory RNAs and persistent viruses in host genomes, we will find more and more evidence that the key players that edit the genetic codes of host genomes are consortia of RNA agents and viruses that drive evolutionary novelty and regulation of cellular processes in all steps of development. This agent-based approach may lead to a qualitative RNA sociology that investigates and identifies relevant behavioral motifs of cooperative RNA consortia. In addition to molecular biological perspectives, this may lead to a better understanding of genetic code evolution and dynamics.
Affiliation(s)
- Luis P Villarreal
  - Department of Molecular Biology and Biochemistry, University of California, Irvine, CA, USA

9
Ward M, Wilson M, Barbosa-Morais N, Schmidt D, Stark R, Pan Q, Schwalie P, Menon S, Lukk M, Watt S, Thybert D, Kutter C, Kirschner K, Flicek P, Blencowe B, Odom D. Latent regulatory potential of human-specific repetitive elements. Mol Cell 2013; 49:262-72. [PMID: 23246434 PMCID: PMC3560060 DOI: 10.1016/j.molcel.2012.11.013] [Citation(s) in RCA: 52] [Impact Index Per Article: 4.3] [Received: 05/30/2012] [Revised: 09/28/2012] [Accepted: 11/09/2012] [Indexed: 12/26/2022]
Abstract
At least half of the human genome is derived from repetitive elements, which are often lineage specific and silenced by a variety of genetic and epigenetic mechanisms. Using a transchromosomic mouse strain that transmits an almost complete single copy of human chromosome 21 via the female germline, we show that a heterologous regulatory environment can transcriptionally activate transposon-derived human regulatory regions. In the mouse nucleus, hundreds of locations on human chromosome 21 newly associate with activating histone modifications in both somatic and germline tissues, and influence the gene expression of nearby transcripts. These regions are enriched with primate and human lineage-specific transposable elements, and their activation corresponds to changes in DNA methylation at CpG dinucleotides. This study reveals the latent regulatory potential of the repetitive human genome and illustrates the species specificity of mechanisms that control it.
Affiliation(s)
- Michelle C. Ward
  - University of Cambridge, Cancer Research UK-Cambridge Institute, Robinson Way, Cambridge CB2 0RE, UK
- Michael D. Wilson
  - University of Cambridge, Cancer Research UK-Cambridge Institute, Robinson Way, Cambridge CB2 0RE, UK
- Nuno L. Barbosa-Morais
  - Banting and Best Department of Medical Research and Department of Molecular Genetics, Donnelly Centre, Toronto, ON M5S 3E1, Canada
- Dominic Schmidt
  - University of Cambridge, Cancer Research UK-Cambridge Institute, Robinson Way, Cambridge CB2 0RE, UK
- Rory Stark
  - University of Cambridge, Cancer Research UK-Cambridge Institute, Robinson Way, Cambridge CB2 0RE, UK
- Qun Pan
  - Banting and Best Department of Medical Research and Department of Molecular Genetics, Donnelly Centre, Toronto, ON M5S 3E1, Canada
- Petra C. Schwalie
  - European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
- Suraj Menon
  - University of Cambridge, Cancer Research UK-Cambridge Institute, Robinson Way, Cambridge CB2 0RE, UK
- Margus Lukk
  - University of Cambridge, Cancer Research UK-Cambridge Institute, Robinson Way, Cambridge CB2 0RE, UK
- Stephen Watt
  - University of Cambridge, Cancer Research UK-Cambridge Institute, Robinson Way, Cambridge CB2 0RE, UK
- David Thybert
  - European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
- Claudia Kutter
  - University of Cambridge, Cancer Research UK-Cambridge Institute, Robinson Way, Cambridge CB2 0RE, UK
- Kristina Kirschner
  - University of Cambridge, Cancer Research UK-Cambridge Institute, Robinson Way, Cambridge CB2 0RE, UK
- Paul Flicek
  - European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton CB10 1SD, UK
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK
- Benjamin J. Blencowe
  - Banting and Best Department of Medical Research and Department of Molecular Genetics, Donnelly Centre, Toronto, ON M5S 3E1, Canada
- Duncan T. Odom
  - University of Cambridge, Cancer Research UK-Cambridge Institute, Robinson Way, Cambridge CB2 0RE, UK
  - Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton CB10 1SA, UK

10
Nellåker C, Keane TM, Yalcin B, Wong K, Agam A, Belgard TG, Flint J, Adams DJ, Frankel WN, Ponting CP. The genomic landscape shaped by selection on transposable elements across 18 mouse strains. Genome Biol 2012; 13:R45. [PMID: 22703977 PMCID: PMC3446317 DOI: 10.1186/gb-2012-13-6-r45] [Citation(s) in RCA: 127] [Impact Index Per Article: 9.8] [Received: 04/04/2012] [Revised: 05/25/2012] [Accepted: 06/15/2012] [Indexed: 12/20/2022] [Open Access]
Abstract
Background: Transposable element (TE)-derived sequence dominates the landscape of mammalian genomes and can modulate gene function by dysregulating transcription and translation. Our current knowledge of TEs in laboratory mouse strains is limited primarily to those present in the C57BL/6J reference genome, with most mouse TEs being drawn from three distinct classes, namely short interspersed nuclear elements (SINEs), long interspersed nuclear elements (LINEs) and the endogenous retrovirus (ERV) superfamily. Despite their high prevalence, the different genomic and gene properties controlling whether TEs are preferentially purged from, or are retained by, genetic drift or positive selection in mammalian genomes remain poorly defined.
Results: Using whole genome sequencing data from 13 classical laboratory and 4 wild-derived mouse inbred strains, we developed a comprehensive catalogue of 103,798 polymorphic TE variants. We employ this extensive data set to characterize TE variants across the Mus lineage, and to infer neutral and selective processes that have acted over 2 million years. Our results indicate that the majority of TE variants are introduced through the male germline and that only a minority of TE variants exert detectable changes in gene expression. However, among genes with differential expression across the strains there are twice as many TE variants identified as being putative causal variants as expected.
Conclusions: Most TE variants that cause gene expression changes appear to be purged rapidly by purifying selection. Our findings demonstrate that past TE insertions have often been highly deleterious, and help to prioritize TE variants according to their likely contribution to gene expression or phenotype variation.
Affiliation(s)
- Christoffer Nellåker
  - MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK

11
Orangutan Alu quiescence reveals possible source element: support for ancient backseat drivers. Mob DNA 2012; 3:8. [PMID: 22541534 PMCID: PMC3357318 DOI: 10.1186/1759-8753-3-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 12/06/2011] [Accepted: 04/30/2012] [Indexed: 01/25/2023] [Open Access]
Abstract
Background Sequence analysis of the orangutan genome revealed that recent proliferative activity of Alu elements has been uncharacteristically quiescent in the Pongo (orangutan) lineage, compared with all previously studied primate genomes. With relatively few young polymorphic insertions, the genomic landscape of the orangutan seemed like the ideal place to search for a driver, or source element, of Alu retrotransposition. Results Here we report the identification of a nearly pristine insertion possessing all the known putative hallmarks of a retrotranspositionally competent Alu element. It is located in an intronic sequence of the DGKB gene on chromosome 7 and is highly conserved in Hominidae (the great apes), but absent from Hylobatidae (gibbon and siamang). We provide evidence for the evolution of a lineage-specific subfamily of this shared Alu insertion in orangutans and possibly the lineage leading to humans. In the orangutan genome, this insertion contains three orangutan-specific diagnostic mutations which are characteristic of the youngest polymorphic Alu subfamily, AluYe5b5_Pongo. In the Homininae lineage (human, chimpanzee and gorilla), this insertion has acquired three different mutations which are also found in a single human-specific Alu insertion. Conclusions This seemingly stealth-like amplification, ongoing at a very low rate over millions of years of evolution, suggests that this shared insertion may represent an ancient backseat driver of Alu element expansion.
|
12
|
Klimopoulos A, Sellis D, Almirantis Y. Widespread occurrence of power-law distributions in inter-repeat distances shaped by genome dynamics. Gene 2012; 499:88-98. [PMID: 22370293 DOI: 10.1016/j.gene.2012.02.005] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 11/02/2011] [Revised: 02/05/2012] [Accepted: 02/06/2012] [Indexed: 11/25/2022]
Abstract
Repetitive DNA sequences derived from transposable elements (TE) are distributed in a non-random way, co-clustering with other classes of repeat elements, genes and other genomic components. In a previous work we reported power-law-like size distributions (linearity in log-log scale) in the spatial arrangement of Alu and LINE1 elements in the human genome. Here we investigate the large-scale features of the spatial arrangement of all principal classes of TEs in 14 genomes from phylogenetically distant organisms by studying the size distribution of inter-repeat distances. Power-law-like size distributions are found to be widespread, extending up to several orders of magnitude. In order to understand the emergence of this distributional pattern, we introduce an evolutionary scenario, which includes (i) Insertions of DNA segments (e.g., more recent repeats) into the considered sequence and (ii) Eliminations of members of the studied TE family. In the proposed model we also incorporate the potential for transposition events (characteristic of the DNA transposons' life-cycle) and segmental duplications. Simulations reproduce the main features of the observed size distributions. Furthermore, we investigate the effects of various genomic features on the presence and extent of power-law size distributions including TE class and age, mode of parental TE transmission, GC content, deletion and recombination rates in the studied genomic region, etc. Our observations corroborate the hypothesis that insertions of genomic material and eliminations of repeats are at the basis of power-laws in inter-repeat distances. The existence of these power-laws could facilitate the formation of the recently proposed "fractal globule" for the confined chromatin organization.
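As a rough illustration of the measurement this abstract describes, the sketch below computes inter-repeat distances from repeat coordinates and estimates the slope of their size distribution on a log-log scale. It is not the authors' pipeline: the function names, the half-open `(start, end)` coordinate convention, the power-of-two binning, and the plain least-squares fit are all assumptions made for the example.

```python
import math
from collections import Counter


def inter_repeat_distances(repeats):
    """Return gaps between successive repeats on one chromosome.

    `repeats` is a list of (start, end) pairs, assumed half-open, so the gap
    between two repeats is next_start - previous_end. Overlapping or nested
    repeats contribute no gap.
    """
    gaps = []
    furthest_end = None
    for start, end in sorted(repeats):
        if furthest_end is not None and start > furthest_end:
            gaps.append(start - furthest_end)
        furthest_end = end if furthest_end is None else max(furthest_end, end)
    return gaps


def loglog_slope(gaps):
    """Least-squares slope of log10(density) against log10(gap size).

    Gaps are grouped into bins [2**k, 2**(k+1)); each count is divided by the
    bin width so that wide bins are not over-weighted. A roughly constant
    negative slope over several bins is what "power-law-like" refers to.
    """
    counts = Counter(g.bit_length() - 1 for g in gaps)  # exact floor(log2 g)
    if len(counts) < 2:
        raise ValueError("need gaps in at least two size bins to fit a slope")
    xs, ys = [], []
    for k, n in sorted(counts.items()):
        width = 2 ** k
        xs.append(math.log10(1.5 * width))  # arithmetic midpoint of the bin
        ys.append(math.log10(n / width))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx
```

A straight-line fit on binned counts is only a quick diagnostic; it does not by itself establish that a distribution follows a power law.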
Affiliation(s)
- Alexandros Klimopoulos
- National Center for Scientific Research "Demokritos," Institute of Biology, 153 10 Athens, Greece.
|
13
|
Roy-Engel AM. LINEs, SINEs and other retroelements: do birds of a feather flock together? Front Biosci (Landmark Ed) 2012; 17:1345-61. [PMID: 22201808 DOI: 10.2741/3991] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Indexed: 02/03/2023]
Abstract
Mobile elements account for almost half of the mass of the human genome. Only the retroelements from the non-LTR (long terminal repeat) retrotransposon family, which include the LINE-1 (L1) and its non-autonomous partners, are currently active and contributing to new insertions. Although these elements seem to share the same basic amplification mechanism, the activity and success of the different types of retroelements vary. For example, Alu-induced mutagenesis is responsible for the majority of the documented instances of human disease induced by insertion of retroelements. Using copy number in mammals as an indicator, some SINEs have been vastly more successful than other retroelements, such as the retropseudogenes and even L1, likely due to differences in post-insertion selection and ability to overcome cellular controls. SINE and LINE integration can be differentially influenced by cellular factors, indicating some differences in their amplification mechanisms. We focus on the known aspects of this group of retroelements and highlight their similarities and differences that may significantly influence their biological impact.
Affiliation(s)
- Astrid M Roy-Engel
- Tulane University, Department of Epidemiology, School of Public Health and Tropical Medicine, Tulane Cancer Center, SL-66 1430 Tulane Ave., New Orleans, LA 70112.
|
14
|
Jurka J, Bao W, Kojima KK. Families of transposable elements, population structure and the origin of species. Biol Direct 2011; 6:44. [PMID: 21929767 PMCID: PMC3183009 DOI: 10.1186/1745-6150-6-44] [Citation(s) in RCA: 91] [Impact Index Per Article: 6.5] [Received: 06/15/2011] [Accepted: 09/19/2011] [Indexed: 11/23/2022] Open
Abstract
Background Eukaryotic genomes harbor diverse families of repetitive DNA derived from transposable elements (TEs) that are able to replicate and insert into genomic DNA. The biological role of TEs remains unclear, although they have profound mutagenic impact on eukaryotic genomes and the origin of repetitive families often correlates with speciation events. We present a new hypothesis to explain the observed correlations based on classical concepts of population genetics. Presentation of the hypothesis The main thesis presented in this paper is that the TE-derived repetitive families originate primarily by genetic drift in small populations derived mostly by subdivisions of large populations into subpopulations. We outline the potential impact of the emerging repetitive families on genetic diversification of different subpopulations, and discuss implications of such diversification for the origin of new species. Testing the hypothesis Several testable predictions of the hypothesis are examined. First, we focus on the prediction that the number of diverse families of TEs fixed in a representative genome of a particular species positively correlates with the cumulative number of subpopulations (demes) in the historical metapopulation from which the species has emerged. Furthermore, we present evidence indicating that human AluYa5 and AluYb8 families might have originated in separate proto-human subpopulations. We also revisit prior evidence linking the origin of repetitive families to mammalian phylogeny and present additional evidence linking repetitive families to speciation based on mammalian taxonomy. Finally, we discuss evidence that mammalian orders represented by the largest numbers of species may be subject to relatively recent population subdivisions and speciation events. 
Implications of the hypothesis The hypothesis implies that subdivision of a population into small subpopulations is the major step in the origin of new families of TEs as well as of new species. The origin of new subpopulations is likely to be driven by the availability of new biological niches, consistent with the hypothesis of punctuated equilibria. The hypothesis also has implications for the ongoing debate on the role of genetic drift in genome evolution. Reviewers This article was reviewed by Eugene Koonin, Juergen Brosius and I. King Jordan.
Affiliation(s)
- Jerzy Jurka
- Genetic Information Research Institute, 1925 Landings Drive, Mountain View, CA 94043, USA.
|
15
|
Brown WM. The Parental Antagonism Theory of Language Evolution: Preliminary Evidence for the Proposal. Hum Biol 2011; 83:213-45. [DOI: 10.3378/027.083.0205] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/05/2022]
|
16
|
Witherspoon DJ, Xing J, Zhang Y, Watkins WS, Batzer MA, Jorde LB. Mobile element scanning (ME-Scan) by targeted high-throughput sequencing. BMC Genomics 2010; 11:410. [PMID: 20591181 PMCID: PMC2996938 DOI: 10.1186/1471-2164-11-410] [Citation(s) in RCA: 75] [Impact Index Per Article: 5.0] [Received: 03/11/2010] [Accepted: 06/30/2010] [Indexed: 11/10/2022] Open
Abstract
Background Mobile elements (MEs) are diverse, common and dynamic inhabitants of nearly all genomes. ME transposition generates a steady stream of polymorphic genetic markers, deleterious and adaptive mutations, and substrates for further genomic rearrangements. Research on the impacts, population dynamics, and evolution of MEs is constrained by the difficulty of ascertaining rare polymorphic ME insertions that occur against a large background of pre-existing fixed elements and then genotyping them in many individuals. Results Here we present a novel method for identifying nearly all insertions of a ME subfamily in the whole genomes of multiple individuals and simultaneously genotyping (for presence or absence) those insertions that are variable in the population. We use ME-specific primers to construct DNA libraries that contain the junctions of all ME insertions of the subfamily, with their flanking genomic sequences, from many individuals. Individual-specific "index" sequences are designed into the oligonucleotide adapters used to construct the individual libraries. These libraries are then pooled and sequenced using a ME-specific sequencing primer. Mobile element insertion loci of the target subfamily are uniquely identified by their junction sequence, and all insertion junctions are linked to their individual libraries by the corresponding index sequence. To test this method's feasibility, we apply it to the human AluYb8 and AluYb9 subfamilies. In four individuals, we identified a total of 2,758 AluYb8 and AluYb9 insertions, including nearly all those that are present in the reference genome, as well as 487 that are not. Index counts show the sequenced products from each sample reflect the intended proportions to within 1%. At a sequencing depth of 355,000 paired reads per sample, the sensitivity and specificity of ME-Scan are both approximately 95%. 
Conclusions Mobile Element Scanning (ME-Scan) is an efficient method for quickly genotyping mobile element insertions with very high sensitivity and specificity. In light of recent improvements to high-throughput sequencing technology, it should be possible to employ ME-Scan to genotype insertions of almost any mobile element family in many individuals from any species.
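The bookkeeping described in this abstract, where each insertion is identified by its junction sequence and linked to an individual through an index sequence, can be sketched as follows. This is a simplified illustration under stated assumptions, not the ME-Scan software: reads are assumed to be already reduced to `(index_sequence, junction_locus)` pairs, and the function names and the index-to-sample mapping are hypothetical.

```python
from collections import defaultdict


def genotype_insertions(reads, index_to_sample):
    """Build a presence/absence table of insertion loci per individual.

    `reads` is an iterable of (index_sequence, junction_locus) pairs;
    `index_to_sample` maps each index sequence to an individual's name.
    Reads with an unrecognised index are discarded.
    """
    carriers = defaultdict(set)
    for index_seq, locus in reads:
        sample = index_to_sample.get(index_seq)
        if sample is None:
            continue
        carriers[locus].add(sample)
    samples = sorted(set(index_to_sample.values()))
    return {
        locus: {sample: sample in present for sample in samples}
        for locus, present in carriers.items()
    }


def polymorphic_loci(genotypes):
    """Loci present in some, but not all, of the genotyped individuals."""
    return sorted(
        locus for locus, calls in genotypes.items()
        if any(calls.values()) and not all(calls.values())
    )


def sensitivity_specificity(tp, fp, tn, fn):
    """Standard definitions: TP / (TP + FN) and TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)
```

A real analysis would also need read mapping, depth thresholds, and handling of index sequencing errors, none of which are modelled here.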
Affiliation(s)
- David J Witherspoon
- Dept. of Human Genetics, University of Utah Health Sciences Center, Salt Lake City, Utah 84112, USA.
|
17
|
Kvikstad EM, Makova KD. The (r)evolution of SINE versus LINE distributions in primate genomes: sex chromosomes are important. Genome Res 2010; 20:600-13. [PMID: 20219940 DOI: 10.1101/gr.099044.109] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Indexed: 11/25/2022]
Abstract
The densities of transposable elements (TEs) in the human genome display substantial variation both within individual chromosomes and among chromosome types (autosomes and the two sex chromosomes). Finding an explanation for this variability has been challenging, especially in light of genome landscapes unique to the sex chromosomes. Here, using a multiple regression framework, we investigate primate Alu and L1 densities shaped by regional genome features and location on a particular chromosome type. As a result of our analysis, first, we build statistical models explaining up to 79% and 44% of variation in Alu and L1 element density, respectively. Second, we analyze sex chromosome versus autosome TE densities corrected for regional genomic effects. We discover that sex-chromosome bias in Alu and L1 distributions not only persists after accounting for these effects, but even presents differences in patterns, confirming preferential Alu integration in the male germline, yet likely integration of L1s in both male and female germlines or in early embryogenesis. Additionally, our models reveal that local base composition (measured by GC content and density of L1 target sites) and natural selection (inferred via density of most conserved elements) are significant to predicting densities of L1s. Interestingly, measurements of local double-stranded breaks (a 13-mer associated with genome instability) strongly correlate with densities of Alu elements; little evidence was found for the role of recombination-driven deletion in driving TE distributions over evolutionary time. Thus, Alu and L1 densities have been influenced by the combination of distinct local genome landscapes and the unique evolutionary dynamics of sex chromosomes.
Affiliation(s)
- Erika M Kvikstad
- Center for Comparative Genomics and Bioinformatics, Penn State University, University Park, Pennsylvania 16802, USA.
|
18
|
Abstract
Advocates of chimpanzee research claim the genetic similarity of humans and chimpanzees makes them an indispensable research tool to combat human diseases. Given that cancer is a leading cause of human death worldwide, one might expect that if chimpanzees were needed for, or were productive in, cancer research, then they would have been widely used. This comprehensive literature analysis reveals that chimpanzees have scarcely been used in any form of cancer research, and that chimpanzee tumours are extremely rare and biologically different from human cancers. Often, chimpanzee citations described peripheral use of chimpanzee cells and genetic material in predominantly human genomic studies. Papers describing potential new cancer therapies noted significant concerns regarding the chimpanzee model. Other studies described interventions that have not been pursued clinically. Finally, available evidence indicates that chimpanzees are not essential in the development of therapeutic monoclonal antibodies. It would therefore be unscientific to claim that chimpanzees are vital to cancer research. On the contrary, it is reasonable to conclude that cancer research would not suffer if the use of chimpanzees for this purpose were prohibited in the US. Genetic differences between humans and chimpanzees make them an unsuitable model for cancer, as well as other human diseases.
Affiliation(s)
- Jarrod Bailey
- New England Anti-Vivisection Society, Boston, MA 02108-5100, USA.
|
19
|
Styles P, Brookfield JFY. Source gene composition and gene conversion of the AluYh and AluYi lineages of retrotransposons. BMC Evol Biol 2009; 9:102. [PMID: 19442302 PMCID: PMC2686708 DOI: 10.1186/1471-2148-9-102] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 11/04/2008] [Accepted: 05/14/2009] [Indexed: 11/20/2022] Open
Abstract
Background Alu elements are a family of SINE retrotransposons in primates. They are classified into subfamilies according to specific diagnostic mutations from the general Alu consensus. It is now believed that there may be several retrotranspositionally-competent source genes within an Alu subfamily. In this study, subfamilies falling on the AluYi and AluYh lineages, and the AluYg6 subfamily, are assessed for the presence of secondary source genes, and the influence of gene conversion on the AluYh and AluYi lineages is also described. Results The AluYh7 and AluYi6 subfamilies appear to contain multiple source genes. The novel subfamilies AluYh3a1 and AluYh3a3 are described, for which there is no convincing evidence to suggest the presence of secondary sources. The mutational substructure of AluYh3a3 can be explained completely by inference of a single master gene. A complete backwards gene conversion event appears to have inactivated the AluYh3a3 master gene in humans. Polymorphism data suggest a larger number of secondary source elements may be active in the AluYg6 family than previously thought. Conclusion It is clear that there is considerable variation in the number of source genes present in each of the young Alu subfamilies. This can range from a single master source gene, as for AluYh3a3, to as many as 14 source elements in AluYi6.
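The idea of assigning elements to subfamilies by diagnostic mutations can be illustrated with a minimal sketch. It assumes each element is already aligned to the consensus without gaps, and that each subfamily is described by a mapping from 0-based consensus position to the diagnostic base; the function name and the data layout are hypothetical and are not taken from the study.

```python
def matching_subfamilies(element, diagnostics):
    """List the subfamilies whose diagnostic bases all occur in `element`.

    `diagnostics` maps a subfamily name to {position: base}. Results are
    ordered from the most to the fewest diagnostic sites, so the most
    specific matching subfamily comes first.
    """
    hits = [
        name
        for name, sites in diagnostics.items()
        if all(
            pos < len(element) and element[pos] == base
            for pos, base in sites.items()
        )
    ]
    return sorted(hits, key=lambda name: len(diagnostics[name]), reverse=True)
```

Strict all-sites matching is a deliberate simplification: back mutation and gene conversion, both discussed in the abstract, can alter individual diagnostic positions, so a tolerant scoring scheme may be more appropriate in practice.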
Affiliation(s)
- Pamela Styles
- Institute of Genetics, School of Biology, University of Nottingham, Nottingham, UK.
|
20
|
Park ES, Huh JW, Kim TH, Kwak KD, Kim W, Kim HS. Analysis of newly identified low copy AluYj subfamily. Genes Genet Syst 2009; 80:415-22. [PMID: 16501310 DOI: 10.1266/ggs.80.415] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/23/2022] Open
Abstract
Human-specific AluY elements were investigated by comparative analysis between human chromosome 21 and chimpanzee chromosome 22. A human-specific AluY element was identified on human chromosome 21q22 (accession no. AL163282) and was found to be a new member of the AluYj subfamily. In a bioinformatic analysis, the AluYj subfamily was investigated across the whole human genome using the AluYj4 consensus sequence (accession no. AL163282). Thirteen members of the AluYj4 elements (4 diagnostic mutations) and eight members of the AluYj3 elements (3 diagnostic mutations) were identified, each with distinct diagnostic mutations relative to the AluY consensus sequence. Molecular clock calculations based on non-CpG substitutions indicated that AluYj4 elements (2.1 million years old) may have proliferated more recently than AluYj3 elements (14.1 million years old). To verify the recent insertion time, four AluYj4 elements (ch2-AC017101, ch10-AC044786, ch12-AC007656 and ch21-AL163282) from human chromosomes 2, 10, 12 and 21 were analyzed by PCR amplification using various human and primate DNA samples. Although no polymorphism was detected in the human population, we identified the new AluYj4 subfamily as human-specific elements.
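A molecular clock estimate of the kind mentioned in this abstract divides the average divergence from the subfamily consensus by a substitution rate. The sketch below shows that arithmetic for non-CpG sites only. It assumes gap-free sequences aligned to the consensus; the function names are hypothetical, and the substitution rate is an input that the abstract does not state, so any value passed in is the caller's assumption.

```python
def non_cpg_positions(consensus):
    """Indices of consensus bases that are not part of a CG dinucleotide."""
    cpg = set()
    for i in range(len(consensus) - 1):
        if consensus[i:i + 2] == "CG":
            cpg.update((i, i + 1))
    return [i for i in range(len(consensus)) if i not in cpg]


def mean_divergence(consensus, members):
    """Fraction of non-CpG sites that differ from the consensus, averaged
    over all member sequences (each the same length as the consensus)."""
    positions = non_cpg_positions(consensus)
    mismatches = sum(
        1 for seq in members for i in positions if seq[i] != consensus[i]
    )
    return mismatches / (len(positions) * len(members))


def age_in_myr(divergence, rate_per_site_per_myr):
    """Age estimate in million years: divergence divided by the assumed rate."""
    return divergence / rate_per_site_per_myr
```

CpG sites are excluded because they mutate much faster than other sites, which would otherwise inflate the apparent age.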
Affiliation(s)
- Eun-Sil Park
- Division of Biological Sciences, College of Natural Sciences, Pusan National University, Busan, Korea
|
21
|
Bochukova EG, Roscioli T, Hedges DJ, Taylor IB, Johnson D, David DJ, Deininger PL, Wilkie AO. Rare mutations of FGFR2 causing Apert syndrome: identification of the first partial gene deletion, and an Alu element insertion from a new subfamily. Hum Mutat 2009; 30:204-11. [DOI: 10.1002/humu.20825] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.9] [Indexed: 02/01/2023]
|
22
|
Kortschak RD, Tsend-Ayush E, Grützner F. Analysis of SINE and LINE repeat content of Y chromosomes in the platypus, Ornithorhynchus anatinus. Reprod Fertil Dev 2009; 21:964-75. [DOI: 10.1071/rd09084] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 04/05/2009] [Accepted: 06/21/2009] [Indexed: 01/11/2023] Open
Abstract
Monotremes feature an extraordinary sex-chromosome system that consists of five X and five Y chromosomes in males. These sex chromosomes share homology with bird sex chromosomes but no homology with the therian X. The genome of a female platypus was recently completed, providing unique insights into sequence and gene content of autosomes and X chromosomes, but no Y-specific sequence has so far been analysed. Here we report the isolation, sequencing and analysis of ~700 kb of sequence of the non-recombining regions of Y2, Y3 and Y5, which revealed differences in base composition and repeat content between autosomes and sex chromosomes, and within the sex chromosomes themselves. This provides the first insights into repeat content of Y chromosomes in platypus, which overall show similar patterns of repeat composition to Y chromosomes in other species. Interestingly, we also observed differences between the various Y chromosomes, and in combination with timing and activity patterns we provide an approach that can be used to examine the evolutionary history of the platypus sex-chromosome chain.
|
23
|
Analysis of transposon interruptions suggests selection for L1 elements on the X chromosome. PLoS Genet 2008; 4:e1000172. [PMID: 18769724 PMCID: PMC2517846 DOI: 10.1371/journal.pgen.1000172] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Received: 02/26/2008] [Accepted: 07/17/2008] [Indexed: 01/02/2023] Open
Abstract
It has been hypothesised that the massive accumulation of L1 transposable elements on the X chromosome is due to their function in X inactivation, and that the accumulation of Alu elements near genes is adaptive. We tested the possible selective advantage of these two transposable element (TE) families with a novel method, interruption analysis. In mammalian genomes, a large number of TEs interrupt other TEs due to the high overall abundance and age of repeats, and these interruptions can be used to test whether TEs are selectively neutral. Interruptions of TEs that are beneficial for the host are expected to be deleterious and underrepresented compared with neutral ones. We found that L1 elements in the regions of the X chromosome that contain the majority of the inactivated genes are significantly less frequently interrupted than on the autosomes, while L1s near genes that escape inactivation are interrupted with higher frequency, supporting the hypothesis that L1s on the X chromosome play a role in its inactivation. In addition, we show that TEs are less frequently interrupted in introns than in intergenic regions, probably due to selection against the expansion of introns, but the insertion pattern of Alus is comparable to other repeats. Recent experimental findings (for example the ENCODE project) show that many functional non-coding regions of genomes are not conserved across species, making the in-silico discovery of such regions challenging. Transposable elements (TEs), which represent 45 percent of the human genome and typically show no sequence conservation, are particularly intriguing from this point of view, because the highly nonrandom genomic distribution of many TE families in genomes has led to hypotheses that their presence is adaptive and that they have an epigenetic (regulatory) function. We use a novel approach, based on the analysis of interrupted TEs and not reliant on sequence conservation, to investigate whether repeats are under selection.
L1 elements, the most active transposable elements of the human genome, are highly overrepresented on the X-chromosome and were suggested to enhance its inactivation in mammals. We find that the interruption pattern of L1 repeats indicates a function for L1 elements in the inactivation of the mammalian X chromosome. Additionally, we show that a considerable fraction of TEs in introns are under selection for integrity, possibly due to selection on intron size or on TEs themselves.
|
24
|
Methylation perturbations in retroelements within the genome of a Mus interspecific hybrid correlate with double minute chromosome formation. Genomics 2008; 91:267-73. [DOI: 10.1016/j.ygeno.2007.12.001] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 08/23/2007] [Revised: 10/10/2007] [Accepted: 12/05/2007] [Indexed: 12/22/2022]
|
25
|
Jurka J, Kapitonov VV, Kohany O, Jurka MV. Repetitive sequences in complex genomes: structure and evolution. Annu Rev Genomics Hum Genet 2007; 8:241-59. [PMID: 17506661 DOI: 10.1146/annurev.genom.8.080706.092416] [Citation(s) in RCA: 238] [Impact Index Per Article: 13.2] [Indexed: 11/09/2022]
Abstract
Eukaryotic genomes contain vast amounts of repetitive DNA derived from transposable elements (TEs). Large-scale sequencing of these genomes has produced an unprecedented wealth of information about the origin, diversity, and genomic impact of what was once thought to be "junk DNA." This has also led to the identification of two new classes of DNA transposons, Helitrons and Polintons, as well as several new superfamilies and thousands of new families. TEs are evolutionary precursors of many genes, including RAG1, which plays a role in the vertebrate immune system. They are also the driving force in the evolution of epigenetic regulation and have a long-term impact on genomic stability and evolution. Remnants of TEs appear to be overrepresented in transcription regulatory modules and other regions conserved among distantly related species, which may have implications for our understanding of their impact on speciation.
Affiliation(s)
- Jerzy Jurka
- Genetic Information Research Institute, Mountain View, California 94043, USA.
|
26
|
Umylny B, Presting G, Efird JT, Klimovitsky BI, Ward WS. Most human Alu and murine B1 repeats are unique. J Cell Biochem 2007; 102:110-21. [PMID: 17407136 DOI: 10.1002/jcb.21278] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/11/2022]
Abstract
Alus and B1s are short interspersed repeat elements (SINEs) indirectly derived from the 7SL RNA gene. While most researchers recognize that there exists extensive variability between individual elements, the extent of this variability has never been systematically tested. We examined all Alu elements over 200 nucleotides and all B1 elements over 100 nucleotides in the human and mouse genomes, and analyzed the number of copies of each element at various stringencies from 22 nucleotides to full length. Over 98% of 923,277 Alus and 365,377 B1s examined were unique when queried at full length. When the criterion was reduced to half the length of the repeat, 97% of the Alus and 73% of the B1s were still found to be a single copy. All single and multi-copy sequences have been mapped and documented. Access to the data is possible using the AluPlus website http://www.ibr.hawaii.edu.
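The notion of uniqueness "at various stringencies" can be illustrated by counting how many elements have no exact duplicate when compared over their full length or over a shorter stretch. The sketch below is only an approximation of the idea: the abstract does not say how shorter stretches were selected, so comparing the leading `length` nucleotides is an assumption, as is the function name.

```python
from collections import Counter


def fraction_unique(elements, length=None):
    """Fraction of elements whose comparison key occurs exactly once.

    With `length=None` the full sequence is the key; otherwise only the
    first `length` nucleotides are compared, which is a looser criterion
    under which fewer elements remain unique.
    """
    if not elements:
        raise ValueError("no elements supplied")
    keys = [seq if length is None else seq[:length] for seq in elements]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(keys)
```

For example, `fraction_unique(seqs)` and `fraction_unique(seqs, 22)` would correspond to the strictest and loosest ends of the range of stringencies quoted in the abstract, under the prefix assumption above.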
Affiliation(s)
- Boris Umylny
- Institute of Biogenesis Research, John A. Burns School of Medicine, University of Hawaii, Honolulu, Hawaii 96822, USA
|
27
|
Abstract
Alus and B1s are short interspersed repeat elements (SINEs) derived from the 7SL RNA gene. Alus and B1s exist in the cytoplasm as non-coding RNA indicating that they are actively transcribed, but their function, if any, is unknown. Transcription of individual SINEs is a prerequisite for retroposition, but it is also possible that individual Alu and B1 elements have some cellular functions. Previous studies suggest that transcription of Alu elements depends on the presence of an RNA polymerase-III bipartite promoter and the poly-A tail. Sequencing of small RNAs has demonstrated that members of the Y and S subfamilies are expressed. We analyzed almost one million Alu sequences longer than 200 nucleotides for the presence of RNA polymerase-III bipartite promoter sequences. More than half contained a promoter indicating some potential for expression. We searched 7.7 million human EST sequences in dbEST for the presence of Alu non-coding RNAs and found evidence for the expression of 452. Analysis of mouse spermatogenic dbEST libraries revealed an apparent relationship between the level of differentiation and the level of B1-related sequences in the EST library.
Affiliation(s)
- Boris Umylny
- Asia Pacific Bioinformatics Research Institute, Honolulu, HI, USA
|
28
|
Analysis of the features and source gene composition of the AluYg6 subfamily of human retrotransposons. BMC Evol Biol 2007; 7:102. [PMID: 17603915 PMCID: PMC1925064 DOI: 10.1186/1471-2148-7-102] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 03/02/2007] [Accepted: 07/01/2007] [Indexed: 11/19/2022] Open
Abstract
Background Alu elements are a family of SINE retrotransposons in primates. They are classified into subfamilies according to specific diagnostic mutations from the general Alu consensus. It is now believed that there may be several retrotranspositionally-competent source genes within an Alu subfamily. To investigate the evolution of young Alu elements it is critical to have access to complete subfamilies, which, following the release of the final human genome assembly, can now be obtained using in silico methods. Results 380 elements belonging to the young AluYg6 subfamily were identified in the human genome, a number significantly exceeding prior expectations. An AluYg6 element was also identified in the chimpanzee genome, indicating that the subfamily is older than previously estimated, and appears to have undergone a period of dormancy before its expansion. The relative contributions of back mutation and gene conversion to variation at the six diagnostic positions are examined, and cases of complete forward gene conversion events are reported. Two small subfamilies derived from AluYg6 have been identified, named AluYg6a2 and AluYg5b3, which contain 40 and 27 members, respectively. These small subfamilies are used to illustrate the ambiguity regarding Alu subfamily definition, and to assess the contribution of secondary source genes to the AluYg6 subfamily. Conclusion The number of elements in the AluYg6 subfamily greatly exceeds prior expectations, indicating that the current knowledge of young Alu subfamilies is incomplete, and that prior analyses that have been carried out using these data may have generated inaccurate results. A definition of primary and secondary source genes has been provided, and it has been shown that several source genes have contributed to the proliferation of the AluYg6 subfamily. 
Access to the sequence data for the complete AluYg6 subfamily will be invaluable in future computational analyses investigating the evolution of young Alu subfamilies.
|
29
|
Gu Y, Kodama H, Watanabe S, Kikuchi N, Ishitsuka I, Ozawa H, Fujisawa C, Shiga K. The first reported case of Menkes disease caused by an Alu insertion mutation. Brain Dev 2007; 29:105-8. [PMID: 17178205 DOI: 10.1016/j.braindev.2006.05.012] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Received: 12/29/2005] [Revised: 04/25/2006] [Accepted: 05/27/2006] [Indexed: 11/17/2022]
Abstract
We present the first reported case of Menkes disease caused by an Alu element insertion mutation that interfered with splicing regulatory elements. A whole young AluYa5a2 element, which was 382 bp long, was identified within exon 9 of the ATP7A gene, and all of exon 9 was aberrantly skipped in the cDNA, resulting in severely truncated proteins. To confirm whether the aberrant skipping resulted from the Alu insertion, an exonic splicing enhancer finder was used. The Alu element created two new high-score exonic splicing enhancer sequences in the mutation located near the site of the insertion. Exon 9, which encodes the first and second transmembrane domains, is necessary for the normal function of the ATP7A protein.
Affiliation(s)
- YanHong Gu
- Department of Health Policy, National Research Institute for Child Health and Development, 2-10-1 Okura, Tokyo, Japan.
|
30
|
Schmidt AL, Anderson LM. Repetitive DNA elements as mediators of genomic change in response to environmental cues. Biol Rev Camb Philos Soc 2007. [DOI: 10.1111/j.1469-185x.2006.tb00217.x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/28/2022]
|
31
|
Jurka J, Kohany O, Pavlicek A, Kapitonov VV, Jurka MV. Clustering, duplication and chromosomal distribution of mouse SINE retrotransposons. Cytogenet Genome Res 2005; 110:117-23. [PMID: 16093663 DOI: 10.1159/000084943] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Received: 12/09/2003] [Accepted: 02/05/2004] [Indexed: 10/25/2022] Open
Abstract
We analyzed potential mechanisms determining chromosomal distributions of the mouse B1 and B2 non-LTR retrotransposons, also known as SINE elements. We report that young B1 and B2 SINEs are underrepresented on chromosome X relative to autosomes, which is consistent with their integration in male germ lines. As the age of the SINE elements progresses, their densities on chromosome X increase relative to autosomal densities, possibly due to differences in ectopic recombination rates between chromosome X and autosomes. Furthermore, unlike young human Alus that tend to be integrated outside Alu-dense regions, young B1 and B2 elements are found mostly in SINE-rich clusters. The B1- or B2-rich clusters are more likely to contain duplicated elements than B1- or B2-poor chromosomal regions. We also present evidence indicating potential association of B1 and B2 elements with intra-chromosomal segmental duplications. No such association was found with inter-chromosomal duplications. We propose that the accumulation of mouse SINE elements observed in GC-rich regions may be due to the excess of DNA duplications over deletions in gene-rich regions that tend to be GC rich.
Affiliation(s)
- J Jurka
- Genetic Information Research Institute, Mountain View, CA 94043, USA.
|
32
|
Wang H, Xing J, Grover D, Hedges DJ, Han K, Walker JA, Batzer MA. SVA elements: a hominid-specific retroposon family. J Mol Biol 2005; 354:994-1007. [PMID: 16288912 DOI: 10.1016/j.jmb.2005.09.085] [Citation(s) in RCA: 260] [Impact Index Per Article: 13.0] [Received: 08/22/2005] [Revised: 09/22/2005] [Accepted: 09/27/2005] [Indexed: 11/25/2022]
Abstract
SVA is a composite repetitive element named after its main components, SINE, VNTR and Alu. We have identified 2762 SVA elements from the human genome draft sequence. Genomic distribution analysis indicates that the SVA elements are enriched in G+C-rich regions but have no preferences for inter- or intragenic regions. A phylogenetic analysis of the elements resulted in the recovery of six subfamilies that were named SVA_A to SVA_F. The composition, age and genomic distribution of the subfamilies have been examined. Subfamily age estimates based upon nucleotide divergence indicate that the expansion of four SVA subfamilies (SVA_A, SVA_B, SVA_C and SVA_D) began before the divergence of human, chimpanzee and gorilla, while subfamilies SVA_E and SVA_F are restricted to the human lineage. A survey of human genomic diversity associated with SVA_E and SVA_F subfamily members showed insertion polymorphism frequencies of 37.5% and 27.6%, respectively. In addition, we examined the amplification dynamics of SVA elements throughout the primate order and traced their origin back to the beginnings of hominid primate evolution, approximately 18 to 25 million years ago. This makes SVA elements the youngest family of retroposons in the primate order.
Affiliation(s)
- Hui Wang
- Department of Biological Sciences, Biological Computation and Visualization Center, Center for BioModular Multi-Scale Systems, Louisiana State University, 202 Life Sciences Building, Baton Rouge, LA 70803, USA
|
33
|
Gentles AJ, Kohany O, Jurka J. Evolutionary diversity and potential recombinogenic role of integration targets of Non-LTR retrotransposons. Mol Biol Evol 2005; 22:1983-91. [PMID: 15944437 PMCID: PMC1400617 DOI: 10.1093/molbev/msi188] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Indexed: 11/13/2022] Open
Abstract
Short interspersed elements (SINEs) make up a significant fraction of total DNA in mammalian genomes, providing a rich substrate for chromosomal rearrangements by SINE-SINE recombinations. Proliferation of mammalian SINEs is mediated primarily by long interspersed element 1 (L1) non-long terminal repeat retrotransposons that preferentially integrate at DNA sequence targets with an average length of approximately 15 bp and containing conserved endonucleolytic nicking signals at both ends. We report that sequence variations in the first of the two nicking signals, represented by a 5'-TT-AAAA consensus sequence, affect the position of the second signal thus leading to target site duplications (TSDs) of different lengths. The length distribution of TSDs appears to be affected also by L1-encoded enzyme variants because targets with the same 5' nicking site can be of different average lengths in different mammalian species. Taking this into account, we reanalyzed the second nicking site and found that it is larger and includes more conserved sites than previously appreciated, with a consensus of 5'-ANTNTN-AA. We also studied potential involvement of the nicking sites in stimulating recombinations between SINEs. We determined that SINEs retaining TSDs with perfect 5'-TT-AAAA nicking sites appear to be lost relatively rapidly from the human and rat genomes and less rapidly from dog. We speculate that the introduction of DNA breaks induced by recurring endonucleolytic attacks at these sites, combined with the ubiquitousness of SINEs, may significantly promote recombination between repetitive elements, leading to the observed losses. At the same time, new L1 subfamilies may be selected for "incompatibility" with preexisting targets. This provides a possible driving force for the continual emergence of new L1 subfamilies which, in turn, may affect selection of L1-dependent SINE subfamilies.
Affiliation(s)
- Andrew J. Gentles
- Genetic Information Research Institute, 1925 Landings Drive, Mountain View, CA 94043, Tel: 650-961-4480, Fax: 650-961-4473
- Oleksiy Kohany
- Genetic Information Research Institute, 1925 Landings Drive, Mountain View, CA 94043, Tel: 650-961-4480, Fax: 650-961-4473
- Jerzy Jurka
- Genetic Information Research Institute, 1925 Landings Drive, Mountain View, CA 94043, Tel: 650-961-4480, Fax: 650-961-4473
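The two nicking-site consensus sequences quoted in the Gentles et al. abstract above (5'-TT-AAAA and 5'-ANTNTN-AA) can be located in a sequence with a simple pattern scan. The following is only a minimal sketch, not the authors' method: it assumes the hyphen in each consensus marks the nick position, the function name `nick_positions` is hypothetical, and the example sequence is made up for illustration.

```python
import re

# Zero-width lookaheads so that overlapping occurrences are also reported.
FIRST_NICK = re.compile(r"(?=TTAAAA)")                     # 5'-TT|AAAA
SECOND_NICK = re.compile(r"(?=A[ACGT]T[ACGT]T[ACGT]AA)")   # 5'-ANTNTN|AA

def nick_positions(seq, pattern, offset):
    """Return 0-based indices of the base immediately 3' of each assumed nick.

    `offset` is the number of consensus bases that precede the hyphen.
    """
    return [m.start() + offset for m in pattern.finditer(seq.upper())]

# Made-up sequence containing one copy of each consensus (not real genomic data).
seq = "GGCTTAAAAGCATGCACTATGAACC"

first_sites = nick_positions(seq, FIRST_NICK, offset=2)
second_sites = nick_positions(seq, SECOND_NICK, offset=6)

print("first nicking site(s):", first_sites)
print("second nicking site(s):", second_sites)
```

A real analysis would also have to scan the reverse complement and tolerate the sequence variation in the first signal that the abstract describes; this sketch matches the exact consensus only.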
|
34
|
Abstract
As is the case with mammals in general, primate genomes are inundated with repetitive sequence. Although much of this repetitive content consists of "molecular fossils" inherited from early mammalian ancestors, a significant portion of this material comprises active mobile element lineages. Despite indications that these elements played a major role in shaping the architecture of the genome, there remain many unanswered questions surrounding the nature of the host-element relationship. Here we review advances in our understanding of the host-mobile element dynamic and its overall impact on primate evolution.
Affiliation(s)
- Dale J Hedges
- Department of Biological Sciences, Biological Computation and Visualization Center, Louisiana State University, LA 70803, USA
|
35
|
Chen JM, Stenson PD, Cooper DN, Férec C. A systematic analysis of LINE-1 endonuclease-dependent retrotranspositional events causing human genetic disease. Hum Genet 2005; 117:411-27. [PMID: 15983781 DOI: 10.1007/s00439-005-1321-0] [Citation(s) in RCA: 155] [Impact Index Per Article: 7.8] [Received: 02/15/2005] [Accepted: 04/04/2005] [Indexed: 10/25/2022]
Abstract
Diverse long interspersed element-1 (LINE-1 or L1)-dependent mutational mechanisms have been extensively studied with respect to L1 and Alu elements engineered for retrotransposition in cultured cells and/or in genome-wide analyses. To what extent the in vitro studies can be held to accurately reflect in vivo events in the human genome, however, remains to be clarified. We have attempted to address this question by means of a systematic analysis of recent L1-mediated retrotranspositional events that have caused human genetic disease, with a view to providing a more complete picture of how L1-mediated retrotransposition impacts upon the architecture of the human genome. A total of 48 such mutations were identified, including those described as L1-mediated retrotransposons, as well as insertions reported to contain a poly(A) tail: 26 were L1 trans-driven Alu insertions, 15 were direct L1 insertions, four were L1 trans-driven SVA insertions, and three were associated with simple poly(A) insertions. The systematic study of these lesions, when combined with previous in vitro and genome-wide analyses, has strengthened several important conclusions regarding L1-mediated retrotransposition in humans: (a) approximately 25% of L1 insertions are associated with the 3' transduction of adjacent genomic sequences, (b) approximately 25% of the new L1 inserts are full-length, (c) poly(A) tail length correlates inversely with the age of the element, and (d) the length of target site duplication in vivo is rarely longer than 20 bp. Our analysis also suggests that some 10% of L1-mediated retrotranspositional events are associated with significant genomic deletions in humans. Finally, the identification of independent retrotranspositional events that have integrated at the same genomic locations provides new insight into the L1-mediated insertional process in humans.
Affiliation(s)
- Jian-Min Chen
- INSERM U613-Génétique Moléculaire et Génétique Epidémiologique, Etablissement Français du Sang-Bretagne, Université de Bretagne Occidentale, Centre Hospitalier Universitaire, Brest, 29220, France.
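The breakdown of the 48 disease-causing events given in the Chen et al. abstract above can be checked with a few lines of arithmetic. This is a small illustrative sketch: the counts are taken from the abstract, while the dictionary keys and variable names are my own labels.

```python
# Event counts as quoted in the abstract; the category labels are paraphrased.
events = {
    "L1 trans-driven Alu insertion": 26,
    "direct L1 insertion": 15,
    "L1 trans-driven SVA insertion": 4,
    "simple poly(A) insertion": 3,
}

total = sum(events.values())

# Percentage of all events contributed by each category.
shares = {kind: 100 * count / total for kind, count in events.items()}

for kind, count in events.items():
    print(f"{kind:32s} {count:3d}  {shares[kind]:5.1f}%")
print(f"{'total':32s} {total:3d}")
```

The four categories sum to the stated total of 48, and Alu insertions account for slightly more than half of the events.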
|
36
|
Han K, Xing J, Wang H, Hedges DJ, Garber RK, Cordaux R, Batzer MA. Under the genomic radar: the stealth model of Alu amplification. Genome Res 2005; 15:655-64. [PMID: 15867427 PMCID: PMC1088293 DOI: 10.1101/gr.3492605] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.8] [Received: 11/19/2004] [Accepted: 01/28/2005] [Indexed: 11/25/2022]
Abstract
Alu elements are the most successful SINEs (Short INterspersed Elements) in primate genomes and have reached more than 1,000,000 copies in the human genome. The amplification of most Alu elements is thought to occur through a limited number of hyperactive "master" genes that produce a high number of copies during long evolutionary periods of time. However, the existence of long-lived, low-activity Alu lineages in the human genome suggests a more complex propagation mechanism. Using both computational and wet-bench approaches, we reconstructed the evolutionary history of the AluYb lineage, one of the most active Alu lineages in the human genome. We show that the major AluYb lineage expansion in humans is a species-specific event, as nonhuman primates possess only a handful of AluYb elements. However, the oldest existing AluYb element resided in an orthologous position in all hominoid primate genomes examined, demonstrating that the AluYb lineage originated 18-25 million years ago. Thus, the history of the AluYb lineage is characterized by approximately 20 million years of retrotranspositional quiescence preceding a major expansion in the human genome within the past few million years. We suggest that the evolutionary success of the Alu family may be driven at least in part by "stealth-driver" elements that maintain low retrotranspositional activity over extended periods of time and occasionally produce short-lived hyperactive copies responsible for the formation and remarkable expansion of Alu elements within the genome.
Affiliation(s)
- Kyudong Han
- Department of Biological Sciences, Biological Computation and Visualization Center, Center for BioModular Multi-Scale Systems, Louisiana State University, Baton Rouge, LA 70803, USA
|
37
|
Khil PP, Oliver B, Camerini-Otero RD. X for intersection: retrotransposition both on and off the X chromosome is more frequent. Trends Genet 2005; 21:3-7. [PMID: 15680505 DOI: 10.1016/j.tig.2004.11.005] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Indexed: 11/27/2022]
Abstract
As the heteromorphic sex chromosomes evolved from a pair of autosomes, the sex chromosomes became increasingly different in gene content and structure from each other and from the autosomes. Although recently there has been progress in documenting and understanding these differences, the molecular mechanisms that have fashioned some of these changes remain unclear. A new study addresses the differential distribution of retroposed genes in human and mouse genomes. Surprisingly, chromosome X is a major source and a preferred target for retrotransposition.
Affiliation(s)
- Pavel P Khil
- Genetics and Biochemistry Branch and Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892, USA
|
38
|
Bennett EA, Coleman LE, Tsui C, Pittard WS, Devine SE. Natural genetic variation caused by transposable elements in humans. Genetics 2005; 168:933-51. [PMID: 15514065 PMCID: PMC1448813 DOI: 10.1534/genetics.104.031757] [Citation(s) in RCA: 127] [Impact Index Per Article: 6.4] [Indexed: 12/18/2022] Open
Abstract
Transposons and transposon-like repetitive elements collectively occupy 44% of the human genome sequence. In an effort to measure the levels of genetic variation that are caused by human transposons, we have developed a new method to broadly detect transposon insertion polymorphisms of all kinds in humans. We began by identifying 606,093 insertion and deletion (indel) polymorphisms in the genomes of diverse humans. We then screened these polymorphisms to detect indels that were caused by de novo transposon insertions. Our method was highly efficient and led to the identification of 605 nonredundant transposon insertion polymorphisms in 36 diverse humans. We estimate that this represents 25-35% of approximately 2075 common transposon polymorphisms in human populations. Because we identified all transposon insertion polymorphisms with a single method, we could evaluate the relative levels of variation that were caused by each transposon class. The average human in our study was estimated to harbor 1283 Alu insertion polymorphisms, 180 L1 polymorphisms, 56 SVA polymorphisms, and 17 polymorphisms related to other forms of mobilized DNA. Overall, our study provides significant steps toward (i) measuring the genetic variation that is caused by transposon insertions in humans and (ii) identifying the transposon copies that produce this variation.
Affiliation(s)
- E Andrew Bennett
- Department of Biochemistry, Emory University School of Medicine, Atlanta, Georgia 30322, USA
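The Bennett et al. abstract above states that the 605 identified polymorphisms represent 25-35% of roughly 2075 common transposon polymorphisms, and gives per-genome counts by transposon class. A short sketch, using only the figures quoted in the abstract (variable names are my own), shows that these numbers are mutually consistent:

```python
# Figures quoted in the abstract.
identified = 605          # nonredundant transposon insertion polymorphisms found
estimated_common = 2075   # approximate number of common polymorphisms

fraction_found = identified / estimated_common

# Estimated polymorphisms harboured by the average human in the study, by class.
per_genome = {"Alu": 1283, "L1": 180, "SVA": 56, "other": 17}
per_genome_total = sum(per_genome.values())
alu_share = per_genome["Alu"] / per_genome_total

print(f"fraction of common polymorphisms recovered: {fraction_found:.1%}")
print(f"polymorphisms per genome: {per_genome_total} (Alu share {alu_share:.1%})")
```

The recovered fraction falls inside the stated 25-35% range, and Alu insertions make up the large majority of the per-genome total.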
|
39
|
Hackenberg M, Bernaola-Galván P, Carpena P, Oliver JL. The Biased Distribution of Alus in Human Isochores Might Be Driven by Recombination. J Mol Evol 2005; 60:365-77. [PMID: 15871047 DOI: 10.1007/s00239-004-0197-2] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.8] [Received: 06/28/2004] [Accepted: 10/01/2004] [Indexed: 11/30/2022]
Abstract
Alu retrotransposons do not show a homogeneous distribution over the human genome but have a higher density in GC-rich (H) than in AT-rich (L) isochores. However, since they preferentially insert into the L isochores, the question arises: What is the evolutionary mechanism that shifts the Alu density maximum from L to H isochores? To disclose the role played by each of the potential mechanisms involved in such biased distribution, we carried out a genome-wide analysis of the density of the Alus as a function of their evolutionary age, isochore membership, and intron vs. intergene location. Since Alus depend on the retrotransposase encoded by the LINE1 elements, we also studied the distribution of LINE1 to provide a complete evolutionary scenario. We consecutively check, and discard, the contributions of the Alu/LINE1 competition for retrotransposase, compositional matching pressure, and Alu overrepresentation in introns. In analyzing the role played by unequal recombination, we scan the genome for Alu trimers, a direct product of Alu-Alu recombination. Through computer simulations, we show that such trimers are much more frequent than expected, the observed/expected ratio being higher in L than in H isochores. This result, together with the known higher selective disadvantage of recombination products in H isochores, points to Alu-Alu recombination as the main agent provoking the density shift of Alus toward the GC-rich parts of the genome. Two independent pieces of evidence (the lower evolutionary divergence shown by recently inserted Alu subfamilies and the higher frequency of old stand-alone Alus in L isochores) support such a conclusion. Other evolutionary factors, such as population bottlenecks during primate speciation, may have accelerated the fast accumulation of Alus in GC-rich isochores.
Affiliation(s)
- Michael Hackenberg
- Departamento de Genética, Facultad de Ciencias, Universidad de Granada, Spain
|
40
|
Abstract
Background: Alu elements are short (~300 bp) interspersed elements that amplify in primate genomes through a process termed retroposition. The expansion of these elements has had a significant impact on the structure and function of primate genomes. Approximately 10% of the mass of the human genome is comprised of Alu elements, making them the most abundant short interspersed element (SINE) in our genome. The majority of Alu amplification occurred early in primate evolution, and the current rate of Alu retroposition is at least 100-fold slower than the peak of amplification that occurred 30–50 million years ago. Alu elements are therefore a rich source of inter- and intra-species primate genomic variation.
Results: A total of 153 Alu elements from the Ye subfamily were extracted from the draft sequence of the human genome. Analysis of these elements resulted in the discovery of two new Alu subfamilies, Ye4 and Ye6, complementing the previously described Ye5 subfamily. DNA sequence analysis of each of the Alu Ye subfamilies yielded average age estimates of ~14, ~13 and ~9.5 million years old for the Alu Ye4, Ye5 and Ye6 subfamilies, respectively. In addition, 120 Alu Ye4, Ye5 and Ye6 loci were screened using polymerase chain reaction (PCR) assays to determine their phylogenetic origin and levels of human genomic diversity.
Conclusion: The Alu Ye lineage appears to have started amplifying relatively early in primate evolution and continued propagating at a low level as many of its members are found in a variety of hominoid (humans, greater and lesser ape) genomes. Detailed sequence analysis of several Alu pre-integration sites indicated that multiple types of events had occurred, including gene conversions, near-parallel independent insertions of different Alu elements and Alu-mediated genomic deletions. A potential hotspot for Alu insertion in the Fer1L3 gene on chromosome 10 was also identified.
|
41
|
Xing J, Hedges DJ, Han K, Wang H, Cordaux R, Batzer MA. Alu element mutation spectra: molecular clocks and the effect of DNA methylation. J Mol Biol 2005; 344:675-82. [PMID: 15533437 DOI: 10.1016/j.jmb.2004.09.058] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.4] [Received: 09/03/2004] [Revised: 09/21/2004] [Accepted: 09/22/2004] [Indexed: 11/29/2022]
Abstract
In primate genomes more than 40% of CpG islands are found within repetitive elements. With more than one million copies in the human genome, the Alu family of retrotransposons represents the most successful short interspersed element (SINE) in primates and CpG dinucleotides make up about 20% of Alu sequences. It is generally thought that CpG dinucleotides mutate approximately ten times faster than other dinucleotides due to cytosine methylation and the subsequent deamination and conversion of C→T. However, the disparity of Alu subfamily age estimations based upon CpG or non-CpG substitution density indicates a more complex relationship between CpG and non-CpG substitutions within the Alu elements. Here we report an analysis of the mutation patterns for 5296 Alu elements comprising 20 subfamilies. Our results indicate a relatively constant CpG versus non-CpG substitution ratio of approximately 6 for the young (AluY) and intermediate (AluS) Alu subfamilies. However, a more complex non-linear relationship between CpG and non-CpG substitutions was observed when old (AluJ) subfamilies were included in the analysis. These patterns may be the result of the slowdown of the neutral mutation rate during primate evolution and/or an increase in the CpG mutation rate as the consequence of increased DNA methylation in response to a burst of retrotransposition activity approximately 35 million years ago.
Affiliation(s)
- Jinchuan Xing
- Department of Biological Sciences, Biological Computation and Visualization Center, Center for Bio-Modular Microsystems, Louisiana State University, 202 Life Sciences Building, Baton Rouge, LA 70803, USA
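The Xing et al. abstract above compares CpG and non-CpG substitution densities and reports a ratio of roughly 6 for young and intermediate subfamilies. One plausible way to compute such a ratio is sketched below. This is an assumption about the general approach, not the authors' actual procedure: the function names are hypothetical, and the input counts are invented purely so that the example lands near the reported value of 6.

```python
def substitution_density(substitutions: int, sites: int, copies: int) -> float:
    """Substitutions per site, averaged over all aligned copies of a subfamily."""
    return substitutions / (sites * copies)

def cpg_to_non_cpg_ratio(cpg_subs: int, cpg_sites: int,
                         non_cpg_subs: int, non_cpg_sites: int,
                         copies: int) -> float:
    """Ratio of CpG to non-CpG substitution density within one subfamily."""
    cpg = substitution_density(cpg_subs, cpg_sites, copies)
    non_cpg = substitution_density(non_cpg_subs, non_cpg_sites, copies)
    return cpg / non_cpg

# Invented illustration values (NOT data from the paper): 100 aligned copies,
# 48 CpG positions and 232 non-CpG positions in the subfamily consensus.
ratio = cpg_to_non_cpg_ratio(cpg_subs=288, cpg_sites=48,
                             non_cpg_subs=232, non_cpg_sites=232,
                             copies=100)

print(f"CpG / non-CpG substitution density ratio: {ratio:.1f}")
```

Normalizing by the number of sites in each class matters because CpG positions are a minority of the element; comparing raw substitution counts would understate the CpG rate.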
|
42
|
Abstract
There are over a million Alu repetitive elements dispersed throughout the human genome, and a high level of Alu-sequence similarity ensures a strong propensity for unequal crossover events, some of which have led to deleterious oncogenic rearrangements. Furthermore, Alu insertions introduce consensus 3' splice sites, which potentially facilitate alternative splicing. Not surprisingly, Alu-mediated defective splicing has also been associated with cancer. To investigate a possible correlation between the expansion of Alu repeats associated with primate divergence and predisposition to cancer, 4 Alu-mediated rearrangements, known to be the basis of cancer, were selected for phylogenetic analysis of the necessary genotype. In these 4 cases, it was determined that the different phylogenetic age of the oncogenic recombination-prone genotype reflected the evolutionary history of Alu repeats spreading to new genomic sites. Our data imply that the evolutionary expansion of Alu repeats to new genomic locations establishes new predispositions to cancer in various primate species.
Key words: Alu repeats, evolution, cancer, primates, splicing, DNA recombination.
Affiliation(s)
- Rosaleen Gibbons
- Department of Biochemistry, University of California, Riverside, CA 92521, USA
|
43
|
Abstract
Early studies of human Alu retrotransposons focused on their origin, evolution and biological properties, but current focus is shifting toward the effect of Alu elements on evolution of the human genome. Recent analyses indicate that numerous factors have affected the chromosomal distribution of Alu elements over time, including male-driven insertions, deletions and rapid CpG mutations after their retrotransposition. Unequal crossing over between Alu elements can lead to local mutations or to large segmental duplications responsible for genetic diseases and long-term evolutionary changes. Alu elements can also affect human (primate) evolution by introducing alternative splice sites in existing genes. Studying the Alu family in a human genomic context is likely to have general significance for our understanding of the evolutionary impact of other repetitive elements in diverse eukaryotic genomes.
Affiliation(s)
- Jerzy Jurka
- Genetic Information Research Institute, 1925 Landings Drive, Mountain View, CA 94043, USA.
|
44
|
Gibbons R, Dugaiczyk LJ, Girke T, Duistermars B, Zielinski R, Dugaiczyk A. Distinguishing humans from great apes with AluYb8 repeats. J Mol Biol 2004; 339:721-9. [PMID: 15165846 DOI: 10.1016/j.jmb.2004.04.033] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.8] [Received: 02/23/2004] [Revised: 04/13/2004] [Accepted: 04/16/2004] [Indexed: 11/26/2022]
Abstract
Humans and chimpanzees share some 99% of DNA and amino acid identity, yet they exhibit important biomedical, morphological, and cognitive differences, difficult to accommodate within the remaining 1% of sequence diversity. Other types of genetic variation must be responsible for the taxonomic differences. Here we trace the evolution of AluYb8 repeats from a single origin at the roots of higher primates to a large increase in their number in humans. We identify nine AluYb8 DNA repeats in the chimpanzee genome compared to over 2200 repeats in the human, which represents a 250-fold increase in the rate of change in the human lineage and far outweighs the 99% sequence similarity between the two species. It is estimated that the average age of the human Yb8Alus is about 3.3 million years (My); almost 10% of them are identical in sequence, and hence are of recent origin. Genomic variations of this magnitude, distinguishing humans from great apes have not been realized. This explosive Alu expansion must have had a profound effect on the organization of our genome and the architecture of our chromosomes, inferentially altering profiles of gene expression and chromosome choreography in cell division. Additionally, we conclude that this major evolutionary process of Alu proliferation is driven by internal forces, written in the chemistry of DNA, rather than by external selection.
Affiliation(s)
- Rosaleen Gibbons
- Department of Biochemistry, University of California, Riverside, CA 92521, USA
|
45
|
Jurka J, Kohany O, Pavlicek A, Kapitonov VV, Jurka MV. Duplication, coclustering, and selection of human Alu retrotransposons. Proc Natl Acad Sci U S A 2004; 101:1268-72. [PMID: 14736919 PMCID: PMC337042 DOI: 10.1073/pnas.0308084100] [Citation(s) in RCA: 98] [Impact Index Per Article: 4.7] [Received: 06/10/2003] [Indexed: 11/18/2022] Open
Abstract
Alu and L1 are families of non-LTR retrotransposons representing approximately 30% of the human genome. Genomic distributions of young Alu and L1 elements are quite similar, but over time, Alu densities in GC-rich DNA increase in comparison with L1 densities. Here we analyze two processes that may contribute to this phenomenon. First, DNA duplications in the human genome occur more frequently in Alu- and GC-rich than in AT-rich chromosomal regions. Second, most Alu elements tend to be coclustered with each other, but recently retroposed elements are likely to be inserted outside the existing clusters. These "stand-alone" elements appear to be rapidly eliminated from the genome. We also report that over time, the densities of recently retroposed Alu families on chromosome Y decline rapidly, whereas Alu densities on chromosome X increase relative to autosomal densities. We propose that these changes in the chromosomal proportions of Alu densities and the elimination of stand-alone Alus represent the same process of paternal Alu selection. We also propose that long-term Alu accumulation in GC-rich DNA is associated with DNA duplication initiated by elevated recombinogenic activities in Alu clusters.
Affiliation(s)
- Jerzy Jurka
- Genetic Information Research Institute, 2081 Landings Drive, Mountain View, CA 94043-0815, USA.
|
46
|
Dagan T, Sorek R, Sharon E, Ast G, Graur D. AluGene: a database of Alu elements incorporated within protein-coding genes. Nucleic Acids Res 2004; 32:D489-92. [PMID: 14681464 PMCID: PMC308866 DOI: 10.1093/nar/gkh132] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.6] [Indexed: 11/12/2022] Open
Abstract
Alu elements are short interspersed elements (SINEs) approximately 300 nucleotides in length. More than 1 million Alus are found in the human genome. Despite their being genetically functionless, recent findings suggest that Alu elements may have a broad evolutionary impact by affecting gene structures, protein sequences, splicing motifs and expression patterns. Because of these effects, compiling a genomic database of Alu sequences that reside within protein-coding genes seemed a useful enterprise. Presently, such data are limited since the structural and positional information on genes and Alu sequences are scattered throughout incompatible and unconnected databases. AluGene (http://Alugene.tau.ac.il/) provides easy access to a complete Alu map of the human genome, as well as Alu-associated information. The Alu elements are annotated with respect to coding region and exon/intron location. This design facilitates queries on Alu sequences, locations, as well as motifs and compositional properties via a one-stop search page.
Affiliation(s)
- Tal Dagan
- Department of Zoology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv 69978, Israel.
|
47
|
Cleary JD, Pearson CE. The contribution of cis-elements to disease-associated repeat instability: clinical and experimental evidence. Cytogenet Genome Res 2003; 100:25-55. [PMID: 14526163 DOI: 10.1159/000072837] [Citation(s) in RCA: 116] [Impact Index Per Article: 5.3] [Received: 12/24/2002] [Accepted: 02/11/2003] [Indexed: 11/19/2022] Open
Abstract
Alterations in the length (instability) of gene-specific microsatellites and minisatellites are associated with at least 35 human diseases. This review will discuss the various cis-elements that contribute to repeat instability, primarily through examination of the most abundant disease-associated repetitive element, trinucleotide repeats. For the purpose of this review, we define cis-elements to include the sequence of the repeat units, the length and purity of the repeat tracts, the sequences flanking the repeat, as well as the surrounding epigenetic environment, including DNA methylation and chromatin structure. Gender-, tissue-, developmental- and locus-specific cis-elements in conjunction with trans-factors may facilitate instability through the processes of DNA replication, repair and/or recombination. Here we review the available human data that support the involvement of cis-elements in repeat instability, with limited reference to model systems. In diverse tissues at different developmental times and at specific loci, repetitive elements display variable levels of instability, suggesting that vastly different mechanisms may be responsible for repeat instability amongst the disease loci and between various tissues.
Affiliation(s)
- J D Cleary
- Program of Genetics and Genomic Biology, The Hospital for Sick Children, and Department of Molecular and Medical Genetics, University of Toronto, Toronto, Ontario, Canada
|
48
|
Callinan PA, Hedges DJ, Salem AH, Xing J, Walker JA, Garber RK, Watkins WS, Bamshad MJ, Jorde LB, Batzer MA. Comprehensive analysis of Alu-associated diversity on the human sex chromosomes. Gene 2003; 317:103-10. [PMID: 14604797 DOI: 10.1016/s0378-1119(03)00662-0] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.2] [Indexed: 11/22/2022]
Abstract
A comprehensive analysis of the human sex chromosomes was undertaken to assess Alu-associated human genomic diversity and to identify novel Alu insertion polymorphisms for the study of human evolution. Three hundred forty-five recently integrated Alu elements from eight different Alu subfamilies were identified on the X and Y chromosomes, 225 of which were selected and analyzed by polymerase chain reaction (PCR). From a total of 225 elements analyzed, 16 were found to be polymorphic on the X chromosome and one on the Y chromosome. In line with previous research using other classes of genetic markers, our results indicate reduced Alu-associated insertion polymorphism on the human sex chromosomes, presumably reflective of the reduced recombination rates and lower effective population sizes on the sex chromosomes. The Alu insertion polymorphisms identified in this study should prove useful for the study of human population genetics.
Affiliation(s)
- Pauline A Callinan
- Department of Biological Sciences, Biological Computation and Visualization Center, Louisiana State University, 202 Life Sciences Building, Baton Rouge, LA 70803, USA
|
49
|
Abstract
The eukaryotic genome has undergone a series of epidemics of amplification of mobile elements that have resulted in most eukaryotic genomes containing much more of this 'junk' DNA than actual coding DNA. The majority of these elements utilize an RNA intermediate and are termed retroelements. Most of these retroelements appear to amplify in evolutionary waves that insert in the genome and then gradually diverge. In humans, almost half of the genome is recognizably derived from retroelements, with the two elements that are currently actively amplifying, L1 and Alu, making up about 25% of the genome and contributing extensively to disease. The mechanisms of this amplification process are beginning to be understood, although there are still more questions than answers. Insertion of new retroelements may directly damage the genome, and the presence of multiple copies of these elements throughout the genome has longer-term influences on recombination events in the genome and more subtle influences on gene expression.
Affiliation(s)
- Prescott L Deininger
- Tulane Cancer Center, Department of Environmental Health Sciences, Tulane University Health Sciences Center, New Orleans, Louisiana 70112, USA.
|