1
Vieira Mourato B, Haubold B. Detection and annotation of unique regions in mammalian genomes. G3 (Bethesda, Md.) 2025; 15:jkae257. [PMID: 39503253 PMCID: PMC11708210 DOI: 10.1093/g3journal/jkae257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2024] [Accepted: 10/28/2024] [Indexed: 01/11/2025]
Abstract
Long unique genomic regions have been reported to be highly enriched for developmental genes in mice and humans. In this paper, we identify unique genomic regions using an efficient method based on fast string matching. We quantify the resource consumption and accuracy of this method before applying it to the genomes of 18 mammals. We annotate their unique regions (URs) of at least 10 kb and find that they are strongly enriched for developmental genes across the board. We then investigated the subset of URs that lack annotations, which we call "anonymous." The longest anonymous UR in the Tasmanian devil spanned 83 kb and contained the gene encoding inositol polyphosphate-5-phosphatase A, which is an essential part of intracellular signaling. This discovery of an essential gene in a UR implies that URs might be given priority when annotating mammalian genomes. Our documented pipeline for annotating URs in any mammalian genome is available from the repository github.com/evolbioinf/auger; the additional data for this study are available from the dataverse at doi.org/10.17617/3.4IKQAG.
Affiliation(s)
- Beatriz Vieira Mourato
- Research Group Bioinformatics, Max-Planck-Institute for Evolutionary Biology, August-Thienemann-Str. 2, Plön, Schleswig-Holstein 24306, Germany
- Bernhard Haubold
- Research Group Bioinformatics, Max-Planck-Institute for Evolutionary Biology, August-Thienemann-Str. 2, Plön, Schleswig-Holstein 24306, Germany
2
Gozashti L, Hartl DL, Corbett-Detig R. Universal signatures of transposable element compartmentalization across eukaryotic genomes. bioRxiv: The Preprint Server for Biology 2024:2023.10.17.562820. [PMID: 38585780 PMCID: PMC10996525 DOI: 10.1101/2023.10.17.562820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
The evolutionary mechanisms that drive the emergence of genome architecture remain poorly understood but can now be assessed with unprecedented power due to the massive accumulation of genome assemblies spanning phylogenetic diversity [1,2]. Transposable elements (TEs) are a rich source of large-effect mutations since they directly and indirectly drive genomic structural variation and changes in gene expression [3]. Here, we demonstrate universal patterns of TE compartmentalization across eukaryotic genomes spanning ~1.7 billion years of evolution, in which TEs colocalize with gene families under strong predicted selective pressure for dynamic evolution and involved in specific functions. For non-pathogenic species these genes represent families involved in defense, sensory perception and environmental interaction, whereas for pathogenic species, TE-compartmentalized genes are highly enriched for pathogenic functions. Many TE-compartmentalized gene families display signatures of positive selection at the molecular level. Furthermore, TE-compartmentalized genes exhibit an excess of high-frequency alleles for polymorphic TE insertions in fruit fly populations. We postulate that these patterns reflect selection for adaptive TE insertions as well as TE-associated structural variants. This process may drive the emergence of a shared TE-compartmentalized genome architecture across diverse eukaryotic lineages.
Affiliation(s)
- Landen Gozashti
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
- Daniel L. Hartl
- Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Russell Corbett-Detig
- Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
- UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
3
Mahlfeld K, Parenti LR. Croizat's form-making, RNA networks, and biogeography. History and Philosophy of the Life Sciences 2023; 45:42. [PMID: 38010532 PMCID: PMC10682228 DOI: 10.1007/s40656-023-00597-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/24/2023] [Accepted: 10/18/2023] [Indexed: 11/29/2023]
Abstract
Advances in technology have increased our knowledge of the processes that effect genomic changes and of the roles of RNA networks in biocommunication, functionality, and evolution of genomes. Natural genetic engineering and genomic inscription occur at all levels of life: cell cycles, development, and evolution. This has implications for phylogenetic studies and for biogeography, particularly given the general acceptance of using molecular clocks as arbiters between vicariance and dispersal explanations in biogeography. Léon Croizat's development of panbiogeography and his explanation for the distribution patterns of organisms are based on concepts of dispersal, differential form-making, and ancestor that differ from concepts of descent used broadly in phylogenetic and biogeographic studies. Croizat's differential form-making is consistent with the extensive roles ascribed to RNAs in development and evolution and recent discoveries of genome studies. Evolutionary-developmental biology (evo-devo), including epigenetics, and the role of RNAs should be incorporated into biogeography.
Affiliation(s)
- Karin Mahlfeld
- Openlabnz, 5 Imlay Crescent, Wellington, Ngaio, 6035, New Zealand.
- Lynne R Parenti
- Division of Fishes, National Museum of Natural History, Smithsonian Institution, Washington, DC, 20560, USA
4
Lallemand T, Leduc M, Desmazières A, Aubourg S, Rizzon C, Landès C, Celton JM. Insights into the Evolution of Ohnologous Sequences and Their Epigenetic Marks Post-WGD in Malus domestica. Genome Biol Evol 2023; 15:evad178. [PMID: 37847638 PMCID: PMC10601995 DOI: 10.1093/gbe/evad178] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/02/2023] [Revised: 08/25/2023] [Accepted: 10/02/2023] [Indexed: 10/19/2023]
Abstract
A Whole Genome Duplication (WGD) event occurred several Ma ago in a Rosaceae ancestor, giving rise to the Maloideae subfamily, which today includes many pome fruits such as pear (Pyrus communis) and apple (Malus domestica). This complete and well-conserved genome duplication makes the apple an organism of choice to study the early evolutionary events occurring to ohnologous chromosome fragments. In this study, we investigated gene sequence evolution and expression, transposable elements (TE) density, and DNA methylation level. Overall, we identified 16,779 ohnologous gene pairs in the apple genome, confirming the relatively recent WGD. We identified several imbalances in QTL localization among duplicated chromosomal fragments and characterized various biases in genome fractionation, gene transcription, TE densities, and DNA methylation. Our results suggest a particular chromosome dominance in this autopolyploid species, a phenomenon that displays similarities with subgenome dominance, which has so far been described only in allopolyploids.
Affiliation(s)
- Tanguy Lallemand
- Université d’Angers, Institut Agro, INRAE, IRHS, SFR QUASAV, Angers, France
- Martin Leduc
- Université d’Angers, Institut Agro, INRAE, IRHS, SFR QUASAV, Angers, France
- Adèle Desmazières
- Université d’Angers, Institut Agro, INRAE, IRHS, SFR QUASAV, Angers, France
- Sébastien Aubourg
- Université d’Angers, Institut Agro, INRAE, IRHS, SFR QUASAV, Angers, France
- Carène Rizzon
- Laboratoire de Mathématiques et Modélisation d’Evry (LaMME), UMR CNRS 8071, ENSIIE, USC INRA, Université d’Evry Val d’Essonne, Evry, France
- Claudine Landès
- Université d’Angers, Institut Agro, INRAE, IRHS, SFR QUASAV, Angers, France
- Jean-Marc Celton
- Université d’Angers, Institut Agro, INRAE, IRHS, SFR QUASAV, Angers, France
5
Mattick JS. RNA out of the mist. Trends Genet 2023; 39:187-207. [PMID: 36528415 DOI: 10.1016/j.tig.2022.11.001] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/26/2022] [Revised: 11/08/2022] [Accepted: 11/27/2022] [Indexed: 12/23/2022]
Abstract
RNA has long been regarded primarily as the intermediate between genes and proteins. It was a surprise then to discover that eukaryotic genes are mosaics of mRNA sequences interrupted by large tracts of transcribed but untranslated sequences, and that multicellular organisms also express many long 'intergenic' and antisense noncoding RNAs (lncRNAs). The identification of small RNAs that regulate mRNA translation and half-life did not disturb the prevailing view that animals and plant genomes are full of evolutionary debris and that their development is mainly supervised by transcription factors. Gathering evidence to the contrary involved addressing the low conservation, expression, and genetic visibility of lncRNAs, demonstrating their cell-specific roles in cell and developmental biology, and their association with chromatin-modifying complexes and phase-separated domains. The emerging picture is that most lncRNAs are the products of genetic loci termed 'enhancers', which marshal generic effector proteins to their sites of action to control cell fate decisions during development.
Affiliation(s)
- John S Mattick
- School of Biotechnology and Biomolecular Sciences, UNSW, Sydney, NSW 2052, Australia; UNSW RNA Institute, UNSW, Sydney, NSW 2052, Australia.
6
Mendizábal-Castillero M, Merlo MA, Cross I, Rodríguez ME, Rebordinos L. Genomic Characterization of hox Genes in Senegalese Sole (Solea senegalensis, Kaup 1858): Clues to Evolutionary Path in Pleuronectiformes. Animals (Basel) 2022; 12:ani12243586. [PMID: 36552509 PMCID: PMC9774920 DOI: 10.3390/ani12243586] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2022] [Revised: 12/12/2022] [Accepted: 12/15/2022] [Indexed: 12/23/2022]
Abstract
The Senegalese sole (Solea senegalensis, Kaup 1858), a marine flatfish, belongs to the Pleuronectiformes order. It is a commercially important species for fisheries and aquaculture. However, in aquaculture, several production bottlenecks have still to be resolved, including skeletal deformities and high mortality during the larval and juvenile phase. The study aims to characterize the hox gene clusters in S. senegalensis to understand better the developmental and metamorphosis process in this species. Using a BAC library, the clones that contain hox genes were isolated, sequenced by NGS and used as BAC-FISH probes. Subsequently the hox clusters were studied by sequence analysis, comparative genomics, and cytogenetic and phylogenetic analysis. Cytogenetic analysis demonstrated the localization of four BAC clones on chromosome pairs 4, 12, 13, and 16 of the Senegalese sole cytogenomic map. Comparative and phylogenetic analysis showed a highly conserved organization in each cluster and different phylogenetic clustering in each hox cluster. Analysis of structural and repetitive sequences revealed accumulations of polymorphisms mediated by repetitive elements in the hoxba cluster, mainly retroelements. Therefore, a possible loss of the hoxb7a gene can be established in the Pleuronectiformes lineage. This work allows the organization and regulation of hox clusters to be understood, and is a good base for further studies of expression patterns.
7
Wucherpfennig JI, Howes TR, Au JN, Au EH, Roberts Kingman GA, Brady SD, Herbert AL, Reimchen TE, Bell MA, Lowe CB, Dalziel AC, Kingsley DM. Evolution of stickleback spines through independent cis-regulatory changes at HOXDB. Nat Ecol Evol 2022; 6:1537-1552. [PMID: 36050398 PMCID: PMC9525239 DOI: 10.1038/s41559-022-01855-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Received: 02/02/2022] [Accepted: 07/19/2022] [Indexed: 11/10/2022]
Abstract
Understanding the mechanisms leading to new traits or additional features in organisms is a fundamental goal of evolutionary biology. We show that HOXDB regulatory changes have been used repeatedly in different fish genera to alter the length and number of the prominent dorsal spines used to classify stickleback species. In Gasterosteus aculeatus (typically 'three-spine sticklebacks'), a variant HOXDB allele is genetically linked to shortening an existing spine and adding an additional spine. In Apeltes quadracus (typically 'four-spine sticklebacks'), a variant HOXDB allele is associated with lengthening a spine and adding an additional spine in natural populations. The variant alleles alter the same non-coding enhancer region in the HOXDB locus but do so by diverse mechanisms, including single-nucleotide polymorphisms, deletions and transposable element insertions. The independent regulatory changes are linked to anterior expansion or contraction of HOXDB expression. We propose that associated changes in spine lengths and numbers are partial identity transformations in a repeating skeletal series that forms major defensive structures in fish. Our findings support the long-standing hypothesis that natural Hox gene variation underlies key patterning changes in wild populations and illustrate how different mutational mechanisms affecting the same region may produce opposite gene expression changes with similar phenotypic outcomes.
Affiliation(s)
- Julia I Wucherpfennig
- Department of Developmental Biology, Stanford University School of Medicine, Stanford, CA, USA
- Timothy R Howes
- Department of Chemical and Systems Biology, Stanford University School of Medicine, Stanford, CA, USA
- Jessica N Au
- Department of Developmental Biology, Stanford University School of Medicine, Stanford, CA, USA
- Eric H Au
- Department of Molecular Genetics and Microbiology, Duke University School of Medicine, Durham, NC, USA
- Shannon D Brady
- Department of Developmental Biology, Stanford University School of Medicine, Stanford, CA, USA
- Amy L Herbert
- Department of Developmental Biology, Stanford University School of Medicine, Stanford, CA, USA
- Thomas E Reimchen
- Department of Biology, University of Victoria, Victoria, British Columbia, Canada
- Michael A Bell
- University of California Museum of Paleontology, University of California, Berkeley, CA, USA
- Craig B Lowe
- Department of Molecular Genetics and Microbiology, Duke University School of Medicine, Durham, NC, USA
- Anne C Dalziel
- Department of Biology, Saint Mary's University, Halifax, Nova Scotia, Canada
- David M Kingsley
- Department of Developmental Biology, Stanford University School of Medicine, Stanford, CA, USA.
- Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA, USA.
8
Chesnokova E, Beletskiy A, Kolosov P. The Role of Transposable Elements of the Human Genome in Neuronal Function and Pathology. Int J Mol Sci 2022; 23:5847. [PMID: 35628657 PMCID: PMC9148063 DOI: 10.3390/ijms23105847] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 04/29/2022] [Revised: 05/17/2022] [Accepted: 05/19/2022] [Indexed: 12/13/2022]
Abstract
Transposable elements (TEs) have been extensively studied for decades. In recent years, the introduction of whole-genome and whole-transcriptome approaches, as well as single-cell resolution techniques, provided a breakthrough that uncovered TE involvement in host gene expression regulation underlying multiple normal and pathological processes. Of particular interest is increased TE activity in neuronal tissue, and specifically in the hippocampus, that was repeatedly demonstrated in multiple experiments. On the other hand, numerous neuropathologies are associated with TE dysregulation. Here, we provide a comprehensive review of literature about the role of TEs in neurons published over the last three decades. The first chapter of the present review describes known mechanisms of TE interaction with host genomes in general, with the focus on mammalian and human TEs; the second chapter provides examples of TE exaptation in normal neuronal tissue, including TE involvement in neuronal differentiation and plasticity; and the last chapter lists TE-related neuropathologies. We sought to provide specific molecular mechanisms of TE involvement in neuron-specific processes whenever possible; however, in many cases, only phenomenological reports were available. This underscores the importance of further studies in this area.
Affiliation(s)
- Ekaterina Chesnokova
- Laboratory of Cellular Neurobiology of Learning, Institute of Higher Nervous Activity and Neurophysiology of the Russian Academy of Sciences, 117485 Moscow, Russia; (A.B.); (P.K.)
9
Cano-Sánchez E, Rodríguez-Gómez F, Ruedas LA, Oyama K, León-Paniagua L, Mastretta-Yanes A, Velazquez A. Using Ultraconserved Elements to Unravel Lagomorph Phylogenetic Relationships. J Mamm Evol 2022. [DOI: 10.1007/s10914-021-09595-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
10
Correa M, Lerat E, Birmelé E, Samson F, Bouillon B, Normand K, Rizzon C. The Transposable Element Environment of Human Genes Differs According to Their Duplication Status and Essentiality. Genome Biol Evol 2021; 13:6273345. [PMID: 33973013 PMCID: PMC8155550 DOI: 10.1093/gbe/evab062] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Accepted: 03/17/2021] [Indexed: 12/13/2022]
Abstract
Transposable elements (TEs) are major components of eukaryotic genomes and represent approximately 45% of the human genome. TEs can be important sources of novelty in genomes and there is increasing evidence that TEs contribute to the evolution of gene regulation in mammals. Gene duplication is an evolutionary mechanism that also provides new genetic material and opportunities to acquire new functions. To investigate how duplicated genes are maintained in genomes, here, we explored the TE environment of duplicated and singleton genes. We found that singleton genes have more short-interspersed nuclear elements and DNA transposons in their vicinity than duplicated genes, whereas long-interspersed nuclear elements and long-terminal repeat retrotransposons have accumulated more near duplicated genes. We also discovered that this result is highly associated with the degree of essentiality of the genes with an unexpected accumulation of short-interspersed nuclear elements and DNA transposons around the more-essential genes. Our results underline the importance of taking into account the TE environment of genes to better understand how duplicated genes are maintained in genomes.
Affiliation(s)
- Margot Correa
- Laboratoire de Mathématiques et Modélisation d'Evry (LaMME), UMR CNRS 8071, ENSIIE, USC INRA, Université d'Evry Val d'Essonne, Evry, France
- Emmanuelle Lerat
- Laboratoire de Biométrie et Biologie Evolutive, UMR 5558, Université de Lyon, Université Lyon 1, CNRS, Villeurbanne, France
- Etienne Birmelé
- Laboratoire MAP5 UMR 8145, Université de Paris, Paris, France
- Franck Samson
- Laboratoire de Mathématiques et Modélisation d'Evry (LaMME), UMR CNRS 8071, ENSIIE, USC INRA, Université d'Evry Val d'Essonne, Evry, France
- Bérengère Bouillon
- Laboratoire de Mathématiques et Modélisation d'Evry (LaMME), UMR CNRS 8071, ENSIIE, USC INRA, Université d'Evry Val d'Essonne, Evry, France
- Kévin Normand
- Laboratoire de Mathématiques et Modélisation d'Evry (LaMME), UMR CNRS 8071, ENSIIE, USC INRA, Université d'Evry Val d'Essonne, Evry, France
- Carène Rizzon
- Laboratoire de Mathématiques et Modélisation d'Evry (LaMME), UMR CNRS 8071, ENSIIE, USC INRA, Université d'Evry Val d'Essonne, Evry, France
11
Parada A, Hanson J, D'Elía G. Ultraconserved Elements Improve the Resolution of Difficult Nodes within the Rapid Radiation of Neotropical Sigmodontine Rodents (Cricetidae: Sigmodontinae). Syst Biol 2021; 70:1090-1100. [PMID: 33787920 DOI: 10.1093/sysbio/syab023] [Citation(s) in RCA: 13] [Impact Index Per Article: 3.3] [Received: 11/07/2019] [Revised: 03/23/2021] [Accepted: 03/29/2021] [Indexed: 11/14/2022]
Abstract
Sigmodontine rodents (Cricetidae, Sigmodontinae) represent the second largest muroid subfamily and the most species-rich group of New World mammals, encompassing more than 410 living species and ca. 87 genera. Despite recent advances in clarifying sigmodontine phylogenetic relationships, the relationships among the 12 main groups of genera (i.e., tribes) remain poorly resolved, in particular among those forming the large clade Oryzomyalia. This pattern has been interpreted as a consequence of a rapid radiation upon the group's entrance into South America. Here, we attempted to resolve phylogenetic relationships within Sigmodontinae using target capture and high-throughput sequencing of ultraconserved elements (UCEs). We enriched and sequenced UCEs for 56 individuals and collected data from four already available genomes. Analyses of distinct data sets, based on the capture of 4,634 loci, resulted in a highly resolved phylogeny consistent across different methods. Coalescent species-tree based approaches, concatenated matrices, and Bayesian analyses recovered similar topologies that were congruent at the resolution of difficult nodes. We recovered good support for the intertribal relationships within Oryzomyalia; for instance, the tribe Oryzomyini appears as the sister taxon of the remaining oryzomyalid tribes. The estimates of divergence times agree with results of previous studies. We inferred the crown age of the sigmodontine rodents at the end of the Middle Miocene, while the main lineages of Oryzomyalia appear to have radiated in a short interval during the Late Miocene. Thus, the collection of a genomic-scale data set with wide taxonomic sampling provided, for the first time, resolution of the relationships among the main lineages of Sigmodontinae. We expect the phylogeny presented here will become the backbone for future systematic and evolutionary studies of the group.
Affiliation(s)
- Andrés Parada
- Instituto de Ciencias Ambientales y Evolutivas, Facultad de Ciencias, Universidad Austral de Chile, Valdivia, Chile
- John Hanson
- RTLGenomics, Lubbock, TX, USA. Department of Biology, Columbus State University, Columbus, GA, USA
- Guillermo D'Elía
- Instituto de Ciencias Ambientales y Evolutivas, Facultad de Ciencias, Universidad Austral de Chile, Valdivia, Chile
12
Phylogenomic Reconstruction of the Neotropical Poison Frogs (Dendrobatidae) and Their Conservation. Diversity-Basel 2019. [DOI: 10.3390/d11080126] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 12/31/2022]
Abstract
The evolutionary history of the Dendrobatidae, the charismatic Neotropical poison frog family, remains in flux, even after a half-century of intensive research. Understanding the evolutionary relationships between dendrobatid genera and the larger-order groups within Dendrobatidae is critical for making accurate assessments of all aspects of their biology and evolution. In this study, we provide the first phylogenomic reconstruction of Dendrobatidae with genome-wide nuclear markers known as ultraconserved elements. We performed sequence capture on 61 samples representing 33 species across 13 of the 16 dendrobatid genera, aiming for a broadly representative taxon sample. We compare topologies generated using maximum likelihood and coalescent methods and estimate divergence times using Bayesian methods. We find most of our dendrobatid tree to be consistent with previously published results based on mitochondrial and low-count nuclear data, with notable exceptions regarding the placement of Hyloxalinae and certain genera within Dendrobatinae. We also characterize how the evolutionary history and geographic distributions of the 285 poison frog species impact their conservation status. We hope that our phylogeny will serve as a backbone for future evolutionary studies and that our characterizations of conservation status inform conservation practices while highlighting taxa in need of further study.
13
Speciation, gene flow, and seasonal migration in Catharus thrushes (Aves:Turdidae). Mol Phylogenet Evol 2019; 139:106564. [PMID: 31330265 DOI: 10.1016/j.ympev.2019.106564] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 04/05/2019] [Revised: 07/16/2019] [Accepted: 07/16/2019] [Indexed: 10/26/2022]
Abstract
New World thrushes in the genus Catharus are small, insectivorous or omnivorous birds that have been used to explore several important questions in avian evolution, including the evolution of seasonal migration and plumage variation. Within Catharus, members of a clade of obligate long-distance migrants (C. fuscescens, C. minimus, and C. bicknelli) have also been used in the development of heteropatric speciation theory, a divergence process in which migratory lineages (which might occur in allopatry or sympatry during portions of their annual cycle) diverge despite low levels of gene flow. However, research on Catharus relationships has thus far been restricted to the use of small genetic datasets, which provide limited resolution of both phylogenetic and demographic histories. We used a large, multi-locus dataset from loci containing ultraconserved elements (UCEs) to study the demographic histories of the migratory C. fuscescens-minimus-bicknelli clade and to resolve the phylogeny of the migratory species of Catharus. Our dataset included more than 2000 loci and over 1700 variable genotyped sites, and analyses supported our prediction of divergence with gene flow in the fully migratory clade, with significant gene flow among all three species. Our phylogeny of the genus differs from past work in its placement of C. ustulatus, and further analyses suggest historic gene flow throughout the genus, producing genetically reticulate (or network) phylogenies. This raises questions about trait origins and suggests that seasonal migration and the resulting migratory condition of heteropatry is likely to promote hybridization not only during pairwise divergence and speciation, but also among non-sisters.
14
Flasch DA, Macia Á, Sánchez L, Ljungman M, Heras SR, García-Pérez JL, Wilson TE, Moran JV. Genome-wide de novo L1 Retrotransposition Connects Endonuclease Activity with Replication. Cell 2019; 177:837-851.e28. [PMID: 30955886 DOI: 10.1016/j.cell.2019.02.050] [Citation(s) in RCA: 77] [Impact Index Per Article: 12.8] [Received: 09/13/2018] [Revised: 01/10/2019] [Accepted: 02/25/2019] [Indexed: 12/18/2022]
Abstract
L1 retrotransposon-derived sequences comprise approximately 17% of the human genome. Darwinian selective pressures alter L1 genomic distributions during evolution, confounding the ability to determine initial L1 integration preferences. Here, we generated high-confidence datasets of greater than 88,000 engineered L1 insertions in human cell lines that act as proxies for cells that accommodate retrotransposition in vivo. Comparing these insertions to a null model, in which L1 endonuclease activity is the sole determinant dictating L1 integration preferences, demonstrated that L1 insertions are not significantly enriched in genes, transcribed regions, or open chromatin. By comparison, we provide compelling evidence that the L1 endonuclease disproportionately cleaves predominant lagging strand DNA replication templates, while lagging strand 3'-hydroxyl groups may prime endonuclease-independent L1 retrotransposition in a Fanconi anemia cell line. Thus, acquisition of an endonuclease domain, in conjunction with the ability to integrate into replicating DNA, allowed L1 to become an autonomous, interspersed retrotransposon.
Affiliation(s)
- Diane A Flasch
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, Michigan, 48109, USA.
- Ángela Macia
- Department of Genomic Medicine, GENYO: Centre for Genomics and Oncology (Pfizer-University of Granada and Andalusian Regional Government), PTS Granada, 18016, Spain
- Laura Sánchez
- Department of Genomic Medicine, GENYO: Centre for Genomics and Oncology (Pfizer-University of Granada and Andalusian Regional Government), PTS Granada, 18016, Spain
- Mats Ljungman
- Department of Radiation Oncology, University of Michigan Comprehensive Cancer Center, Translational Oncology Program and Center for RNA Biomedicine, University of Michigan, Ann Arbor, Michigan, 48109, USA; Department of Environmental Health Sciences, School of Public Health, University of Michigan, Ann Arbor, Michigan, 48109, USA
- Sara R Heras
- Department of Genomic Medicine, GENYO: Centre for Genomics and Oncology (Pfizer-University of Granada and Andalusian Regional Government), PTS Granada, 18016, Spain
- José L García-Pérez
- Department of Genomic Medicine, GENYO: Centre for Genomics and Oncology (Pfizer-University of Granada and Andalusian Regional Government), PTS Granada, 18016, Spain; Medical Research Council Human Genetics Unit, Institute of Genetics and Molecular Medicine (IGMM), University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
- Thomas E Wilson
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, Michigan, 48109, USA; Department of Pathology, University of Michigan Medical School, Ann Arbor, Michigan, 48109, USA.
- John V Moran
- Department of Human Genetics, University of Michigan Medical School, Ann Arbor, Michigan, 48109, USA; Department of Internal Medicine, University of Michigan Medical School, Ann Arbor, Michigan, 48109, USA.
15
Sugino K, Clark E, Schulmann A, Shima Y, Wang L, Hunt DL, Hooks BM, Tränkner D, Chandrashekar J, Picard S, Lemire AL, Spruston N, Hantman AW, Nelson SB. Mapping the transcriptional diversity of genetically and anatomically defined cell populations in the mouse brain. eLife 2019; 8:38619. [PMID: 30977723 PMCID: PMC6499542 DOI: 10.7554/elife.38619] [Citation(s) in RCA: 49] [Impact Index Per Article: 8.2] [Received: 07/20/2018] [Accepted: 04/11/2019] [Indexed: 01/27/2023]
Abstract
Understanding the principles governing neuronal diversity is a fundamental goal for neuroscience. Here, we provide an anatomical and transcriptomic database of nearly 200 genetically identified cell populations. By separately analyzing the robustness and pattern of expression differences across these cell populations, we identify two gene classes contributing distinctly to neuronal diversity. Short homeobox transcription factors distinguish neuronal populations combinatorially, and exhibit extremely low transcriptional noise, enabling highly robust expression differences. Long neuronal effector genes, such as channels and cell adhesion molecules, contribute disproportionately to neuronal diversity, based on their patterns rather than robustness of expression differences. By linking transcriptional identity to genetic strains and anatomical atlases, we provide an extensive resource for further investigation of mouse neuronal cell types.
Affiliation(s)
- Ken Sugino
- Janelia Research Campus, Ashburn, United States
| | | | | | | | - Lihua Wang
- Janelia Research Campus, Ashburn, United States
|
16
|
DiBattista JD, Alfaro ME, Sorenson L, Choat JH, Hobbs JA, Sinclair‐Taylor TH, Rocha LA, Chang J, Luiz OJ, Cowman PF, Friedman M, Berumen ML. Ice ages and butterflyfishes: Phylogenomics elucidates the ecological and evolutionary history of reef fishes in an endemism hotspot. Ecol Evol 2018; 8:10989-11008. [PMID: 30519422 PMCID: PMC6262737 DOI: 10.1002/ece3.4566] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 05/12/2018] [Revised: 08/19/2018] [Accepted: 08/29/2018] [Indexed: 01/19/2023] Open
Abstract
For tropical marine species, hotspots of endemism occur in peripheral areas furthest from the center of diversity, but the evolutionary processes that lead to their origin remain elusive. We test several hypotheses related to the evolution of peripheral endemics by sequencing ultraconserved element (UCE) loci to produce a genome-scale phylogeny of 47 butterflyfish species (family Chaetodontidae) that includes all shallow water butterflyfish from the coastal waters of the Arabian Peninsula (i.e., Red Sea to Arabian Gulf) and their close relatives. Bayesian tree building methods produced a well-resolved phylogeny that elucidated the origins of butterflyfishes in this hotspot of endemism. We show that UCEs, often used to resolve deep evolutionary relationships, represent an important tool to assess the mechanisms underlying recently diverged taxa. Our analyses indicate that unique environmental conditions in the coastal waters of the Arabian Peninsula probably contributed to the formation of endemic butterflyfishes. Older endemic species are also associated with narrow versus broad depth ranges, suggesting that adaptation to deeper coral reefs in this region occurred only recently (<1.75 Ma). Even though deep reef environments were drastically reduced during the extreme low sea level stands of glacial ages, shallow reefs persisted, and as such there was no evidence supporting mass extirpation of fauna in this region.
Affiliation(s)
- Joseph D. DiBattista
- Red Sea Research Center, Division of Biological and Environmental Science and Engineering, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
- Australian Museum Research Institute, Australian Museum, Sydney, New South Wales, Australia
- School of Molecular and Life Sciences, Curtin University, Perth, Western Australia, Australia
| | - Michael E. Alfaro
- Department of Ecology and Evolutionary Biology, University of California Los Angeles, Los Angeles, California
| | - Laurie Sorenson
- Department of Ecology and Evolutionary Biology, University of California Los Angeles, Los Angeles, California
| | - John H. Choat
- College of Science and Engineering, James Cook University, Townsville, Queensland, Australia
| | - Jean‐Paul A. Hobbs
- School of Molecular and Life Sciences, Curtin University, Perth, Western Australia, Australia
| | - Tane H. Sinclair‐Taylor
- Red Sea Research Center, Division of Biological and Environmental Science and Engineering, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
| | - Luiz A. Rocha
- Section of Ichthyology, California Academy of Sciences, San Francisco, California
| | - Jonathan Chang
- Department of Ecology and Evolutionary Biology, University of California Los Angeles, Los Angeles, California
| | - Osmar J. Luiz
- Research Institute for the Environment and Livelihoods, Charles Darwin University, Darwin, Northern Territory, Australia
| | - Peter F. Cowman
- ARC Centre of Excellence for Coral Reef Studies, James Cook University, Townsville, Queensland, Australia
| | - Matt Friedman
- Department of Earth Sciences, University of Oxford, Oxford, UK
- Museum of Paleontology and Department of Earth and Environmental Sciences, University of Michigan, Ann Arbor, Michigan
| | - Michael L. Berumen
- Red Sea Research Center, Division of Biological and Environmental Science and Engineering, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia
|
17
|
Winker K, Glenn TC, Faircloth BC. Ultraconserved elements (UCEs) illuminate the population genomics of a recent, high-latitude avian speciation event. PeerJ 2018; 6:e5735. [PMID: 30310754 PMCID: PMC6174879 DOI: 10.7717/peerj.5735] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 04/04/2018] [Accepted: 09/05/2018] [Indexed: 01/08/2023] Open
Abstract
Using a large, consistent set of loci shared by descent (orthologous) to study relationships among taxa would revolutionize among-lineage comparisons of divergence and speciation processes. Ultraconserved elements (UCEs), highly conserved regions of the genome, offer such genomic markers. The utility of UCEs for deep phylogenetics is clearly established and there are mature analytical frameworks available, but fewer studies apply UCEs to recent evolutionary events, creating a need for additional example datasets and analytical approaches. We used UCEs to study population genomics in snow and McKay's buntings (Plectrophenax nivalis and P. hyperboreus). Prior work suggested divergence of these sister species during the last glacial maximum (∼18-74 Kya). With a sequencing depth of ∼30× from four individuals of each species, we used a series of analysis tools to genotype both alleles, obtaining a complete dataset of 2,635 variable loci (∼3.6 single nucleotide polymorphisms/locus) and 796 invariable loci. We found no fixed allelic differences between the lineages, and few loci had large allele frequency differences. Nevertheless, individuals were 100% diagnosable to species, and the two taxa were different genetically (FST = 0.034; P = 0.03). The demographic model best fitting the data was one of divergence with gene flow. Estimates of demographic parameters differed from published mtDNA research, with UCE data suggesting lower effective population sizes (∼92,500-240,500 individuals), a deeper divergence time (∼241,000 years), and lower gene flow (2.8-5.2 individuals per generation). Our methods provide a framework for future population studies using UCEs, and our results provide additional evidence that UCEs are useful for answering questions at shallow evolutionary depths.
Affiliation(s)
- Kevin Winker
- University of Alaska Museum & Department of Biology and Wildlife, University of Alaska Fairbanks, Fairbanks, AK, USA
| | - Travis C. Glenn
- Department of Environmental Health Science and Institute of Bioinformatics, University of Georgia, Athens, GA, USA
| | - Brant C. Faircloth
- Department of Biological Sciences and Museum of Natural Science, Louisiana State University, Baton Rouge, LA, USA
|
18
|
Li L, Barth NKH, Hirth E, Taher L. Pairs of Adjacent Conserved Noncoding Elements Separated by Conserved Genomic Distances Act as Cis-Regulatory Units. Genome Biol Evol 2018; 10:2535-2550. [PMID: 30184074 PMCID: PMC6161761 DOI: 10.1093/gbe/evy196] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Accepted: 09/01/2018] [Indexed: 01/02/2023] Open
Abstract
Comparative genomic studies have identified thousands of conserved noncoding elements (CNEs) in the mammalian genome, many of which have been reported to exert cis-regulatory activity. We analyzed ∼5,500 pairs of adjacent CNEs in the human genome and found that despite divergence at the nucleotide sequence level, the inter-CNE distances of the pairs are under strong evolutionary constraint, with inter-CNE sequences featuring significantly lower transposon densities than expected. Further, we show that different degrees of conservation of the inter-CNE distance are associated with distinct cis-regulatory functions at the CNEs. Specifically, the CNEs in pairs with conserved and mildly contracted inter-CNE sequences are the most likely to represent active or poised enhancers. In contrast, CNEs in pairs with extremely contracted or expanded inter-CNE sequences are associated with no cis-regulatory activity. Furthermore, we observed that functional CNEs in a pair have very similar epigenetic profiles, hinting at a functional relationship between them. Taken together, our results support the existence of epistatic interactions between adjacent CNEs that are distance-sensitive and disrupted by transposon insertions and deletions, and contribute to our understanding of the selective forces acting on cis-regulatory elements, which are crucial for elucidating the molecular mechanisms underlying adaptive evolution and human genetic diseases.
Affiliation(s)
- Lifei Li
- Division of Bioinformatics, Department of Biology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Nicolai K H Barth
- Division of Bioinformatics, Department of Biology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Eva Hirth
- Division of Bioinformatics, Department of Biology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
| | - Leila Taher
- Division of Bioinformatics, Department of Biology, Friedrich-Alexander-Universität Erlangen-Nürnberg, Erlangen, Germany
|
19
|
Soares ML, Edwards CA, Dearden FL, Ferrón SR, Curran S, Corish JA, Rancourt RC, Allen SE, Charalambous M, Ferguson-Smith MA, Rens W, Adams DJ, Ferguson-Smith AC. Targeted deletion of a 170-kb cluster of LINE-1 repeats and implications for regional control. Genome Res 2018; 28:345-356. [PMID: 29367313 PMCID: PMC5848613 DOI: 10.1101/gr.221366.117] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 02/04/2017] [Accepted: 01/10/2018] [Indexed: 12/31/2022]
Abstract
Approximately half the mammalian genome is composed of repetitive sequences, and accumulating evidence suggests that some may have an impact on genome function. Here, we characterized a large array class of repeats of long-interspersed elements (LINE-1). Although widely distributed in mammals, locations of such arrays are species specific. Using targeted deletion, we asked whether a 170-kb LINE-1 array located at a mouse imprinted domain might function as a modulator of local transcriptional control. The LINE-1 array is lamina associated in differentiated ES cells consistent with its AT-richness, and although imprinting occurs both proximally and distally to the array, active LINE-1 transcripts within the tract are biallelically expressed. Upon deletion of the array, no perturbation of imprinting was observed, and abnormal phenotypes were not detected in maternal or paternal heterozygous or homozygous mutant mice. The array does not shield nonimprinted genes in the vicinity from local imprinting control. Reduced neural expression of protein-coding genes observed upon paternal transmission of the deletion is likely due to the removal of a brain-specific enhancer embedded within the LINE array. Our findings suggest that presence of a 170-kb LINE-1 array reflects the tolerance of the site for repeat insertion rather than an important genomic function in normal development.
Affiliation(s)
- Miguel L Soares
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
- Departamento de Biomedicina, Unidade de Biologia Experimental, Faculdade de Medicina da Universidade do Porto, Porto; and i3S-Instituto de Investigação e Inovação em Saúde, Universidade do Porto, 4200-319 Porto, Portugal
| | - Carol A Edwards
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
| | - Frances L Dearden
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
| | - Sacri R Ferrón
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
| | - Scott Curran
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
| | - Jennifer A Corish
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
| | - Rebecca C Rancourt
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
| | - Sarah E Allen
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
| | - Marika Charalambous
- Department of Genetics, University of Cambridge, Cambridge CB2 3EH, United Kingdom
| | | | - Willem Rens
- Department of Veterinary Medicine, University of Cambridge, Cambridge CB3 0ES, United Kingdom
| | - David J Adams
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom
|
20
|
Jacob-Hirsch J, Eyal E, Knisbacher BA, Roth J, Cesarkas K, Dor C, Farage-Barhom S, Kunik V, Simon AJ, Gal M, Yalon M, Moshitch-Moshkovitz S, Tearle R, Constantini S, Levanon EY, Amariglio N, Rechavi G. Whole-genome sequencing reveals principles of brain retrotransposition in neurodevelopmental disorders. Cell Res 2018; 28:187-203. [PMID: 29327725 DOI: 10.1038/cr.2018.8] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Received: 07/17/2017] [Revised: 11/10/2017] [Accepted: 11/20/2017] [Indexed: 02/07/2023] Open
Abstract
Neural progenitor cells undergo somatic retrotransposition events, mainly involving L1 elements, which can be potentially deleterious. Here, we analyze the whole genomes of 20 brain samples and 80 non-brain samples, and characterize the retrotransposition landscape of patients affected by a variety of neurodevelopmental disorders including Rett syndrome, tuberous sclerosis, ataxia-telangiectasia and autism. We report that the number of retrotranspositions in brain tissues is higher than that observed in non-brain samples and even higher in pathologic vs normal brains. The majority of somatic brain retrotransposons integrate into pre-existing repetitive elements, preferentially A/T rich L1 sequences, resulting in nested insertions. Our findings document the fingerprints of encoded endonuclease independent mechanisms in the majority of L1 brain insertion events. The insertions are "non-classical" in that they are truncated at both ends, integrate in the same orientation as the host element, and their target sequences are enriched with a CCATT motif in contrast to the classical endonuclease motif of most other retrotranspositions. We show that L1Hs elements integrate preferentially into genes associated with neural functions and diseases. We propose that pre-existing retrotransposons act as "lightning rods" for novel insertions, which may give fine modulation of gene expression while safeguarding from deleterious events. Overwhelmingly uncontrolled retrotransposition may breach this safeguard mechanism and increase the risk of harmful mutagenesis in neurodevelopmental disorders.
Affiliation(s)
- Jasmine Jacob-Hirsch
- Cancer Research Center and the Wohl Institute of Translational Medicine, the Chaim Sheba Medical Center, Tel Hashomer, Israel; Mina and Everard Goodman Faculty of Life Sciences, Bar Ilan University, Israel
| | - Eran Eyal
- Cancer Research Center and the Wohl Institute of Translational Medicine, the Chaim Sheba Medical Center, Tel Hashomer, Israel
| | | | - Jonathan Roth
- Department of Pediatric Neurosurgery, Dana Children's Hospital, Tel Aviv Medical Center, Israel; Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
| | - Karen Cesarkas
- Cancer Research Center and the Wohl Institute of Translational Medicine, the Chaim Sheba Medical Center, Tel Hashomer, Israel
| | - Chen Dor
- Cancer Research Center and the Wohl Institute of Translational Medicine, the Chaim Sheba Medical Center, Tel Hashomer, Israel
| | - Sarit Farage-Barhom
- Cancer Research Center and the Wohl Institute of Translational Medicine, the Chaim Sheba Medical Center, Tel Hashomer, Israel
| | - Vered Kunik
- Cancer Research Center and the Wohl Institute of Translational Medicine, the Chaim Sheba Medical Center, Tel Hashomer, Israel
| | - Amos J Simon
- Cancer Research Center and the Wohl Institute of Translational Medicine, the Chaim Sheba Medical Center, Tel Hashomer, Israel
| | - Moran Gal
- Mina and Everard Goodman Faculty of Life Sciences, Bar Ilan University, Israel
| | - Michal Yalon
- Department of Pediatric Hematology-Oncology, Edmond and Lily Safra Children's Hospital, The Chaim Sheba Medical Center, Tel Hashomer, Israel
| | - Sharon Moshitch-Moshkovitz
- Cancer Research Center and the Wohl Institute of Translational Medicine, the Chaim Sheba Medical Center, Tel Hashomer, Israel
| | - Rick Tearle
- Complete Genomics, 2071 Stierlin Court, Mountain View, CA 94043, USA
| | - Shlomi Constantini
- Department of Pediatric Neurosurgery, Dana Children's Hospital, Tel Aviv Medical Center, Israel; Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
| | - Erez Y Levanon
- Mina and Everard Goodman Faculty of Life Sciences, Bar Ilan University, Israel
| | - Ninette Amariglio
- Cancer Research Center and the Wohl Institute of Translational Medicine, the Chaim Sheba Medical Center, Tel Hashomer, Israel; Mina and Everard Goodman Faculty of Life Sciences, Bar Ilan University, Israel
| | - Gideon Rechavi
- Cancer Research Center and the Wohl Institute of Translational Medicine, the Chaim Sheba Medical Center, Tel Hashomer, Israel; Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv, Israel
|
21
|
Harmston N, Ing-Simmons E, Tan G, Perry M, Merkenschlager M, Lenhard B. Topologically associating domains are ancient features that coincide with Metazoan clusters of extreme noncoding conservation. Nat Commun 2017; 8:441. [PMID: 28874668 PMCID: PMC5585340 DOI: 10.1038/s41467-017-00524-5] [Citation(s) in RCA: 114] [Impact Index Per Article: 14.3] [Received: 10/23/2016] [Accepted: 07/05/2017] [Indexed: 02/08/2023] Open
Abstract
Developmental genes in metazoan genomes are surrounded by dense clusters of conserved noncoding elements (CNEs). CNEs exhibit unexplained extreme levels of sequence conservation, with many acting as developmental long-range enhancers. Clusters of CNEs define the span of regulatory inputs for many important developmental regulators and have been described previously as genomic regulatory blocks (GRBs). Their function and distribution around important regulatory genes raises the question of how they relate to 3D conformation of these loci. Here, we show that clusters of CNEs strongly coincide with topological organisation, predicting the boundaries of hundreds of topologically associating domains (TADs) in human and Drosophila. The set of TADs that are associated with high levels of noncoding conservation exhibit distinct properties compared to TADs devoid of extreme noncoding conservation. The close correspondence between extreme noncoding conservation and TADs suggests that these TADs are ancient, revealing a regulatory architecture conserved over hundreds of millions of years. Metazoan genomes contain many clusters of conserved noncoding elements. Here, the authors provide evidence that these clusters coincide with distinct topologically associating domains in humans and Drosophila, revealing a conserved regulatory genomic architecture.
Affiliation(s)
- Nathan Harmston
- Computational Regulatory Genomics, MRC London Institute of Medical Sciences, London, W12 0NN, UK; Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, W12 0NN, UK; Program in Cardiovascular and Metabolic Disease, Duke-NUS Graduate Medical School, 8 College Road, Singapore, 169857, Singapore
| | - Elizabeth Ing-Simmons
- Computational Regulatory Genomics, MRC London Institute of Medical Sciences, London, W12 0NN, UK; Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, W12 0NN, UK; Lymphocyte Development, MRC London Institute of Medical Sciences, London, W12 0NN, UK
| | - Ge Tan
- Computational Regulatory Genomics, MRC London Institute of Medical Sciences, London, W12 0NN, UK; Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, W12 0NN, UK
| | - Malcolm Perry
- Computational Regulatory Genomics, MRC London Institute of Medical Sciences, London, W12 0NN, UK; Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, W12 0NN, UK
| | - Matthias Merkenschlager
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, W12 0NN, UK; Lymphocyte Development, MRC London Institute of Medical Sciences, London, W12 0NN, UK
| | - Boris Lenhard
- Computational Regulatory Genomics, MRC London Institute of Medical Sciences, London, W12 0NN, UK; Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, W12 0NN, UK; Sars International Centre for Marine Molecular Biology, University of Bergen, N-5008, Bergen, Norway
|
22
|
Tosetti V, Sassone J, Ferri ALM, Taiana M, Bedini G, Nava S, Brenna G, Di Resta C, Pareyson D, Di Giulio AM, Carelli S, Parati EA, Gorio A. Transcriptional role of androgen receptor in the expression of long non-coding RNA Sox2OT in neurogenesis. PLoS One 2017; 12:e0180579. [PMID: 28704421 PMCID: PMC5507538 DOI: 10.1371/journal.pone.0180579] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.3] [Received: 03/23/2017] [Accepted: 06/16/2017] [Indexed: 11/19/2022] Open
Abstract
The complex architecture of the adult brain derives from tightly regulated migration and differentiation of precursor cells generated during embryonic neurogenesis. Changes at the transcriptional level of genes that regulate migration and differentiation may lead to neurodevelopmental disorders. Androgen receptor (AR) is a transcription factor that is already expressed during early embryonic days. However, the role of AR in the regulation of gene expression at the early embryonic stage is yet to be determined. Long non-coding RNA (lncRNA) Sox2 overlapping transcript (Sox2OT) plays a crucial role in gene expression control during development, but its transcriptional regulation is still to be clearly defined. Here, using Bicalutamide to pharmacologically inactivate AR, we investigated whether AR participates in the regulation of the transcription of the lncRNA Sox2OT at the early embryonic stage. We identified a new DNA binding region upstream of the Sox2 locus containing three androgen response elements (ARE), and found that AR binds this sequence in embryonic neural stem cells and in mouse embryonic brain. Our data suggest that through this binding, AR can promote the RNA polymerase II dependent transcription of Sox2OT. Our findings also suggest that AR participates in embryonic neurogenesis through transcriptional control of the long non-coding RNA Sox2OT.
Affiliation(s)
- Valentina Tosetti
- Department of Cerebrovascular Diseases, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milano, Italy
- Laboratory of Pharmacology, Department of Health Sciences, University of Milan, Milan, Italy
| | - Jenny Sassone
- Vita-Salute University and San Raffaele Scientific Institute, Division of Neuroscience, Milan, Italy
| | - Anna L. M. Ferri
- Department of Cerebrovascular Diseases, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milano, Italy
| | - Michela Taiana
- Clinic of Central and Peripheral Degenerative Neuropathies Unit, Department of Clinical Neurosciences, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy
| | - Gloria Bedini
- Department of Cerebrovascular Diseases, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milano, Italy
| | - Sara Nava
- Cell Therapy Production Unit, Laboratory of Cellular Neurobiology, Cerebrovascular Unit, and Unit of Molecular Neuro-Oncology, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy
| | - Greta Brenna
- Biostatistician Service Clinical Research—Scientific Department, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy
| | - Chiara Di Resta
- Vita-Salute San Raffaele University, Milan, Italy
- Division of Genetics and Cell Biology, IRCCS San Raffaele Scientific Institute, Milan, Italy
| | - Davide Pareyson
- Neurological Rare Diseases of Adulthood Unit, Department of Clinical Neurosciences, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milan, Italy
| | - Anna Maria Di Giulio
- Laboratory of Pharmacology, Department of Health Sciences, University of Milan, Milan, Italy
- Pediatric Clinical Research Center Fondazione Romeo e Enrica Invernizzi, University of Milan, Milan, Italy
| | - Stephana Carelli
- Laboratory of Pharmacology, Department of Health Sciences, University of Milan, Milan, Italy
| | - Eugenio A. Parati
- Department of Cerebrovascular Diseases, Fondazione IRCCS Istituto Neurologico Carlo Besta, Milano, Italy
| | - Alfredo Gorio
- Laboratory of Pharmacology, Department of Health Sciences, University of Milan, Milan, Italy
|
23
|
Babaian A, Mager DL. Endogenous retroviral promoter exaptation in human cancer. Mob DNA 2016; 7:24. [PMID: 27980689 PMCID: PMC5134097 DOI: 10.1186/s13100-016-0080-x] [Citation(s) in RCA: 159] [Impact Index Per Article: 17.7] [Received: 09/01/2016] [Accepted: 11/11/2016] [Indexed: 12/13/2022] Open
Abstract
Cancer arises from a series of genetic and epigenetic changes, which result in abnormal expression or mutational activation of oncogenes, as well as suppression/inactivation of tumor suppressor genes. Aberrant expression of coding genes or long non-coding RNAs (lncRNAs) with oncogenic properties can be caused by translocations, gene amplifications, point mutations or other less characterized mechanisms. One such mechanism is the inappropriate usage of normally dormant, tissue-restricted or cryptic enhancers or promoters that serve to drive oncogenic gene expression. Dispersed across the human genome, endogenous retroviruses (ERVs) provide an enormous reservoir of autonomous gene regulatory modules, some of which have been co-opted by the host during evolution to play important roles in normal regulation of genes and gene networks. This review focuses on the “dark side” of such ERV regulatory capacity. Specifically, we discuss a growing number of examples of normally dormant or epigenetically repressed ERVs that have been harnessed to drive oncogenes in human cancer, a process we term onco-exaptation, and we propose potential mechanisms that may underlie this phenomenon.
Affiliation(s)
- Artem Babaian
- Terry Fox Laboratory, British Columbia Cancer Agency, 675 West 10th Avenue, Vancouver, BC V5Z1L3 Canada; Department of Medical Genetics, University of British Columbia, Vancouver, BC Canada
| | - Dixie L Mager
- Terry Fox Laboratory, British Columbia Cancer Agency, 675 West 10th Avenue, Vancouver, BC V5Z1L3 Canada; Department of Medical Genetics, University of British Columbia, Vancouver, BC Canada
|
24
|
Chuong EB, Elde NC, Feschotte C. Regulatory activities of transposable elements: from conflicts to benefits. Nat Rev Genet 2016; 18:71-86. [PMID: 27867194 DOI: 10.1038/nrg.2016.139] [Citation(s) in RCA: 827] [Impact Index Per Article: 91.9] [Indexed: 12/19/2022]
Abstract
Transposable elements (TEs) are a prolific source of tightly regulated, biochemically active non-coding elements, such as transcription factor-binding sites and non-coding RNAs. Many recent studies reinvigorate the idea that these elements are pervasively co-opted for the regulation of host genes. We argue that the inherent genetic properties of TEs and the conflicting relationships with their hosts facilitate their recruitment for regulatory functions in diverse genomes. We review recent findings supporting the long-standing hypothesis that the waves of TE invasions endured by organisms for eons have catalysed the evolution of gene-regulatory networks. We also discuss the challenges of dissecting and interpreting the phenotypic effect of regulatory activities encoded by TEs in health and disease.
Affiliation(s)
- Edward B Chuong
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah 84103, USA
| | - Nels C Elde
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah 84103, USA
| | - Cédric Feschotte
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah 84103, USA
|
25
|
Buckley RM, Adelson DL. Mammalian genome evolution as a result of epigenetic regulation of transposable elements. Biomol Concepts 2015; 5:183-94. [PMID: 25372752 DOI: 10.1515/bmc-2014-0013] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 04/07/2014] [Accepted: 05/27/2014] [Indexed: 12/29/2022] Open
Abstract
Transposable elements (TEs) make up a large proportion of mammalian genomes and are a strong evolutionary force capable of rewiring regulatory networks and causing genome rearrangements. Additionally, there are many eukaryotic epigenetic defense mechanisms able to transcriptionally silence TEs. Furthermore, small RNA molecules that target TE DNA sequences often mediate these epigenetic defense mechanisms. As a result, epigenetic marks associated with TE silencing can be reestablished after epigenetic reprogramming - an event during the mammalian life cycle that results in widespread loss of parental epigenetic marks. Furthermore, targeted epigenetic marks associated with TE silencing may have an impact on nearby gene expression. Therefore, TEs may have driven species evolution via their ability to heritably alter the epigenetic regulation of gene expression in mammals.
|
26
|
Shahryari A, Jazi MS, Samaei NM, Mowla SJ. Long non-coding RNA SOX2OT: expression signature, splicing patterns, and emerging roles in pluripotency and tumorigenesis. Front Genet 2015; 6:196. [PMID: 26136768 PMCID: PMC4469893 DOI: 10.3389/fgene.2015.00196] [Citation(s) in RCA: 87] [Impact Index Per Article: 8.7] [Received: 02/27/2015] [Accepted: 05/18/2015] [Indexed: 12/18/2022] Open
Abstract
SOX2 overlapping transcript (SOX2OT) is a long non-coding RNA that harbors one of the major regulators of pluripotency, the SOX2 gene, in its intronic region. The SOX2OT gene maps to the human chromosome 3q26.3 (Chr3q26.3) locus and extends across a highly conserved region of over 700 kb. Little is known about the exact role of SOX2OT; however, recent studies have demonstrated a positive role for it in transcriptional regulation of the SOX2 gene. Similar to SOX2, SOX2OT is highly expressed in embryonic stem cells and down-regulated upon the induction of differentiation. SOX2OT is dynamically regulated during vertebrate embryogenesis and restricted to the brain in adult mice and humans. Recently, dysregulation of SOX2OT expression and its concomitant expression with SOX2 have been highlighted in some somatic cancers, including esophageal squamous cell carcinoma, lung squamous cell carcinoma, and breast cancer. Interestingly, SOX2OT is differentially spliced into multiple mRNA-like transcripts in stem and cancer cells. In this review, we describe the structural and functional features of SOX2OT, with an emphasis on its expression signature, its splicing patterns, and its critical function in the regulation of SOX2 expression during development and tumorigenesis.
Affiliation(s)
- Alireza Shahryari
  - Stem Cell Research Center, Golestan University of Medical Sciences, Gorgan, Iran
- Marie Saghaeian Jazi
  - Department of Molecular Medicine, Faculty of Advanced Medical Technologies, Golestan University of Medical Sciences, Gorgan, Iran
- Nader M Samaei
  - Department of Medical Genetics, Faculty of Advanced Medical Technologies, Golestan University of Medical Sciences, Gorgan, Iran
- Seyed J Mowla
  - Department of Molecular Genetics, Faculty of Biological Sciences, Tarbiat Modares University, Tehran, Iran
|
27
|
Gilbert PS, Chang J, Pan C, Sobel EM, Sinsheimer JS, Faircloth BC, Alfaro ME. Genome-wide ultraconserved elements exhibit higher phylogenetic informativeness than traditional gene markers in percomorph fishes. Mol Phylogenet Evol 2015; 92:140-6. [PMID: 26079130 DOI: 10.1016/j.ympev.2015.05.027] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.9] [Received: 02/14/2015] [Revised: 05/13/2015] [Accepted: 05/26/2015] [Indexed: 02/04/2023]
Abstract
Ultraconserved elements (UCEs) have become popular markers in phylogenomic studies because of their cost effectiveness and their potential to resolve problematic phylogenetic relationships. Although UCE datasets typically contain a much larger number of loci and sites than more traditional datasets of PCR-amplified, single-copy, protein coding genes, a fraction of UCE sites are expected to be part of a nearly invariant core, and the relative performance of UCE datasets versus protein coding gene datasets is poorly understood. Here we use phylogenetic informativeness (PI) to compare the resolving power of multi-locus and UCE datasets in a sample of percomorph fishes with sequenced genomes (genome-enabled). We compare three data sets: UCE core regions, flanking sequence adjacent to the UCE core and a set of ten protein coding genes commonly used in fish systematics. We found the net informativeness of UCE core and flank regions to be roughly ten-fold and 100-fold more informative than that of the protein coding genes. On a per locus basis UCEs and protein coding genes exhibited similar levels of phylogenetic informativeness. Our results suggest that UCEs offer enormous potential for resolving relationships across the percomorph tree of life.
Affiliation(s)
- Princess S Gilbert
  - Department of Ecology & Evolutionary Biology, University of California, Los Angeles, CA, USA
- Jonathan Chang
  - Department of Ecology & Evolutionary Biology, University of California, Los Angeles, CA, USA
- Calvin Pan
  - Department of Medicine, University of California, Los Angeles, CA, USA
- Eric M Sobel
  - Department of Human Genetics, University of California, Los Angeles, CA, USA
- Janet S Sinsheimer
  - Department of Biomathematics, University of California, Los Angeles, CA, USA; Department of Human Genetics, University of California, Los Angeles, CA, USA; Department of Biostatistics, University of California, Los Angeles, CA, USA
- Brant C Faircloth
  - Department of Biological Sciences and Museum of Natural Science, Louisiana State University, Baton Rouge, LA, USA
- Michael E Alfaro
  - Department of Ecology & Evolutionary Biology, University of California, Los Angeles, CA, USA
|
28
|
Cronin MA, Rincon G, Meredith RW, MacNeil MD, Islas-Trejo A, Cánovas A, Medrano JF. Molecular phylogeny and SNP variation of polar bears (Ursus maritimus), brown bears (U. arctos), and black bears (U. americanus) derived from genome sequences. J Hered 2014; 105:312-23. [PMID: 24477675 DOI: 10.1093/jhered/est133] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 11/13/2022]
Abstract
We assessed the relationships of polar bears (Ursus maritimus), brown bears (U. arctos), and black bears (U. americanus) with high throughput genomic sequencing data with an average coverage of 25× for each species. A total of 1.4 billion 100-bp paired-end reads were assembled using the polar bear and annotated giant panda (Ailuropoda melanoleuca) genome sequences as references. We identified 13.8 million single nucleotide polymorphisms (SNP) in the 3 species aligned to the polar bear genome. These data indicate that polar bears and brown bears share more SNP with each other than either does with black bears. Concatenation and coalescence-based analysis of consensus sequences of approximately 1 million base pairs of ultraconserved elements in the nuclear genome resulted in a phylogeny with black bears as the sister group to brown and polar bears, and all brown bears are in a separate clade from polar bears. Genotypes for 162 SNP loci of 336 bears from Alaska and Montana showed that the species are genetically differentiated and there is geographic population structure of brown and black bears but not polar bears.
Affiliation(s)
- Matthew A Cronin
  - School of Natural Resources and Agricultural Sciences, University of Alaska Fairbanks, Palmer Research Center, 1509 South Trunk Road, Palmer, AK 99645
|
29
|
Carareto CMA, Hernandez EH, Vieira C. Genomic regions harboring insecticide resistance-associated Cyp genes are enriched by transposable element fragments carrying putative transcription factor binding sites in two sibling Drosophila species. Gene 2013; 537:93-9. [PMID: 24361809 DOI: 10.1016/j.gene.2013.11.080] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.5] [Received: 07/17/2011] [Revised: 11/27/2013] [Accepted: 11/30/2013] [Indexed: 11/27/2022]
Abstract
In the present study, an in silico analysis was performed to identify transposable element (TE) fragments inserted in Cyps with functions associated with resistance to insecticides and developmental regulation as well as in neighboring genes in two sibling species, Drosophila melanogaster and Drosophila simulans. The Cyps associated with insecticide resistance and their neighboring non-Cyp genes have accumulated a greater number of TE fragments than the other Cyps or a random sample of genes, predominantly in the 5'-flanking regions. Most of the insertions were due to DNA transposons, with DNAREP1 fragments being the most common. These fragments carry putative binding sites for transcription factors, which reinforces the hypothesis that DNAREP1 may influence gene regulation and play a role in the adaptation of the Drosophila species.
Affiliation(s)
- Claudia M A Carareto
  - UNESP-Univ. Estadual Paulista, Departamento de Biologia, Laboratório de Evolução Molecular, 15054-1000 São José do Rio Preto, São Paulo, Brazil
- Eric H Hernandez
  - UNESP-Univ. Estadual Paulista, Departamento de Biologia, Laboratório de Evolução Molecular, 15054-1000 São José do Rio Preto, São Paulo, Brazil
- Cristina Vieira
  - Université de Lyon, F-69000, Lyon, Université Lyon 1, CNRS, UMR5558, Laboratoire de Biométrie et Biologie Evolutive, F-69622, Villeurbanne, France; Institut Universitaire de France, France
|
30
|
Makunin IV, Shloma VV, Stephen SJ, Pheasant M, Belyakin SN. Comparison of ultra-conserved elements in drosophilids and vertebrates. PLoS One 2013; 8:e82362. [PMID: 24349264 PMCID: PMC3862641 DOI: 10.1371/journal.pone.0082362] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 07/08/2013] [Accepted: 10/24/2013] [Indexed: 11/18/2022]
Abstract
Metazoan genomes contain many ultra-conserved elements (UCEs), long sequences identical between distant species. In this study we identified UCEs in drosophilid and vertebrate species with a similar level of phylogenetic divergence measured at protein-coding regions, and demonstrated that both the length and number of UCEs are larger in vertebrates. The proportion of non-exonic UCEs declines in distant drosophilids whilst an opposite trend was observed in vertebrates. We generated a set of 2,126 Sophophora UCEs by merging elements identified in several drosophila species and compared these to the eutherian UCEs identified in placental mammals. In contrast to vertebrates, the Sophophora UCEs are depleted around transcription start sites. Analysis of 52,954 P-element, piggyBac and Minos insertions in the D. melanogaster genome revealed depletion of the P-element and piggyBac insertions in and around the Sophophora UCEs. We examined eleven fly strains with transposon insertions into the intergenic UCEs and identified associated phenotypes in five strains. Four insertions behave as recessive lethals, and in one case we observed a suppression of the marker gene within the transgene, presumably by silenced chromatin around the integration site. To confirm the lethality is caused by integration of transposons we performed a phenotype rescue experiment for two stocks and demonstrated that the excision of the transposons from the intergenic UCEs restores viability. Sequencing of DNA after the transposon excision in one fly strain with the restored viability revealed a 47 bp insertion at the original transposon integration site suggesting that the nature of the mutation is important for the appearance of the phenotype. Our results suggest that the UCEs in flies and vertebrates have both common and distinct features, and demonstrate that a significant proportion of intergenic drosophila UCEs are sensitive to disruption.
Affiliation(s)
- Igor V. Makunin
  - Research Computing Centre, The University of Queensland, Brisbane, Queensland, Australia
  - Institute of Molecular and Cellular Biology SD RAS, Novosibirsk, Russia
- Viktor V. Shloma
  - Institute of Molecular and Cellular Biology SD RAS, Novosibirsk, Russia
- Stuart J. Stephen
  - Computational Biology Group, CSIRO Plant Industry, Canberra, Australian Capital Territory, Australia
- Michael Pheasant
  - Research Computing Centre, The University of Queensland, Brisbane, Queensland, Australia
|
31
|
Harmston N, Baresic A, Lenhard B. The mystery of extreme non-coding conservation. Philos Trans R Soc Lond B Biol Sci 2013; 368:20130021. [PMID: 24218634 PMCID: PMC3826495 DOI: 10.1098/rstb.2013.0021] [Citation(s) in RCA: 51] [Impact Index Per Article: 4.3] [Indexed: 12/12/2022]
Abstract
Regions of several dozen to several hundred base pairs of extreme conservation have been found in non-coding regions in all metazoan genomes. The distribution of these elements within and across genomes has suggested that many have roles as transcriptional regulatory elements in multi-cellular organization, differentiation and development. Currently, there is no known mechanism or function that would account for this level of conservation at the observed evolutionary distances. Previous studies have found that, while these regions are under strong purifying selection, and not mutational coldspots, deletion of entire regions in mice does not necessarily lead to identifiable changes in phenotype during development. These opposing findings lead to several questions regarding their functional importance and why they are under strong selection in the first place. In this perspective, we discuss the methods and techniques used in identifying and dissecting these regions, their observed patterns of conservation, and review the current hypotheses on their functional significance.
Affiliation(s)
- Nathan Harmston
  - Institute of Clinical Sciences, Faculty of Medicine, Imperial College London and MRC Clinical Sciences Centre, Hammersmith Hospital Campus, Du Cane Road, London W12 0NN, UK
|
32
|
Casa V, Gabellini D. A repetitive elements perspective in Polycomb epigenetics. Front Genet 2012; 3:199. [PMID: 23060903 PMCID: PMC3465993 DOI: 10.3389/fgene.2012.00199] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Received: 07/06/2012] [Accepted: 09/17/2012] [Indexed: 01/10/2023]
Abstract
Repetitive elements comprise over two-thirds of the human genome. For a long time, these elements have received little attention since they were considered non-functional. On the contrary, recent evidence indicates that they play central roles in genome integrity, gene expression, and disease. Indeed, repeats display meiotic instability associated with disease and are located within common fragile sites, which are hotspots of chromosome re-arrangements in tumors. Moreover, a variety of diseases have been associated with aberrant transcription of repetitive elements. Overall this indicates that appropriate regulation of repetitive elements' activity is fundamental. Polycomb group (PcG) proteins are epigenetic regulators that are essential for the normal development of multicellular organisms. Mammalian PcG proteins are involved in fundamental processes, such as cellular memory, cell proliferation, genomic imprinting, X-inactivation, and cancer development. PcG proteins can convey their activity through long-distance interactions also on different chromosomes. This indicates that the 3D organization of PcG proteins contributes significantly to their function. However, it is still unclear how these complex mechanisms are orchestrated and which role PcG proteins play in the multi-level organization of gene regulation. Intriguingly, the greatest proportion of Polycomb-mediated chromatin modifications is located in genomic repeats and it has been suggested that they could provide a binding platform for Polycomb proteins. Here, these lines of evidence are woven together to discuss how repetitive elements could contribute to chromatin organization in the 3D nuclear space.
Affiliation(s)
- Valentina Casa
  - Division of Regenerative Medicine, Stem Cells, and Gene Therapy, Dulbecco Telethon Institute and San Raffaele Scientific Institute Milano, Italy; Università Vita-Salute San Raffaele Milano, Italy
|
33
|
Sproul D, Kitchen RR, Nestor CE, Dixon JM, Sims AH, Harrison DJ, Ramsahoye BH, Meehan RR. Tissue of origin determines cancer-associated CpG island promoter hypermethylation patterns. Genome Biol 2012; 13:R84. [PMID: 23034185 PMCID: PMC3491412 DOI: 10.1186/gb-2012-13-10-r84] [Citation(s) in RCA: 123] [Impact Index Per Article: 9.5] [Received: 03/19/2012] [Revised: 07/13/2012] [Accepted: 10/03/2012] [Indexed: 12/31/2022]
Abstract
BACKGROUND: Aberrant CpG island promoter DNA hypermethylation is frequently observed in cancer and is believed to contribute to tumor progression by silencing the expression of tumor suppressor genes. Previously, we observed that promoter hypermethylation in breast cancer reflects cell lineage rather than tumor progression and occurs at genes that are already repressed in a lineage-specific manner. To investigate the generality of our observation we analyzed the methylation profiles of 1,154 cancers from 7 different tissue types.
RESULTS: We find that 1,009 genes are prone to hypermethylation in these 7 types of cancer. Nearly half of these genes varied in their susceptibility to hypermethylation between different cancer types. We show that the expression status of hypermethylation prone genes in the originator tissue determines their propensity to become hypermethylated in cancer; specifically, genes that are normally repressed in a tissue are prone to hypermethylation in cancers derived from that tissue. We also show that the promoter regions of hypermethylation-prone genes are depleted of repetitive elements and that DNA sequence around the same promoters is evolutionarily conserved. We propose that these two characteristics reflect tissue-specific gene promoter architecture regulating the expression of these hypermethylation prone genes in normal tissues.
CONCLUSIONS: As aberrantly hypermethylated genes are already repressed in pre-cancerous tissue, we suggest that their hypermethylation does not directly contribute to cancer development via silencing. Instead aberrant hypermethylation reflects developmental history and the perturbation of epigenetic mechanisms maintaining these repressed promoters in a hypomethylated state in normal cells.
Affiliation(s)
- Duncan Sproul
  - Breakthrough Breast Cancer Research Unit and Division of Pathology, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
- Robert R Kitchen
  - Breakthrough Breast Cancer Research Unit and Division of Pathology, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
  - Yale University School of Medicine, Department of Molecular Biophysics & Biochemistry and Department of Psychiatry, 266 Whitney Ave, New Haven, CT 06511, USA
- Colm E Nestor
  - Breakthrough Breast Cancer Research Unit and Division of Pathology, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
- J Michael Dixon
  - Breakthrough Breast Cancer Research Unit and Division of Pathology, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
- Andrew H Sims
  - Breakthrough Breast Cancer Research Unit and Division of Pathology, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
- David J Harrison
  - Breakthrough Breast Cancer Research Unit and Division of Pathology, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
  - University of St Andrews School of Medicine, Medical and Biological Sciences Building, University of St Andrews, North Haugh, St Andrews KY16 9TF, UK
- Bernard H Ramsahoye
  - Breakthrough Breast Cancer Research Unit and Division of Pathology, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
  - Centre for Molecular Medicine, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
- Richard R Meehan
  - Breakthrough Breast Cancer Research Unit and Division of Pathology, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
  - MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, UK
|
34
|
Accord insertion in the 5′ flanking region of CYP6G1 confers nicotine resistance in Drosophila melanogaster. Gene 2012; 502:1-8. [DOI: 10.1016/j.gene.2012.04.031] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 12/21/2011] [Revised: 04/06/2012] [Accepted: 04/11/2012] [Indexed: 11/19/2022]
|
35
|
Wang D, Su Y, Wang X, Lei H, Yu J. Transposon-derived and satellite-derived repetitive sequences play distinct functional roles in Mammalian intron size expansion. Evol Bioinform Online 2012; 8:301-19. [PMID: 22807622 PMCID: PMC3396637 DOI: 10.4137/ebo.s9758] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Indexed: 02/05/2023]
Abstract
Background: Repetitive sequences (RSs) are redundant, complex at times, and often lineage-specific, representing significant “building” materials for genes and genomes. According to their origins, sequence characteristics, and ways of propagation, repetitive sequences are divided into transposable elements (TEs) and satellite sequences (SSs) as well as related subfamilies and subgroups hierarchically. The combined changes attributable to the repetitive sequences alter gene and genome architectures, such as the expansion of exonic, intronic, and intergenic sequences, and most of them propagate in a seemingly random fashion and contribute very significantly to the entire mutation spectrum of mammalian genomes.
Principal findings: Our analysis is focused on evolutional features of TEs and SSs in the intronic sequence of twelve selected mammalian genomes. We divided them into four groups (primates, large mammals, rodents, and primary mammals) and used four non-mammalian vertebrate species as the out-group. After classifying intron size variation in an intron-centric way based on RS-dominance (TE-dominant or SS-dominant intron expansions), we observed several distinct profiles in intron length and positioning in different vertebrate lineages, such as retrotransposon-dominance in mammals and DNA transposon-dominance in the lower vertebrates, amphibians and fishes. The RS patterns of mouse and rat genes are most striking, which are not only distinct from those of other mammals but also different from that of the third rodent species analyzed in this study (guinea pig). Looking into the biological functions of relevant genes, we observed a two-dimensional divergence; in particular, genes that possess SS-dominant and/or RS-free introns are enriched in tissue-specific development and transcription regulation in all mammalian lineages. In addition, we found that the tendency of transposons in increasing intron size is much stronger than that of satellites, and the combined effect of both RSs is greater than either one of them alone in a simple arithmetic sum among the mammals and the opposite is found among the four non-mammalian vertebrates.
Conclusions: TE- and SS-derived RSs represent major mutational forces shaping the size and composition of vertebrate genes and genomes, and through natural selection they either fine-tune or facilitate changes in size expansion, position variation, and duplication, and thus in functions and evolutionary paths for better survival and fitness. When analyzed globally, not only are such changes significantly diversified but also comprehensible in lineages and biological implications.
Affiliation(s)
- Dapeng Wang
  - CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100029, P.R. China
|
36
|
Neguembor MV, Gabellini D. In junk we trust: repetitive DNA, epigenetics and facioscapulohumeral muscular dystrophy. Epigenomics 2012; 2:271-87. [PMID: 22121874 DOI: 10.2217/epi.10.8] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022]
Abstract
Facioscapulohumeral muscular dystrophy (FSHD) is an autosomal dominant myopathy with a peculiar etiology. Unlike most genetic disorders, FSHD is not caused by mutations in a protein-coding gene. Instead, it is associated with contraction of the D4Z4 macrosatellite repeat array located at 4q35. Interestingly, D4Z4 deletion is not sufficient per se to cause FSHD. Moreover, the disease severity, its rate of progression and the distribution of muscle weakness display great variability even among close family relatives. Hence, additional genetic and epigenetic events appear to be required for FSHD pathogenesis. Indeed, recent findings suggest that virtually all levels of epigenetic regulation, from DNA methylation to higher order chromosomal architecture, exhibit alterations in the disease locus causing deregulation of 4q35 gene expression, ultimately leading to FSHD.
Affiliation(s)
- Maria V Neguembor
  - International PhD Program in Cellular & Molecular Biology, Vita-Salute San Raffaele University, Milan, Italy
|
37
|
Zeng J, Kirk BD, Gou Y, Wang Q, Ma J. Genome-wide polycomb target gene prediction in Drosophila melanogaster. Nucleic Acids Res 2012; 40:5848-63. [PMID: 22416065 PMCID: PMC3401425 DOI: 10.1093/nar/gks209] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 01/12/2023]
Abstract
As key epigenetic regulators, polycomb group (PcG) proteins are responsible for the control of cell proliferation and differentiation as well as stem cell pluripotency and self-renewal. Aberrant epigenetic modification by PcG is strongly correlated with the severity and invasiveness of many types of cancers. Unfortunately, the molecular mechanism of PcG-mediated epigenetic regulation remained elusive, partly due to the extremely limited pool of experimentally confirmed PcG target genes. In order to facilitate experimental identification of PcG target genes, here we propose a novel computational method, EpiPredictor, that achieved significantly higher matching ratios with several recent chromatin immunoprecipitation studies than jPREdictor, an existing computational method. We further validated a subset of genes that were uniquely predicted by EpiPredictor by cross-referencing existing literature and by experimental means. Our data suggest that multiple transcription factor networking at the cis-regulatory elements is critical for PcG recruitment, while high GC content and high conservation level are also important features of PcG target genes. EpiPredictor should substantially expedite experimental discovery of PcG target genes by providing an effective initial screening tool. From a computational standpoint, our strategy of modelling transcription factor interaction with a non-linear kernel is original, effective and transferable to many other applications.
Affiliation(s)
- Jia Zeng
  - Verna and Marrs McLean Department of Biochemistry and Molecular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
|
38
|
Zhang Y, Mager DL. Gene properties and chromatin state influence the accumulation of transposable elements in genes. PLoS One 2012; 7:e30158. [PMID: 22272293 PMCID: PMC3260225 DOI: 10.1371/journal.pone.0030158] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 10/06/2011] [Accepted: 12/14/2011] [Indexed: 12/03/2022]
Abstract
Transposable elements (TEs) are mobile DNA sequences found in the genomes of almost all species. By measuring the normalized coverage of TE sequences within genes, we identified sets of genes with conserved extremes of high/low TE density in the genomes of human, mouse and cow and denoted them as ‘shared upper/lower outliers (SUOs/SLOs)’. By comparing these outlier genes to the genomic background, we show that a large proportion of SUOs are involved in metabolic pathways and tend to be mammal-specific, whereas many SLOs are related to developmental processes and have more ancient origins. Furthermore, the proportions of different types of TEs within human and mouse orthologous SUOs showed high similarity, even though most detectable TEs in these two genomes inserted after their divergence. Interestingly, our computational analysis of polymerase-II (Pol-II) occupancy at gene promoters in different mouse tissues showed that 60% of tissue-specific SUOs show strong Pol-II binding only in embryonic stem cells (ESCs), a proportion significantly higher than the genomic background (37%). In addition, our analysis of histone marks such as H3K4me3 and H3K27me3 in mouse ESCs also suggest a strong association between TE-rich genes and open-chromatin at promoters. Finally, two independent whole-transcriptome datasets show a positive association between TE density and gene expression level in ESCs. While this study focuses on genes with extreme TE densities, the above results clearly show that the probability of TE accumulation/fixation in mammalian genes is not random and is likely associated with different factors/gene properties and, most importantly, an association between the TE insertion/fixation rate and gene activity status in ES cells.
Affiliation(s)
- Ying Zhang
  - Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, British Columbia, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Dixie L. Mager
  - Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, British Columbia, Canada
  - Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
|
39
|
He C, Li Z, Chen P, Huang H, Hurst LD, Chen J. Young intragenic miRNAs are less coexpressed with host genes than old ones: implications of miRNA-host gene coevolution. Nucleic Acids Res 2012; 40:4002-12. [PMID: 22238379 PMCID: PMC3351155 DOI: 10.1093/nar/gkr1312] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.3] [Indexed: 12/19/2022]
Abstract
MicroRNAs (miRNAs) have emerged as key regulators of gene expression. Intragenic miRNAs account for ∼50% of mammalian miRNAs. Classic studies reported that they are usually coexpressed with host genes. Here, using genome-wide miRNA and gene expression profiles from five sample sets, we show that evolutionarily conserved (‘old’) intragenic miRNAs tend to be coexpressed with host genes, but non-conserved (‘young’) ones rarely do so. This result is robust: in all sample sets, the coexpression rate of young miRNAs is significantly lower than that of conserved ones even after controlling for abundance. As a result, although young miRNAs dominate in human genome, the majority of intragenic miRNAs that show coexpression with host genes are phylogenetically old ones. For younger miRNAs, extrapolation of their expression profiles from those of their host genes should be treated with caution. We propose a model to explain this phenomenon in which the majority of young miRNAs are unlikely to be coexpressed with host genes; however, for some fraction of young miRNAs coexpression with their host genes, initially imbued by chromatin level effects, is advantageous and these are the ones likely to embed into the system and evolve ever higher levels of coexpression, possibly by evolving piggybacking mechanisms.
Affiliation(s)
- Chunjiang He
- Department of Medicine, University of Chicago, Chicago, IL 60637, USA
|
40
|
Faircloth BC, McCormack JE, Crawford NG, Harvey MG, Brumfield RT, Glenn TC. Ultraconserved elements anchor thousands of genetic markers spanning multiple evolutionary timescales. Syst Biol 2012; 61:717-26. [PMID: 22232343 DOI: 10.1093/sysbio/sys004] [Citation(s) in RCA: 719] [Impact Index Per Article: 55.3] [Indexed: 01/17/2023]
Abstract
Although massively parallel sequencing has facilitated large-scale DNA sequencing, comparisons among distantly related species rely upon small portions of the genome that are easily aligned. Methods are needed to efficiently obtain comparable DNA fragments prior to massively parallel sequencing, particularly for biologists working with non-model organisms. We introduce a new class of molecular marker, anchored by ultraconserved genomic elements (UCEs), that universally enable target enrichment and sequencing of thousands of orthologous loci across species separated by hundreds of millions of years of evolution. Our analyses here focus on use of UCE markers in Amniota because UCEs and phylogenetic relationships are well-known in some amniotes. We perform an in silico experiment to demonstrate that sequence flanking 2030 UCEs contains information sufficient to enable unambiguous recovery of the established primate phylogeny. We extend this experiment by performing an in vitro enrichment of 2386 UCE-anchored loci from nine, non-model avian species. We then use alignments of 854 of these loci to unambiguously recover the established evolutionary relationships within and among three ancient bird lineages. Because many organismal lineages have UCEs, this type of genetic marker and the analytical framework we outline can be applied across the tree of life, potentially reshaping our understanding of phylogeny at many taxonomic levels.
Affiliation(s)
- Brant C Faircloth
- Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90095, USA.
|
41
|
Bire S, Rouleux-Bonnin F. Transposable elements as tools for reshaping the genome: it is a huge world after all! Methods Mol Biol 2012; 859:1-28. [PMID: 22367863 DOI: 10.1007/978-1-61779-603-6_1] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Indexed: 05/31/2023]
Abstract
Transposable elements (TEs) are discrete pieces of DNA that can move from one site to another within genomes and sometimes between genomes. They are found in all major branches of life. Because of their wide distribution and considerable diversity, they are a considerable source of genomic variation and, as such, they constitute powerful drivers of genome evolution. Moreover, it is becoming clear that the epigenetic regulation of certain genes is derived from defense mechanisms against the activity of ancestral transposable elements. TEs now tend to be viewed as natural molecular tools that can reshape the genome, which challenges the idea that TEs are natural tools used to answer biological questions. In the first part of this chapter, we review the classification and distribution of TEs, and look at how they have contributed to the structural and transcriptional reshaping of genomes. In the second part, we describe methodological innovations that have modified their contribution as molecular tools.
Affiliation(s)
- Solenne Bire
- GICC, UMR CNRS 6239, Université François Rabelais, UFR des Sciences et Techniques, Tours, France
|
42
|
McCormack JE, Faircloth BC, Crawford NG, Gowaty PA, Brumfield RT, Glenn TC. Ultraconserved elements are novel phylogenomic markers that resolve placental mammal phylogeny when combined with species-tree analysis. Genome Res 2011; 22:746-54. [PMID: 22207614 DOI: 10.1101/gr.125864.111] [Citation(s) in RCA: 267] [Impact Index Per Article: 19.1] [Indexed: 01/30/2023]
Abstract
Phylogenomics offers the potential to fully resolve the Tree of Life, but increasing genomic coverage also reveals conflicting evolutionary histories among genes, demanding new analytical strategies for elucidating a single history of life. Here, we outline a phylogenomic approach using a novel class of phylogenetic markers derived from ultraconserved elements and flanking DNA. Using species-tree analysis that accounts for discord among hundreds of independent loci, we show that this class of marker is useful for recovering deep-level phylogeny in placental mammals. In broad outline, our phylogeny agrees with recent phylogenomic studies of mammals, including several formerly controversial relationships. Our results also inform two outstanding questions in placental mammal phylogeny involving rapid speciation, where species-tree methods are particularly needed. Contrary to most phylogenomic studies, our study supports a first-diverging placental mammal lineage that includes elephants and tenrecs (Afrotheria). The level of conflict among gene histories is consistent with this basal divergence occurring in or near a phylogenetic "anomaly zone" where a failure to account for coalescent stochasticity will mislead phylogenetic inference. Addressing a long-standing phylogenetic mystery, we find some support from a high genomic coverage data set for a traditional placement of bats (Chiroptera) sister to a clade containing Perissodactyla, Cetartiodactyla, and Carnivora, and not nested within the latter clade, as has been suggested recently, although other results were conflicting. One of the most remarkable findings of our study is that ultraconserved elements and their flanking DNA are a rich source of phylogenetic information with strong potential for application across Amniotes.
Affiliation(s)
- John E McCormack
- Museum of Natural Science, Louisiana State University, Baton Rouge, Louisiana 70803, USA.
|
43
|
Zhang Y, Romanish MT, Mager DL. Distributions of transposable elements reveal hazardous zones in mammalian introns. PLoS Comput Biol 2011; 7:e1002046. [PMID: 21573203 PMCID: PMC3088655 DOI: 10.1371/journal.pcbi.1002046] [Citation(s) in RCA: 69] [Impact Index Per Article: 4.9] [Received: 12/16/2010] [Accepted: 03/25/2011] [Indexed: 11/20/2022]
Abstract
Comprising nearly half of the human and mouse genomes, transposable elements (TEs) are found within most genes. Although the vast majority of TEs in introns are fixed in the species and presumably exert no significant effects on the enclosing gene, some markedly perturb transcription and result in disease or a mutated phenotype. Factors determining the likelihood that an intronic TE will affect transcription are not clear. In this study, we examined intronic TE distributions in both human and mouse and found several factors that likely contribute to whether a particular TE can influence gene transcription. Specifically, we observed that TEs near exons are greatly underrepresented compared to random distributions, but the size of these “underrepresentation zones” differs between TE classes. Compared to elsewhere in introns, TEs within these zones are shorter on average and show stronger orientation biases. Moreover, TEs in extremely close proximity (<20 bp) to exons show a strong bias to be near splice-donor sites. Interestingly, disease-causing intronic TE insertions show the opposite distributional trends, and by examining expressed sequence tag (EST) databases, we found that the proportion of TEs contributing to chimeric TE-gene transcripts is significantly higher within their underrepresentation zones. In addition, an analysis of predicted splice sites within human long terminal repeat (LTR) elements showed a significantly lower total number and weaker strength for intronic LTRs near exons. Based on these factors, we selectively examined a list of polymorphic mouse LTR elements in introns and showed clear evidence of transcriptional disruption by LTR element insertions in the Trpc6 and Kcnh6 genes. Taken together, these studies lend insight into the potential selective forces that have shaped intronic TE distributions and enable identification of TEs most likely to exert transcriptional effects on genes. 
Sequences derived from transposable elements (TEs) are major constituents of mammalian genomes and are found within introns of most genes. While nearly all TEs within introns appear harmless, some de novo intronic TE insertions do disrupt gene transcription and splicing and cause disease. It is unclear why some intronic TEs perturb gene transcription whereas most do not. Here, we examined intronic TE distributions in both human and mouse genes to gain insight into which TEs may be more likely to affect transcription. We found evidence that TEs near exons are likely subject to strong negative selection but the size of the region under selection or “underrepresentation zone” differs for different TE classes. Strikingly, all reported human disease-causing intronic TE insertions fall within these underrepresentation zones, and the proportion of TEs contributing to chimeric TE-gene transcripts is significantly higher when TEs are located in these zones. We also examined insertionally polymorphic mouse TEs located within underrepresentation zones and found evidence of transcriptional disruption in two genes. Given the growing appreciation for ongoing activity of TEs in human, our results should be of value in prioritizing insertionally polymorphic TEs for study of their potential contributions to gene expression differences and phenotypic variability.
Affiliation(s)
- Ying Zhang
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, British Columbia, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Mark T. Romanish
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, British Columbia, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
- Dixie L. Mager
- Terry Fox Laboratory, British Columbia Cancer Agency, Vancouver, British Columbia, Canada
- Department of Medical Genetics, University of British Columbia, Vancouver, British Columbia, Canada
|
44
|
The Drosophila gene disruption project: progress using transposons with distinctive site specificities. Genetics 2011; 188:731-43. [PMID: 21515576 DOI: 10.1534/genetics.111.126995] [Citation(s) in RCA: 277] [Impact Index Per Article: 19.8] [Indexed: 12/19/2022]
Abstract
The Drosophila Gene Disruption Project (GDP) has created a public collection of mutant strains containing single transposon insertions associated with different genes. These strains often disrupt gene function directly, allow production of new alleles, and have many other applications for analyzing gene function. Here we describe the addition of ∼7600 new strains, which were selected from >140,000 additional P or piggyBac element integrations and 12,500 newly generated insertions of the Minos transposon. These additions nearly double the size of the collection and increase the number of tagged genes to at least 9440, approximately two-thirds of all annotated protein-coding genes. We also compare the site specificity of the three major transposons used in the project. All three elements insert only rarely within many Polycomb-regulated regions, a property that may contribute to the origin of "transposon-free regions" (TFRs) in metazoan genomes. Within other genomic regions, Minos transposes essentially at random, whereas P or piggyBac elements display distinctive hotspots and coldspots. P elements, as previously shown, have a strong preference for promoters. In contrast, piggyBac site selectivity suggests that it has evolved to reduce deleterious and increase adaptive changes in host gene expression. The propensity of Minos to integrate broadly makes possible a hybrid finishing strategy for the project that will bring >95% of Drosophila genes under experimental control within their native genomic contexts.
|
45
|
Jjingo D, Huda A, Gundapuneni M, Mariño-Ramírez L, Jordan IK. Effect of the transposable element environment of human genes on gene length and expression. Genome Biol Evol 2011; 3:259-71. [PMID: 21362639 PMCID: PMC3070429 DOI: 10.1093/gbe/evr015] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Indexed: 11/17/2022]
Abstract
Independent lines of investigation have documented effects of both transposable elements (TEs) and gene length (GL) on gene expression. However, TE gene fractions are highly correlated with GL, suggesting that they cannot be considered independently. We evaluated the TE environment of human genes and GL jointly in an attempt to tease apart their relative effects. TE gene fractions and GL were compared with the overall level of gene expression and the breadth of expression across tissues. GL is strongly correlated with overall expression level but weakly correlated with the breadth of expression, confirming the selection hypothesis that attributes the compactness of highly expressed genes to selection for economy of transcription. However, TE gene fractions overall, and for the L1 family in particular, show stronger anticorrelations with expression level than GL, indicating that GL may not be the most important target of selection for transcriptional economy. These results suggest a specific mechanism, removal of TEs, by which highly expressed genes are selectively tuned for efficiency. MIR elements are the only family of TEs with gene fractions that show a positive correlation with tissue-specific expression, suggesting that they may provide regulatory sequences that help to control human gene expression. Consistent with this notion, MIR fractions are relatively enriched close to transcription start sites and associated with coexpression in specific sets of related tissues. Our results confirm the overall relevance of the TE environment to gene expression and point to distinct mechanisms by which different TE families may contribute to gene regulation.
Affiliation(s)
- Daudi Jjingo
- School of Biology, Georgia Institute of Technology, GA, USA
|
46
|
Estécio MR, Gallegos J, Vallot C, Castoro RJ, Chung W, Maegawa S, Oki Y, Kondo Y, Jelinek J, Shen L, Hartung H, Aplan PD, Czerniak BA, Liang S, Issa JPJ. Genome architecture marked by retrotransposons modulates predisposition to DNA methylation in cancer. Genome Res 2010; 20:1369-82. [DOI: 10.1101/gr.107318.110] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.7] [Indexed: 12/31/2022]
Abstract
Epigenetic silencing plays an important role in cancer development. An attractive hypothesis is that local DNA features may participate in differential predisposition to gene hypermethylation. We found that, compared with methylation-resistant genes, methylation-prone genes have a lower frequency of SINE and LINE retrotransposons near their transcription start site. In several large testing sets, this distribution was highly predictive of promoter methylation. Genome-wide analysis showed that 22% of human genes were predicted to be methylation-prone in cancer; these tended to be genes that are down-regulated in cancer and that function in developmental processes. Moreover, retrotransposon distribution marks a larger fraction of methylation-prone genes compared to Polycomb group protein (PcG) marking in embryonic stem cells; indeed, PcG marking and our predictive model based on retrotransposon frequency appear to be correlated but also complementary. In summary, our data indicate that retrotransposon elements, which are widespread in our genome, are strongly associated with gene promoter DNA methylation in cancer and may in fact play a role in influencing epigenetic regulation in normal and abnormal physiological states.
|
47
|
Mortada H, Vieira C, Lerat E. Genes devoid of full-length transposable element insertions are involved in development and in the regulation of transcription in human and closely related species. J Mol Evol 2010; 71:180-91. [PMID: 20798934 DOI: 10.1007/s00239-010-9376-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 03/01/2010] [Accepted: 07/26/2010] [Indexed: 02/04/2023]
Abstract
Transposable elements (TEs) are major components of mammalian genomes, and their impact on genome evolution is now well established. In recent years several findings have shown that they are associated with the expression level and function of genes. In this study, we analyze the relationships between human genes and full-length TE copies in terms of three factors (gene function, expression level, and selective pressure). We classified human genes according to their TE density, and found that TE-free genes are involved in important functions such as development, transcription, and the regulation of transcription, whereas TE-rich genes are involved in functions such as transport and metabolism. This trend is conserved through evolution. We show that this could be explained by a stronger selection pressure acting on both the coding and non-coding regions of TE-free genes than on those of TE-rich genes. The higher level of expression found for TE-rich genes in tumor and immune system tissues suggests that TEs play an important role in gene regulation.
|
48
|
Babenko VN, Makunin IV, Brusentsova IV, Belyaeva ES, Maksimov DA, Belyakin SN, Maroy P, Vasil'eva LA, Zhimulev IF. Paucity and preferential suppression of transgenes in late replication domains of the D. melanogaster genome. BMC Genomics 2010; 11:318. [PMID: 20492674 PMCID: PMC2887417 DOI: 10.1186/1471-2164-11-318] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 02/12/2010] [Accepted: 05/21/2010] [Indexed: 01/17/2023]
Abstract
Background: Eukaryotic genomes are organized in extended domains with distinct features intimately linking genome structure, replication pattern and chromatin state. Recently we identified a set of long late replicating euchromatic regions that are underreplicated in salivary gland polytene chromosomes of D. melanogaster. Results: Here we demonstrate that these underreplicated regions (URs) have a low density of P-element and piggyBac insertions compared to the genome average or neighboring regions. In contrast, Minos-based transposons show no paucity in URs but have a strong bias to testis-specific genes. We estimated the suppression level in 2,852 stocks carrying a single P-element by analysis of eye color determined by the mini-white marker gene and demonstrate that the proportion of suppressed transgenes in URs is more than three times higher than in the flanking regions or the genomic average. The suppressed transgenes reside in intergenic, genic or promoter regions of the annotated genes. We speculate that the low insertion frequency of P-elements and piggyBacs in URs partially results from suppression of transgenes that potentially could prevent identification of transgenes due to complete suppression of the marker gene. In a similar manner, the proportion of suppressed transgenes is higher in loci replicating late or very late in Kc cells and these loci have a lower density of P-elements and piggyBac insertions. In transgenes with two marker genes suppression of mini-white gene in eye coincides with suppression of yellow gene in bristles. Conclusions: Our results suggest that the late replication domains have a high inactivation potential apparently linked to the silenced or closed chromatin state in these regions, and that such inactivation potential is largely maintained in different tissues.
Affiliation(s)
- Vladimir N Babenko
- Department of Molecular and Cellular Biology, Institute of Chemical Biology and Fundamental Medicine SB RAS, Novosibirsk, 630090, Russia
|
49
|
Kvikstad EM, Makova KD. The (r)evolution of SINE versus LINE distributions in primate genomes: sex chromosomes are important. Genome Res 2010; 20:600-13. [PMID: 20219940 DOI: 10.1101/gr.099044.109] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Indexed: 11/25/2022]
Abstract
The densities of transposable elements (TEs) in the human genome display substantial variation both within individual chromosomes and among chromosome types (autosomes and the two sex chromosomes). Finding an explanation for this variability has been challenging, especially in light of genome landscapes unique to the sex chromosomes. Here, using a multiple regression framework, we investigate primate Alu and L1 densities shaped by regional genome features and location on a particular chromosome type. As a result of our analysis, first, we build statistical models explaining up to 79% and 44% of variation in Alu and L1 element density, respectively. Second, we analyze sex chromosome versus autosome TE densities corrected for regional genomic effects. We discover that sex-chromosome bias in Alu and L1 distributions not only persists after accounting for these effects, but even presents differences in patterns, confirming preferential Alu integration in the male germline, yet likely integration of L1s in both male and female germlines or in early embryogenesis. Additionally, our models reveal that local base composition (measured by GC content and density of L1 target sites) and natural selection (inferred via density of most conserved elements) are significant to predicting densities of L1s. Interestingly, measurements of local double-stranded breaks (a 13-mer associated with genome instability) strongly correlate with densities of Alu elements; little evidence was found for the role of recombination-driven deletion in driving TE distributions over evolutionary time. Thus, Alu and L1 densities have been influenced by the combination of distinct local genome landscapes and the unique evolutionary dynamics of sex chromosomes.
Affiliation(s)
- Erika M Kvikstad
- Center for Comparative Genomics and Bioinformatics, Penn State University, University Park, Pennsylvania 16802, USA.
|
50
|
Alonso A, Hasson D, Cheung F, Warburton PE. A paucity of heterochromatin at functional human neocentromeres. Epigenetics Chromatin 2010; 3:6. [PMID: 20210998 PMCID: PMC2845132 DOI: 10.1186/1756-8935-3-6] [Citation(s) in RCA: 82] [Impact Index Per Article: 5.5] [Received: 09/18/2009] [Accepted: 03/08/2010] [Indexed: 12/29/2022]
Abstract
Background: Centromeres are responsible for the proper segregation of replicated chromatids during cell division. Neocentromeres are fully functional ectopic human centromeres that form on low-copy DNA sequences and permit analysis of centromere structure in relation to the underlying DNA sequence. Such structural analysis is not possible at endogenous centromeres because of the large amounts of repetitive alpha satellite DNA present. Results: High-resolution chromatin immunoprecipitation (ChIP) on CHIP (microarray) analysis of three independent neocentromeres from chromosome 13q revealed that each neocentromere contained ~100 kb of centromere protein (CENP)-A in a two-domain organization. Additional CENP-A domains were observed in the vicinity of neocentromeres, coinciding with CpG islands at the 5' end of genes. Analysis of histone H3 dimethylated at lysine 4 (H3K4me2) revealed small domains at each neocentromere. However, these domains of H3K4me2 were also found in the equivalent non-neocentric chromosomes. A surprisingly minimal (~15 kb) heterochromatin domain was observed at one of the neocentromeres, which formed in an unusual transposon-free region distal to the CENP-A domains. Another neocentromere showed a distinct absence of nearby significant domains of heterochromatin. A subtle defect in centromere cohesion detected at these neocentromeres may be due to the paucity of heterochromatin domains. Conclusions: This high-resolution mapping suggests that H3K4me2 does not seem sufficiently abundant to play a structural role at neocentromeres, as proposed for endogenous centromeres. Large domains of heterochromatin also do not appear necessary for centromere function. Thus, this study provides important insight into the structural requirements of human centromere function.
Affiliation(s)
- Alicia Alonso
- Department of Genetics and Genomic Sciences, Mount Sinai School of Medicine, New York, NY 10029, USA
|