1
Gozashti L, Hartl DL, Corbett-Detig R. Universal signatures of transposable element compartmentalization across eukaryotic genomes. bioRxiv 2024:2023.10.17.562820. [PMID: 38585780 PMCID: PMC10996525 DOI: 10.1101/2023.10.17.562820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/09/2024]
Abstract
The evolutionary mechanisms that drive the emergence of genome architecture remain poorly understood but can now be assessed with unprecedented power due to the massive accumulation of genome assemblies spanning phylogenetic diversity [1,2]. Transposable elements (TEs) are a rich source of large-effect mutations since they directly and indirectly drive genomic structural variation and changes in gene expression [3]. Here, we demonstrate universal patterns of TE compartmentalization across eukaryotic genomes spanning ~1.7 billion years of evolution, in which TEs colocalize with gene families under strong predicted selective pressure for dynamic evolution and involved in specific functions. For non-pathogenic species these genes represent families involved in defense, sensory perception and environmental interaction, whereas for pathogenic species, TE-compartmentalized genes are highly enriched for pathogenic functions. Many TE-compartmentalized gene families display signatures of positive selection at the molecular level. Furthermore, TE-compartmentalized genes exhibit an excess of high-frequency alleles for polymorphic TE insertions in fruit fly populations. We postulate that these patterns reflect selection for adaptive TE insertions as well as TE-associated structural variants. This process may drive the emergence of a shared TE-compartmentalized genome architecture across diverse eukaryotic lineages.
Affiliation(s)
- Landen Gozashti
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
  - Museum of Comparative Zoology, Harvard University, Cambridge, MA, USA
- Daniel L. Hartl
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA, USA
- Russell Corbett-Detig
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA, USA
  - UC Santa Cruz Genomics Institute, University of California Santa Cruz, Santa Cruz, CA, USA
2
Lawson HA, Liang Y, Wang T. Transposable elements in mammalian chromatin organization. Nat Rev Genet 2023; 24:712-723. [PMID: 37286742 DOI: 10.1038/s41576-023-00609-6] [Citation(s) in RCA: 32] [Impact Index Per Article: 16.0] [Accepted: 04/24/2023] [Indexed: 06/09/2023]
Abstract
Transposable elements (TEs) are mobile DNA elements that comprise almost 50% of mammalian genomic sequence. TEs are capable of making additional copies of themselves that integrate into new positions in host genomes. This unique property has had an important impact on mammalian genome evolution and on the regulation of gene expression because TE-derived sequences can function as cis-regulatory elements such as enhancers, promoters and silencers. Now, advances in our ability to identify and characterize TEs have revealed that TE-derived sequences also regulate gene expression by both maintaining and shaping 3D genome architecture. Studies are revealing how TEs contribute raw sequence that can give rise to the structures that shape chromatin organization, and thus gene expression, allowing for species-specific genome innovation and evolutionary novelty.
Affiliation(s)
- Heather A Lawson
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO, USA
- Yonghao Liang
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO, USA
  - Center for Genome Sciences and Systems Biology, Washington University School of Medicine, Saint Louis, MO, USA
- Ting Wang
  - Department of Genetics, Washington University School of Medicine, Saint Louis, MO, USA
  - Center for Genome Sciences and Systems Biology, Washington University School of Medicine, Saint Louis, MO, USA
  - McDonnell Genome Institute, Washington University School of Medicine, Saint Louis, MO, USA
3
Paulat NS, McGuire E, Subramanian K, Osmanski AB, Moreno-Santillán DD, Ray DA, Xing J. Transposable Elements in Bats Show Differential Accumulation Patterns Determined by Class and Functionality. Life (Basel) 2022; 12:1190. [PMID: 36013369 PMCID: PMC9409754 DOI: 10.3390/life12081190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2022] [Revised: 08/01/2022] [Accepted: 08/02/2022] [Indexed: 11/16/2022] Open
Abstract
Bat genomes are characterized by a diverse transposable element (TE) repertoire. In particular, the genomes of members of the family Vespertilionidae contain both active retrotransposons and active DNA transposons. Each TE type is characterized by a distinct pattern of accumulation over the past ~40 million years. Each also exhibits its own target site preferences (sometimes shared with other TEs) that impact where they are likely to insert when mobilizing. Therefore, bats provide a great resource for understanding the diversity of TE insertion patterns. To gain insight into how these diverse TEs impact genome structure, we performed comparative spatial analyses between different TE classes and genomic features, including genic regions and CpG islands. Our results showed a depletion of all TEs in the coding sequence and revealed patterns of species- and element-specific attraction in the transcript. Trends of attraction in the distance tests also suggested significant TE activity in regions adjacent to genes. In particular, the enrichment of small, non-autonomous TE insertions in introns and near coding regions supports the hypothesis that the genomic distribution of TEs is the product of a balance of the TE insertion preference in open chromatin regions and the purifying selection against TEs within genes.
Affiliation(s)
- Nicole S. Paulat
  - Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
- Erin McGuire
  - Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Krishnamurthy Subramanian
  - Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
  - Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
- Austin B. Osmanski
  - Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
- David A. Ray
  - Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA
- Jinchuan Xing
  - Department of Genetics, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
  - Human Genetics Institute of New Jersey, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
4
Ceraulo S, Perelman PL, Dumas F. Massive LINE‐1 retrotransposon enrichment in tamarins of the Cebidae family (Platyrrhini, Primates) and its significance for genome evolution. J Zool Syst Evol Res 2021. [DOI: 10.1111/jzs.12536] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Affiliation(s)
- Simona Ceraulo
  - Department of “Scienze e Tecnologie Biologiche, Chimiche e Farmaceutiche (STEBICEF)”, University of Palermo, Palermo, Italy
- Francesca Dumas
  - Department of “Scienze e Tecnologie Biologiche, Chimiche e Farmaceutiche (STEBICEF)”, University of Palermo, Palermo, Italy
5
Wang Y, Shen D, Ullah N, Diaby M, Gao B, Song C. Characterization and expression pattern of ZB and PS transposons in zebrafish. Gene Expr Patterns 2021; 42:119203. [PMID: 34481069 DOI: 10.1016/j.gep.2021.119203] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/31/2021] [Revised: 08/25/2021] [Accepted: 08/28/2021] [Indexed: 11/28/2022]
Abstract
Despite comprising much of the genome, transposons were once thought of as junk. However, transposons play many roles in the eukaryotic genome, such as providing new proteins as domesticated genes, being expressed during germline-soma differentiation, and functioning in DNA rearrangement in the offspring. We sought to describe the distribution and structural organization of the two autonomous transposons (ZB and PS) in the zebrafish genome and examine their expression patterns in embryos and adult tissues. Intact copies of ZB and PS were queried by BLAST on NCBI and ENSEMBL using default parameters. Copies with more than 90% coverage and identity were downloaded for structural analysis. Spatial and temporal expression patterns were detected by qRT-PCR and whole-mount in situ hybridization (WISH). We detected 19 intact copies of ZB, each encoding 341 amino acid residues with a DD34E catalytic domain and flanked by 201 bp TIRs, and seven intact copies of PS, each encoding 425 amino acid residues with a DD35D catalytic domain and flanked by 28 bp TIRs, in the zebrafish genome. Analysis of genomic insertions indicated that both ZB and PS transposons are prone to be retained in the intron and intergenic regions of the zebrafish genome. The sense and antisense transcripts of ZB and PS were detected during embryonic development stages and exhibited similar expression patterns. The difference is that the sense strand transcript of ZB was explicitly expressed in the midbrain-hindbrain boundary (MHB), the otic vesicle (OV), and the pharyngeal arches and pharyngeal pouches (PA&PP) at 48 hpf. In adult zebrafish, the expression of ZB and PS in muscle and brain is much higher than in other tissues. Our results indicate that ZB and PS transposons may be involved in embryonic development and in the regulation of somatic cells of certain adult tissues, such as the brain and muscle.
Affiliation(s)
- Yali Wang
  - College of Animal Science and Technology, Yangzhou University, Yangzhou, Jiangsu, 225009, China
- Dan Shen
  - College of Animal Science and Technology, Yangzhou University, Yangzhou, Jiangsu, 225009, China
- Numan Ullah
  - College of Animal Science and Technology, Yangzhou University, Yangzhou, Jiangsu, 225009, China
- Mohamed Diaby
  - College of Animal Science and Technology, Yangzhou University, Yangzhou, Jiangsu, 225009, China
- Bo Gao
  - College of Animal Science and Technology, Yangzhou University, Yangzhou, Jiangsu, 225009, China
- Chengyi Song
  - College of Animal Science and Technology, Yangzhou University, Yangzhou, Jiangsu, 225009, China
6
Chen D, Cremona MA, Qi Z, Mitra RD, Chiaromonte F, Makova KD. Human L1 Transposition Dynamics Unraveled with Functional Data Analysis. Mol Biol Evol 2021; 37:3576-3600. [PMID: 32722770 DOI: 10.1093/molbev/msaa194] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/14/2022] Open
Abstract
Long INterspersed Elements-1 (L1s) constitute >17% of the human genome and still actively transpose in it. Characterizing L1 transposition across the genome is critical for understanding genome evolution and somatic mutations. However, to date, L1 insertion and fixation patterns have not been studied comprehensively. To fill this gap, we investigated three genome-wide data sets of L1s that integrated at different evolutionary times: 17,037 de novo L1s (from an L1 insertion cell-line experiment conducted in-house), and 1,212 polymorphic and 1,205 human-specific L1s (from public databases). We characterized 49 genomic features (proxying chromatin accessibility, transcriptional activity, replication, recombination, etc.) in the ±50 kb flanks of these elements. These features were contrasted between the three L1 data sets and L1-free regions using state-of-the-art Functional Data Analysis statistical methods, which treat high-resolution data as mathematical functions. Our results indicate that de novo, polymorphic, and human-specific L1s are surrounded by different genomic features acting at specific locations and scales. This led to an integrative model of L1 transposition, according to which L1s preferentially integrate into open-chromatin regions enriched in non-B DNA motifs, whereas they are fixed in regions largely free of purifying selection (depleted of genes and noncoding most conserved elements). Intriguingly, our results suggest that L1 insertions modify local genomic landscape by extending CpG methylation and increasing mononucleotide microsatellite density. Altogether, our findings substantially facilitate understanding of L1 integration and fixation preferences, pave the way for uncovering their role in aging and cancer, and inform their use as mutagenesis tools in genetic studies.
Affiliation(s)
- Di Chen
  - Intercollege Graduate Degree Program in Genetics, The Huck Institutes of the Life Sciences, The Pennsylvania State University, University Park, PA
- Marzia A Cremona
  - Department of Statistics, The Pennsylvania State University, University Park, PA
  - Department of Operations and Decision Systems, Université Laval, Québec, Canada
- Zongtai Qi
  - Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO
- Robi D Mitra
  - Department of Genetics and Center for Genome Sciences and Systems Biology, Washington University School of Medicine, St. Louis, MO
- Francesca Chiaromonte
  - Department of Statistics, The Pennsylvania State University, University Park, PA
  - EMbeDS, Sant'Anna School of Advanced Studies, Pisa, Italy
  - The Huck Institutes of the Life Sciences, Center for Medical Genomics, The Pennsylvania State University, University Park, PA
- Kateryna D Makova
  - The Huck Institutes of the Life Sciences, Center for Medical Genomics, The Pennsylvania State University, University Park, PA
  - Department of Biology, The Pennsylvania State University, University Park, PA
7
Ou S, Su W, Liao Y, Chougule K, Agda JRA, Hellinga AJ, Lugo CSB, Elliott TA, Ware D, Peterson T, Jiang N, Hirsch CN, Hufford MB. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol 2019. [PMID: 31843001 DOI: 10.1101/657890v1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 05/11/2023] Open
Abstract
BACKGROUND Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and provide an opportunity for comprehensive annotation of TEs. Numerous methods exist for annotation of each class of TEs, but their relative performances have not been systematically compared. Moreover, a comprehensive pipeline is needed to produce a non-redundant library of TEs for species lacking this resource to generate whole-genome TE annotations. RESULTS We benchmark existing programs based on a carefully curated library of rice TEs. We evaluate the performance of methods annotating long terminal repeat (LTR) retrotransposons, terminal inverted repeat (TIR) transposons, short TIR transposons known as miniature inverted transposable elements (MITEs), and Helitrons. Performance metrics include sensitivity, specificity, accuracy, precision, FDR, and F1. Using the most robust programs, we create a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces a filtered non-redundant TE library for annotation of structurally intact and fragmented elements. EDTA also deconvolutes nested TE insertions frequently found in highly repetitive genomic regions. Using other model species with curated TE libraries (maize and Drosophila), EDTA is shown to be robust across both plant and animal species. CONCLUSIONS The benchmarking results and pipeline developed here will greatly facilitate TE annotation in eukaryotic genomes. These annotations will promote a much more in-depth understanding of the diversity and evolution of TEs at both intra- and inter-species levels. EDTA is open-source and freely available: https://github.com/oushujun/EDTA.
Affiliation(s)
- Shujun Ou
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
- Weijia Su
  - Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, 50011, USA
- Yi Liao
  - Department of Ecology and Evolutionary Biology, University of California, Irvine, CA, 92697, USA
- Kapeel Chougule
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- Jireh R A Agda
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario, N1G 2W1, Canada
- Adam J Hellinga
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario, N1G 2W1, Canada
- Tyler A Elliott
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario, N1G 2W1, Canada
- Doreen Ware
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
  - USDA-ARS NEA Robert W. Holley Center for Agriculture and Health, Cornell University, Ithaca, NY, 14853, USA
- Thomas Peterson
  - Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA, 50011, USA
- Ning Jiang
  - Department of Horticulture, Michigan State University, East Lansing, MI, 48824, USA
- Candice N Hirsch
  - Department of Agronomy and Plant Genetics, University of Minnesota, Saint Paul, MN, 55108, USA
- Matthew B Hufford
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA, 50011, USA
8
Ou S, Su W, Liao Y, Chougule K, Agda JRA, Hellinga AJ, Lugo CSB, Elliott TA, Ware D, Peterson T, Jiang N, Hirsch CN, Hufford MB. Benchmarking transposable element annotation methods for creation of a streamlined, comprehensive pipeline. Genome Biol 2019; 20:275. [PMID: 31843001 PMCID: PMC6913007 DOI: 10.1186/s13059-019-1905-y] [Citation(s) in RCA: 580] [Impact Index Per Article: 96.7] [Received: 05/24/2019] [Accepted: 11/28/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Sequencing technology and assembly algorithms have matured to the point that high-quality de novo assembly is possible for large, repetitive genomes. Current assemblies traverse transposable elements (TEs) and provide an opportunity for comprehensive annotation of TEs. Numerous methods exist for annotation of each class of TEs, but their relative performances have not been systematically compared. Moreover, a comprehensive pipeline is needed to produce a non-redundant library of TEs for species lacking this resource to generate whole-genome TE annotations. RESULTS We benchmark existing programs based on a carefully curated library of rice TEs. We evaluate the performance of methods annotating long terminal repeat (LTR) retrotransposons, terminal inverted repeat (TIR) transposons, short TIR transposons known as miniature inverted transposable elements (MITEs), and Helitrons. Performance metrics include sensitivity, specificity, accuracy, precision, FDR, and F1. Using the most robust programs, we create a comprehensive pipeline called Extensive de-novo TE Annotator (EDTA) that produces a filtered non-redundant TE library for annotation of structurally intact and fragmented elements. EDTA also deconvolutes nested TE insertions frequently found in highly repetitive genomic regions. Using other model species with curated TE libraries (maize and Drosophila), EDTA is shown to be robust across both plant and animal species. CONCLUSIONS The benchmarking results and pipeline developed here will greatly facilitate TE annotation in eukaryotic genomes. These annotations will promote a much more in-depth understanding of the diversity and evolution of TEs at both intra- and inter-species levels. EDTA is open-source and freely available: https://github.com/oushujun/EDTA.
Affiliation(s)
- Shujun Ou
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011 USA
- Weijia Su
  - Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA 50011 USA
- Yi Liao
  - Department of Ecology and Evolutionary Biology, University of California, Irvine, CA 92697 USA
- Kapeel Chougule
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724 USA
- Jireh R. A. Agda
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario N1G 2W1 Canada
- Adam J. Hellinga
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario N1G 2W1 Canada
- Tyler A. Elliott
  - Centre for Biodiversity Genomics, University of Guelph, Guelph, Ontario N1G 2W1 Canada
- Doreen Ware
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724 USA
  - USDA-ARS NEA Robert W. Holley Center for Agriculture and Health, Cornell University, Ithaca, NY 14853 USA
- Thomas Peterson
  - Department of Genetics, Development, and Cell Biology, Iowa State University, Ames, IA 50011 USA
- Ning Jiang
  - Department of Horticulture, Michigan State University, East Lansing, MI 48824 USA
- Candice N. Hirsch
  - Department of Agronomy and Plant Genetics, University of Minnesota, Saint Paul, MN 55108 USA
- Matthew B. Hufford
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, IA 50011 USA
9
Evolution and diversity of transposable elements in fish genomes. Sci Rep 2019; 9:15399. [PMID: 31659260 PMCID: PMC6817897 DOI: 10.1038/s41598-019-51888-1] [Citation(s) in RCA: 61] [Impact Index Per Article: 10.2] [Received: 10/15/2018] [Accepted: 10/09/2019] [Indexed: 12/22/2022] Open
Abstract
Transposable elements (TEs) are genomic sequences that can move, multiply, and often form sizable fractions of vertebrate genomes. Fish are a unique group of vertebrates: their karyotypes and genome sizes are more diverse and complex, and their TEs probably show higher diversity and evolutionary specificity. To investigate the characteristics of fish TEs, we compared the mobilomes of 39 species and observed significant variation in TE content (from 5% in pufferfish to 56% in zebrafish), along with a positive correlation between fish genome size and TE content. At different levels of the classification hierarchy, retrotransposons (class), long terminal repeats (order), and Helitron, Maverick, Kolobok, CMC, DIRS, P, I, L1, L2, and 5S (superfamily) were all positively correlated with fish genome size. Consistent with previous studies, our data suggest that fish genomes are not always dominated by DNA transposons; long interspersed nuclear elements are also prominent in many species. This study suggests that the distribution of CR1 in fish genomes is clearly regular, and provides new clues concerning important events in vertebrate evolution. Altogether, our results highlight the importance of TEs in the structure and evolution of fish genomes and suggest that fish species diversity parallels transposon content diversification.
10
Fan H, Hu Y, Shan L, Yu L, Wang B, Li M, Wu Q, Wei F. Synteny search identifies carnivore Y chromosome for evolution of male specific genes. Integr Zool 2019; 14:224-234. [PMID: 30019860 DOI: 10.1111/1749-4877.12352] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Indexed: 11/30/2022]
Abstract
The explosive accumulation of mammalian genomes has provided a valuable resource to characterize the evolution of the Y chromosome. Unexpectedly, the Y-chromosome sequence has been characterized in only a small handful of species, with the majority being model organisms. Thus, identification of Y-linked scaffolds from unordered genome sequences is becoming more important. Here, we used a synteny-based approach to generate the scaffolds of the male-specific region of the Y chromosome (MSY) from the genome sequences of 6 male carnivore species. Our results identified 14, 15, 9, 28, 14 and 11 Y-linked scaffolds in polar bears, Pacific walruses, red pandas, cheetahs, ferrets and tigers, covering 1.55 Mb, 2.62 Mb, 964 kb, 1.75 Mb, 2.17 Mb and 1.84 Mb of MSY, respectively. All the candidate Y-linked scaffolds in 3 selected species (red pandas, polar bears and tigers) were successfully verified using polymerase chain reaction. We re-annotated 8 carnivore MSYs including these 6 Y-linked scaffolds and domestic dog and cat MSY; a total of 11 orthologous genes conserved in at least 7 of the 8 carnivores were identified. These 11 Y-linked genes have significantly higher evolutionary rates compared with their X-linked counterparts, indicating less purifying selection for MSY genes. Taken together, our study shows that the approach of synteny search is a reliable and easily affordable strategy to identify Y-linked scaffolds from unordered carnivore genomes and provides a preliminary evolutionary study for carnivore MSY genes.
Affiliation(s)
- Huizhong Fan
  - Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Yibo Hu
  - Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
- Lei Shan
  - Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Lijun Yu
  - Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
- Bing Wang
  - Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Min Li
  - Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Qi Wu
  - Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
- Fuwen Wei
  - Key Laboratory of Animal Ecology and Conservation Biology, Institute of Zoology, Chinese Academy of Sciences, Beijing, China
  - University of Chinese Academy of Sciences, Beijing, China
  - Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming, China
11
Bourque G, Burns KH, Gehring M, Gorbunova V, Seluanov A, Hammell M, Imbeault M, Izsvák Z, Levin HL, Macfarlan TS, Mager DL, Feschotte C. Ten things you should know about transposable elements. Genome Biol 2018; 19:199. [PMID: 30454069 PMCID: PMC6240941 DOI: 10.1186/s13059-018-1577-z] [Citation(s) in RCA: 690] [Impact Index Per Article: 98.6] [Indexed: 02/07/2023] Open
Abstract
Transposable elements (TEs) are major components of eukaryotic genomes. However, the extent of their impact on genome evolution, function, and disease remains a matter of intense interrogation. The rise of genomics and large-scale functional assays has shed new light on the multi-faceted activities of TEs and implies that they should no longer be marginalized. Here, we introduce the fundamental properties of TEs and their complex interactions with their cellular environment, which are crucial to understanding their impact and manifold consequences for organismal biology. While we draw examples primarily from mammalian systems, the core concepts outlined here are relevant to a broad range of organisms.
Affiliation(s)
- Guillaume Bourque
  - Department of Human Genetics, McGill University, Montréal, Québec, H3A 0G1, Canada
  - Canadian Center for Computational Genomics, McGill University, Montréal, Québec, H3A 0G1, Canada
- Kathleen H Burns
  - Department of Pathology, Johns Hopkins University School of Medicine, Baltimore, MD, 21205, USA
- Mary Gehring
  - Whitehead Institute for Biomedical Research and Department of Biology, Massachusetts Institute of Technology, Cambridge, MA, 02142, USA
- Vera Gorbunova
  - Department of Biology, University of Rochester, Rochester, NY, 14627, USA
- Andrei Seluanov
  - Department of Biology, University of Rochester, Rochester, NY, 14627, USA
- Molly Hammell
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, 11724, USA
- Michaël Imbeault
  - Department of Genetics, University of Cambridge, Cambridge, CB2 3EH, UK
- Zsuzsanna Izsvák
  - Max Delbrück Center for Molecular Medicine in the Helmholtz Association, 13125, Berlin, Germany
- Henry L Levin
  - The Eunice Kennedy Shriver National Institute of Child Health and Human Development, The National Institutes of Health, Bethesda, Maryland, USA
- Todd S Macfarlan
  - The Eunice Kennedy Shriver National Institute of Child Health and Human Development, The National Institutes of Health, Bethesda, Maryland, USA
- Dixie L Mager
  - Terry Fox Laboratory, British Columbia Cancer Agency and Department of Medical Genetics, University of BC, Vancouver, BC, V5Z1L3, Canada
- Cédric Feschotte
  - Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, 14850, USA
12
Chakraborty A, Ay F. The role of 3D genome organization in disease: From compartments to single nucleotides. Semin Cell Dev Biol 2018; 90:104-113. [PMID: 30017907 DOI: 10.1016/j.semcdb.2018.07.005] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 04/13/2018] [Accepted: 07/03/2018] [Indexed: 12/26/2022]
Abstract
Since the advent of chromosome conformation capture technology, our understanding of the 3D organization of the human genome has grown rapidly. We now know that human interphase chromosomes are folded into multiple layers of hierarchical structures and that each layer can play a critical role in transcriptional regulation. Alterations in any one of these finely-tuned layers can lead to an unwanted cascade of molecular events and ultimately drive the manifestation of diseases and phenotypes. Here we discuss, starting from chromosome-level organization and going down to single-nucleotide changes, recent studies linking diseases or phenotypes to changes in the 3D genome architecture.
Affiliation(s)
- Ferhat Ay
  - La Jolla Institute for Allergy and Immunology, La Jolla, CA, 92037, USA
  - UC San Diego, School of Medicine, La Jolla, 92093, CA, USA
13
Buckley RM, Kortschak RD, Raison JM, Adelson DL. Similar Evolutionary Trajectories for Retrotransposon Accumulation in Mammals. Genome Biol Evol 2018; 9:2336-2353. [PMID: 28945883 PMCID: PMC5610350 DOI: 10.1093/gbe/evx179] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Accepted: 09/01/2017] [Indexed: 12/19/2022] Open
Abstract
The factors guiding retrotransposon insertion site preference are not well understood. Different types of retrotransposons share common replication machinery and yet occupy distinct genomic domains. Autonomous long interspersed elements accumulate in gene-poor domains and their nonautonomous short interspersed elements accumulate in gene-rich domains. To determine genomic factors that contribute to this discrepancy we analyzed the distribution of retrotransposons within the framework of chromosomal domains and regulatory elements. Using comparative genomics, we identified large-scale conserved patterns of retrotransposon accumulation across several mammalian genomes. Importantly, retrotransposons that were active after our sample-species diverged accumulated in orthologous regions. This suggested a similar evolutionary interaction between retrotransposon activity and conserved genome architecture across our species. In addition, we found that retrotransposons accumulated at regulatory element boundaries in open chromatin, where accumulation of particular retrotransposon types depended on insertion size and local regulatory element density. From our results, we propose a model where density and distribution of genes and regulatory elements canalize retrotransposon accumulation. Through conservation of synteny, gene regulation and nuclear organization, mammalian genomes with dissimilar retrotransposons follow similar evolutionary trajectories.
Affiliation(s)
- Reuben M Buckley
- Department of Genetics and Evolution, The University of Adelaide, South Australia, Australia
- R Daniel Kortschak
- Department of Genetics and Evolution, The University of Adelaide, South Australia, Australia
- Joy M Raison
- Department of Genetics and Evolution, The University of Adelaide, South Australia, Australia
- David L Adelson
- Department of Genetics and Evolution, The University of Adelaide, South Australia, Australia
14
Kvikstad EM, Piazza P, Taylor JC, Lunter G. A high throughput screen for active human transposable elements. BMC Genomics 2018; 19:115. [PMID: 29390960 PMCID: PMC5796560 DOI: 10.1186/s12864-018-4485-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 09/25/2017] [Accepted: 01/16/2018] [Indexed: 11/30/2022]
Abstract
Background: Transposable elements (TEs) are mobile genetic sequences that randomly propagate within their host’s genome. This mobility has the potential to affect gene transcription and cause disease. However, TEs are technically challenging to identify, which complicates efforts to assess the impact of TE insertions on disease. Here we present a targeted sequencing protocol and computational pipeline to identify polymorphic and novel TE insertions using next-generation sequencing: TE-NGS. The method simultaneously targets the three subfamilies that are responsible for the majority of recent TE activity (L1HS, AluYa5/8, and AluYb8/9), thereby obviating the need for multiple experiments and reducing the amount of input material required. Results: Here we describe the laboratory protocol and detection algorithm, and a benchmark experiment for the reference genome NA12878. We demonstrate a substantial enrichment for on-target fragments, and high sensitivity and precision to both reference and NA12878-specific insertions. We report 17 previously unreported loci for this individual which are supported by orthogonal long-read evidence, and we identify 1470 polymorphic and novel TEs in 12 additional samples that were previously undocumented in databases of insertion polymorphisms. Conclusions: We anticipate that future applications of TE-NGS alongside exome sequencing of patients with sporadic disease will reduce the number of unresolved cases, and improve estimates of the contribution of TEs to human genetic disease.
Affiliation(s)
- Erika M Kvikstad
- Wellcome Trust Centre for Human Genetics, Oxford, UK; National Institute for Health Research Comprehensive Biomedical Research Centre, Oxford, UK.
- Paolo Piazza
- Wellcome Trust Centre for Human Genetics, Oxford, UK; Department of Medicine, Imperial College London, London, UK
- Jenny C Taylor
- Wellcome Trust Centre for Human Genetics, Oxford, UK; National Institute for Health Research Comprehensive Biomedical Research Centre, Oxford, UK
- Gerton Lunter
- Wellcome Trust Centre for Human Genetics, Oxford, UK
15
Platt RN, Vandewege MW, Ray DA. Mammalian transposable elements and their impacts on genome evolution. Chromosome Res 2018; 26:25-43. [PMID: 29392473 PMCID: PMC5857283 DOI: 10.1007/s10577-017-9570-z] [Citation(s) in RCA: 145] [Impact Index Per Article: 20.7] [Received: 09/14/2017] [Revised: 12/12/2017] [Accepted: 12/28/2017] [Indexed: 12/22/2022]
Abstract
Transposable elements (TEs) are genetic elements with the ability to mobilize and replicate themselves in a genome. Mammalian genomes are dominated by TEs, which can reach copy numbers in the hundreds of thousands. As a result, TEs have had significant impacts on mammalian evolution. Here we summarize the current understanding of TE content in mammalian genomes and find that, with a few exceptions, most fall within a predictable range of observations. First, one third to one half of the genome is derived from TEs. Second, most mammalian genomes are dominated by LINE and SINE retrotransposons, with more limited LTR retrotransposon content and minimal DNA transposon accumulation. Third, most mammalian genomes contain at least one family of actively accumulating retrotransposons. Finally, horizontal transfer of TEs among lineages is rare. TE exaptation events are being recognized with increasing frequency. Despite these beneficial aspects of TE content and activity, the majority of TE insertions are neutral or deleterious. To limit the deleterious effects of TE proliferation, the genome has evolved several defense mechanisms that act at the epigenetic, transcriptional, and post-transcriptional levels. The interaction between TEs and these defense mechanisms has led to an evolutionary arms race in which TEs are suppressed, evolve to escape suppression, and are then suppressed again as the defense mechanisms undergo compensatory change. The result is complex and constantly evolving interactions between TEs and host genomes.
Affiliation(s)
- Roy N Platt
- Department of Biological Sciences, Texas Tech University, Lubbock, TX, USA.
- David A Ray
- Department of Biological Sciences, Texas Tech University, Lubbock, TX, USA
16
Differential chromosomal organization between Saguinus midas and Saguinus bicolor with accumulation of differences the repetitive sequence DNA. Genetica 2017. [PMID: 28634866 DOI: 10.1007/s10709-017-9971-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 10/19/2022]
Abstract
Saguinus is the largest and most complex genus of the subfamily Callitrichinae, with 23 species distributed from the south of Central America to the north of South America. Saguinus midas has the largest geographical distribution, while Saguinus bicolor has a very restricted one, affected by the population expansion in the state of Amazonas. Considering the phylogenetic proximity of the two species along with evidence on the existence of hybrids between them, as well as cytogenetic studies on Saguinus describing a conserved karyotypic macrostructure, we carried out a physical mapping of repeated DNA sequences in the mitotic chromosomes of both species, since these sequences are less susceptible to evolutionary pressure and possibly perform an important function in speciation. Both species presented 2n = 46 chromosomes; in S. midas, chromosome Y is the smallest. Multiple ribosomal sites occur in both species, but chromosome pairs three and four may be regarded as markers that differentiate the species when subjected to G banding and distribution of the retroelement LINE 1, suggesting that this may be a cytogenetic marker that can contribute to the identification of first-generation hybrids in the contact zone. Saguinus bicolor also presented differences in the LINE 1 distribution pattern for the X sex chromosome in individuals from different urban fragments, probably due to geographical isolation. In this context, cytogenetic analyses reveal a differential genomic organization pattern between S. midas and S. bicolor, and indicate that individuals from different urban fragments have been accumulating differences because of the isolation between them.
17
Chuong EB, Elde NC, Feschotte C. Regulatory activities of transposable elements: from conflicts to benefits. Nat Rev Genet 2016; 18:71-86. [PMID: 27867194 DOI: 10.1038/nrg.2016.139] [Citation(s) in RCA: 818] [Impact Index Per Article: 90.9] [Indexed: 12/19/2022]
Abstract
Transposable elements (TEs) are a prolific source of tightly regulated, biochemically active non-coding elements, such as transcription factor-binding sites and non-coding RNAs. Many recent studies reinvigorate the idea that these elements are pervasively co-opted for the regulation of host genes. We argue that the inherent genetic properties of TEs and the conflicting relationships with their hosts facilitate their recruitment for regulatory functions in diverse genomes. We review recent findings supporting the long-standing hypothesis that the waves of TE invasions endured by organisms for eons have catalysed the evolution of gene-regulatory networks. We also discuss the challenges of dissecting and interpreting the phenotypic effect of regulatory activities encoded by TEs in health and disease.
Affiliation(s)
- Edward B Chuong
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah 84103, USA
- Nels C Elde
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah 84103, USA
- Cédric Feschotte
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, Utah 84103, USA
18
Grégoire L, Haudry A, Lerat E. The transposable element environment of human genes is associated with histone and expression changes in cancer. BMC Genomics 2016; 17:588. [PMID: 27506777 PMCID: PMC4979156 DOI: 10.1186/s12864-016-2970-1] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Received: 03/15/2016] [Accepted: 07/27/2016] [Indexed: 01/08/2023]
Abstract
Background: Only 2% of the human genome codes for proteins. Among the remaining 98%, transposable elements (TEs) represent millions of sequences. TEs have an impact on genome evolution by promoting mutations. In particular, TEs possess their own regulatory sequences and can alter the expression pattern of neighboring genes. Since they can potentially be harmful, TE activity is regulated by epigenetic mechanisms. These mechanisms participate in the modulation of gene expression and can be associated with some human diseases resulting from gene expression deregulation. The fact that TE silencing can be removed in cancer could explain a part of the changes in gene expression. Indeed, epigenetic modifications associated locally with TE sequences could impact neighboring genes since these modifications can spread to adjacent sequences. Results: We compared the histone enrichment, TE neighborhood, and expression divergence of human genes between normal and cancer conditions. We show that the presence of TEs near genes is associated with greater changes in histone enrichment and that differentially expressed genes harbor larger histone enrichment variation related to the presence of particular TEs. Conclusions: Taken together, these results suggest that the presence of TEs near genes could favor important variation in gene expression when the cell environment is modified.
Affiliation(s)
- Laura Grégoire
- Université de Lyon; F-69000, France; Université Lyon 1, CNRS, UMR 5558, Laboratoire Biométrie et Biologie Evolutive, F-69622, Villeurbanne, France
- Annabelle Haudry
- Université de Lyon; F-69000, France; Université Lyon 1, CNRS, UMR 5558, Laboratoire Biométrie et Biologie Evolutive, F-69622, Villeurbanne, France
- Emmanuelle Lerat
- Université de Lyon; F-69000, France; Université Lyon 1, CNRS, UMR 5558, Laboratoire Biométrie et Biologie Evolutive, F-69622, Villeurbanne, France.
19
Campos-Sánchez R, Cremona MA, Pini A, Chiaromonte F, Makova KD. Integration and Fixation Preferences of Human and Mouse Endogenous Retroviruses Uncovered with Functional Data Analysis. PLoS Comput Biol 2016; 12:e1004956. [PMID: 27309962 PMCID: PMC4911145 DOI: 10.1371/journal.pcbi.1004956] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 01/15/2016] [Accepted: 04/29/2016] [Indexed: 01/24/2023]
Abstract
Endogenous retroviruses (ERVs), the remnants of retroviral infections in the germ line, occupy ~8% and ~10% of the human and mouse genomes, respectively, and affect their structure, evolution, and function. Yet we still have a limited understanding of how the genomic landscape influences integration and fixation of ERVs. Here we conducted a genome-wide study of the most recently active ERVs in the human and mouse genome. We investigated 826 fixed and 1,065 in vitro HERV-Ks in human, and 1,624 fixed and 242 polymorphic ETns, as well as 3,964 fixed and 1,986 polymorphic IAPs, in mouse. We quantitated >40 human and mouse genomic features (e.g., non-B DNA structure, recombination rates, and histone modifications) in ±32 kb of these ERVs' integration sites and in control regions, and analyzed them using Functional Data Analysis (FDA) methodology. In one of the first applications of FDA in genomics, we identified genomic scales and locations at which these features display their influence, and how they work in concert, to provide signals essential for integration and fixation of ERVs. The investigation of ERVs of different evolutionary ages (young in vitro and polymorphic ERVs, older fixed ERVs) allowed us to disentangle integration vs. fixation preferences. As a result of these analyses, we built a comprehensive model explaining the uneven distribution of ERVs along the genome. We found that ERVs integrate in late-replicating AT-rich regions with abundant microsatellites, mirror repeats, and repressive histone marks. Regions favoring fixation are depleted of genes and evolutionarily conserved elements, and have low recombination rates, reflecting the effects of purifying selection and ectopic recombination removing ERVs from the genome. In addition to providing these biological insights, our study demonstrates the power of exploiting multiple scales and localization with FDA. These powerful techniques are expected to be applicable to many other genomic investigations.
Affiliation(s)
- Rebeca Campos-Sánchez
- Genetics Graduate Program, The Huck Institutes of the Life Sciences, Penn State University, University Park, Pennsylvania, United States of America
- Marzia A. Cremona
- MOX—Modeling and Scientific Computing, Department of Mathematics, Politecnico di Milano, Milano, Italy
- Department of Statistics, Penn State University, University Park, Pennsylvania, United States of America
- Alessia Pini
- MOX—Modeling and Scientific Computing, Department of Mathematics, Politecnico di Milano, Milano, Italy
- Francesca Chiaromonte
- Department of Statistics, Penn State University, University Park, Pennsylvania, United States of America
- Center for Medical Genomics, The Huck Institutes of the Life Sciences, Penn State University, University Park, Pennsylvania, United States of America
- Kateryna D. Makova
- Center for Medical Genomics, The Huck Institutes of the Life Sciences, Penn State University, University Park, Pennsylvania, United States of America
- Department of Biology, Penn State University, University Park, Pennsylvania, United States of America
20
Piatek MJ, Henderson V, Zynad HS, Werner A. Natural antisense transcription from a comparative perspective. Genomics 2016; 108:56-63. [PMID: 27241791 PMCID: PMC4996343 DOI: 10.1016/j.ygeno.2016.05.004] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 02/16/2016] [Revised: 05/08/2016] [Accepted: 05/25/2016] [Indexed: 12/28/2022]
Abstract
Natural antisense transcripts (NATs) can interfere with the expression of complementary sense transcripts with exquisite specificity. We have previously cloned NATs of Slc34a loci (encoding Na-phosphate transporters) from fish and mouse. Here we report the cloning of a human SLC34A1-related NAT that represents an alternatively spliced PFN3 transcript (Profilin3). The transcript is predominantly expressed in testis. Phylogenetic comparison suggests two distinct mechanisms producing Slc34a-related NATs: alternative splicing of a transcript from a protein-coding downstream gene (Pfn3, human/mouse) and transcription from a bi-directional promoter (Rbpja, zebrafish). Expression analysis suggested independent regulation of the complementary Slc34a mRNAs. Analysis of randomly selected bi-directionally transcribed human/mouse loci revealed limited phylogenetic conservation and independent regulation of NATs. They were reduced on X chromosomes and clustered in regions that escape inactivation. Locus structure and expression pattern suggest NAT-associated regulatory mechanisms in testis that are unrelated to the physiological role of the protein encoded by the sense transcript.
Affiliation(s)
- Monica J Piatek
- RNA Interest Group, Institute for Cell and Molecular Biosciences, Newcastle University, Framlington Place, Newcastle upon Tyne NE2 4HH, United Kingdom
- Victoria Henderson
- RNA Interest Group, Institute for Cell and Molecular Biosciences, Newcastle University, Framlington Place, Newcastle upon Tyne NE2 4HH, United Kingdom
- Hany S Zynad
- RNA Interest Group, Institute for Cell and Molecular Biosciences, Newcastle University, Framlington Place, Newcastle upon Tyne NE2 4HH, United Kingdom
- Andreas Werner
- RNA Interest Group, Institute for Cell and Molecular Biosciences, Newcastle University, Framlington Place, Newcastle upon Tyne NE2 4HH, United Kingdom.
21
Bidon T, Schreck N, Hailer F, Nilsson MA, Janke A. Genome-Wide Search Identifies 1.9 Mb from the Polar Bear Y Chromosome for Evolutionary Analyses. Genome Biol Evol 2015; 7:2010-22. [PMID: 26019166 PMCID: PMC4524476 DOI: 10.1093/gbe/evv103] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Indexed: 02/06/2023]
Abstract
The male-inherited Y chromosome is the major haploid fraction of the mammalian genome, rendering Y-linked sequences an indispensable resource for evolutionary research. However, despite recent large-scale genome sequencing approaches, only a handful of Y chromosome sequences have been characterized to date, mainly in model organisms. Using polar bear (Ursus maritimus) genomes, we compare two different in silico approaches to identify Y-linked sequences: 1) Similarity to known Y-linked genes and 2) difference in the average read depth of autosomal versus sex chromosomal scaffolds. Specifically, we mapped available genomic sequencing short reads from a male and a female polar bear against the reference genome and identify 112 Y-chromosomal scaffolds with a combined length of 1.9 Mb. We verified the in silico findings for the longer polar bear scaffolds by male-specific in vitro amplification, demonstrating the reliability of the average read depth approach. The obtained Y chromosome sequences contain protein-coding sequences, single nucleotide polymorphisms, microsatellites, and transposable elements that are useful for evolutionary studies. A high-resolution phylogeny of the polar bear patriline shows two highly divergent Y chromosome lineages, obtained from analysis of the identified Y scaffolds in 12 previously published male polar bear genomes. Moreover, we find evidence of gene conversion among ZFX and ZFY sequences in the giant panda lineage and in the ancestor of ursine and tremarctine bears. Thus, the identification of Y-linked scaffold sequences from unordered genome sequences yields valuable data to infer phylogenomic and population-genomic patterns in bears.
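The read-depth idea in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes each scaffold already has an average read depth from one male and one female sample, normalized so that autosomal scaffolds sit near the same value in both. The function name and both cutoffs are hypothetical choices made for illustration only.

```python
from typing import Dict


def classify_scaffolds(
    male_depth: Dict[str, float],
    female_depth: Dict[str, float],
    y_max_ratio: float = 0.3,  # hypothetical cutoff, not taken from the paper
    x_min_ratio: float = 1.7,  # hypothetical cutoff, not taken from the paper
) -> Dict[str, str]:
    """Label scaffolds by their female-to-male average read depth ratio.

    Expected ratios under the stated assumptions: about 0 for Y-linked
    scaffolds (absent in the female), about 2 for X-linked scaffolds
    (two copies in the female, one in the male), about 1 for autosomes.
    """
    labels: Dict[str, str] = {}
    for scaffold, m_depth in male_depth.items():
        f_depth = female_depth.get(scaffold, 0.0)
        if m_depth <= 0:
            # No male coverage: the ratio is undefined, so do not guess.
            labels[scaffold] = "unclassified"
            continue
        ratio = f_depth / m_depth
        if ratio <= y_max_ratio:
            labels[scaffold] = "Y-candidate"
        elif ratio >= x_min_ratio:
            labels[scaffold] = "X-candidate"
        else:
            labels[scaffold] = "autosomal-like"
    return labels
```

Candidates flagged this way would still need independent confirmation, as the abstract notes for the male-specific in vitro amplification.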
Affiliation(s)
- Tobias Bidon
- Senckenberg Biodiversity and Climate Research Centre Frankfurt, Frankfurt am Main, Germany; International Graduate School of Science and Engineering (IGSSE), Technische Universität München, Garching, Germany
- Nancy Schreck
- Senckenberg Biodiversity and Climate Research Centre Frankfurt, Frankfurt am Main, Germany
- Frank Hailer
- Senckenberg Biodiversity and Climate Research Centre Frankfurt, Frankfurt am Main, Germany; School of Biosciences, Cardiff University, Wales, United Kingdom
- Maria A Nilsson
- Senckenberg Biodiversity and Climate Research Centre Frankfurt, Frankfurt am Main, Germany
- Axel Janke
- Senckenberg Biodiversity and Climate Research Centre Frankfurt, Frankfurt am Main, Germany; Institute for Ecology, Evolution & Diversity, Goethe University Frankfurt, Germany
22
Makova KD, Hardison RC. The effects of chromatin organization on variation in mutation rates in the genome. Nat Rev Genet 2015; 16:213-23. [PMID: 25732611 PMCID: PMC4500049 DOI: 10.1038/nrg3890] [Citation(s) in RCA: 160] [Impact Index Per Article: 16.0] [Indexed: 12/12/2022]
Abstract
The variation in local rates of mutations can affect both the evolution of genes and their function in normal and cancer cells. Deciphering the molecular determinants of this variation will be aided by the elucidation of distinct types of mutations, as they differ in regional preferences and in associations with genomic features. Chromatin organization contributes to regional variation in mutation rates, but its contribution differs among mutation types. In both germline and somatic mutations, base substitutions are more abundant in regions of closed chromatin, perhaps reflecting error accumulation late in replication. By contrast, a distinctive mutational state with very high levels of insertions and deletions (indels) and substitutions is enriched in regions of open chromatin. These associations indicate an intricate interplay between the nucleotide sequence of DNA and its dynamic packaging into chromatin, and have important implications for current biomedical research. This Review focuses on recent studies showing associations between chromatin state and mutation rates, including pairwise and multivariate investigations of germline and somatic (particularly cancer) mutations.
Affiliation(s)
- Kateryna D Makova
- Department of Biology, Huck Institute for Genome Sciences, The Pennsylvania State University, University Park, State College, Pennsylvania 16802, USA
- Ross C Hardison
- Department of Biochemistry and Molecular Biology, Huck Institute for Genome Sciences, The Pennsylvania State University, University Park, State College, Pennsylvania 16802, USA
23
Farré M, Robinson TJ, Ruiz-Herrera A. An Integrative Breakage Model of genome architecture, reshuffling and evolution: The Integrative Breakage Model of genome evolution, a novel multidisciplinary hypothesis for the study of genome plasticity. Bioessays 2015; 37:479-88. [PMID: 25739389 DOI: 10.1002/bies.201400174] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Received: 09/22/2014] [Revised: 02/12/2015] [Accepted: 02/13/2015] [Indexed: 12/23/2022]
Abstract
Our understanding of genomic reorganization, the mechanics of genomic transmission to offspring during germ line formation, and how these structural changes contribute to the speciation process and genetic disease is far from complete. Earlier attempts to understand the mechanism(s) and constraints that govern genome remodeling suffered from being too narrowly focused, and failed to provide a unified and encompassing view of how genomes are organized and regulated inside cells. Here, we propose a new multidisciplinary Integrative Breakage Model for the study of genome evolution. The analysis of the high-level structural organization of genomes (nucleome), together with the functional constraints that accompany genome reshuffling, provides insights into the origin and plasticity of genome organization that may assist with the detection and isolation of therapeutic targets for the treatment of complex human disorders.
Affiliation(s)
- Marta Farré
- Departament de Biologia Cel·lular, Fisiologia i Immunologia, Universitat Autònoma de Barcelona, Campus UAB, Barcelona, Spain
24
Keane TM, Wong K, Adams DJ, Flint J, Reymond A, Yalcin B. Identification of structural variation in mouse genomes. Front Genet 2014; 5:192. [PMID: 25071822 PMCID: PMC4079067 DOI: 10.3389/fgene.2014.00192] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 03/14/2014] [Accepted: 06/12/2014] [Indexed: 01/25/2023]
Abstract
Structural variation is variation in structure of DNA regions affecting DNA sequence length and/or orientation. It generally includes deletions, insertions, copy-number gains, inversions, and transposable elements. Traditionally, the identification of structural variation in genomes has been challenging. However, with the recent advances in high-throughput DNA sequencing and paired-end mapping (PEM) methods, the ability to identify structural variation and their respective association to human diseases has improved considerably. In this review, we describe our current knowledge of structural variation in the mouse, one of the prime model systems for studying human diseases and mammalian biology. We further present the evolutionary implications of structural variation on transposable elements. We conclude with future directions on the study of structural variation in mouse genomes that will increase our understanding of molecular architecture and functional consequences of structural variation.
Affiliation(s)
- Kim Wong
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
- David J Adams
- Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
- Alexandre Reymond
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland
- Binnaz Yalcin
- Center for Integrative Genomics, University of Lausanne, Lausanne, Switzerland; Institute of Genetics and Molecular and Cellular Biology, Illkirch, France
25
Bailly-Bechet M, Haudry A, Lerat E. “One code to find them all”: a perl tool to conveniently parse RepeatMasker output files. Mob DNA 2014. [PMCID: PMC4021974 DOI: 10.1186/1759-8753-5-13] [Citation(s) in RCA: 93] [Impact Index Per Article: 8.5] [Indexed: 11/13/2022]
Abstract
Background: Of the different bioinformatic methods used to recover transposable elements (TEs) in genome sequences, one of the most commonly used procedures is the homology-based method proposed by the RepeatMasker program. RepeatMasker generates several output files, including the .out file, which provides annotations for all detected repeats in a query sequence. However, a remaining challenge consists of identifying the different copies of TEs that correspond to the identified hits. This step is essential for any evolutionary/comparative analysis of the different copies within a family. Different possibilities can lead to multiple hits corresponding to a unique copy of an element, such as the presence of large deletions/insertions or undetermined bases, and distinct consensus corresponding to a single full-length sequence (like for long terminal repeat (LTR)-retrotransposons). These possibilities must be taken into account to determine the exact number of TE copies. Results: We have developed a perl tool that parses the RepeatMasker .out file to better determine the number and positions of TE copies in the query sequence, in addition to computing quantitative information for the different families. To determine the accuracy of the program, we tested it on several RepeatMasker .out files corresponding to two organisms (Drosophila melanogaster and Homo sapiens) for which the TE content has already been largely described and which present great differences in genome size, TE content, and TE families. Conclusions: Our tool provides access to detailed information concerning the TE content in a genome at the family level from the .out file of RepeatMasker. This information includes the exact position and orientation of each copy, its proportion in the query sequence, and its quality compared to the reference element. In addition, our tool allows a user to directly retrieve the sequence of each copy and obtain the same detailed information at the family level when a local library with incomplete TE class/subclass information was used with RepeatMasker. We hope that this tool will be helpful for people working on the distribution and evolution of TEs within genomes.
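The hits-versus-copies problem described above can be illustrated with a short Python sketch. This is not the published perl tool and it does not reproduce its merging rules. It assumes the standard 15-column whitespace-separated .out layout and relies on the final ID column to group fragments that RepeatMasker itself linked to one insertion, which is a much cruder criterion than the one the abstract describes. The names `Hit`, `parse_out`, and `count_copies` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple, Set, Tuple


class Hit(NamedTuple):
    query: str
    start: int
    end: int
    strand: str   # "+" or "-" ("C" in the file means complement)
    repeat: str
    family: str   # the class/family column, e.g. "SINE/Alu"
    hit_id: str   # final ID column; shared by fragments RepeatMasker linked


def parse_out(lines: Iterable[str]) -> List[Hit]:
    """Parse data rows of a RepeatMasker .out file, skipping headers."""
    hits: List[Hit] = []
    for line in lines:
        fields = line.split()
        # Data rows start with an integer score; header and blank lines do not.
        if len(fields) < 15 or not fields[0].isdigit():
            continue
        strand = "-" if fields[8] == "C" else "+"
        hits.append(Hit(fields[4], int(fields[5]), int(fields[6]),
                        strand, fields[9], fields[10], fields[14]))
    return hits


def count_copies(hits: Iterable[Hit]) -> Dict[str, int]:
    """Count distinct (query, ID) pairs per family as a rough copy number."""
    ids_per_family: Dict[str, Set[Tuple[str, str]]] = defaultdict(set)
    for hit in hits:
        ids_per_family[hit.family].add((hit.query, hit.hit_id))
    return {family: len(ids) for family, ids in ids_per_family.items()}
```

Counting by ID gives a number of copies that is at most the number of hits; the cases listed in the abstract, such as LTR elements split across separate consensus sequences, need the additional logic that the tool itself implements.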
26
Campos-Sánchez R, Kapusta A, Feschotte C, Chiaromonte F, Makova KD. Genomic landscape of human, bat, and ex vivo DNA transposon integrations. Mol Biol Evol 2014; 31:1816-32. [PMID: 24809961 DOI: 10.1093/molbev/msu138] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Indexed: 02/07/2023]
Abstract
The integration and fixation preferences of DNA transposons, one of the major classes of eukaryotic transposable elements, have never been evaluated comprehensively on a genome-wide scale. Here, we present a detailed study of the distribution of DNA transposons in the human and bat genomes. We studied three groups of DNA transposons that integrated at different evolutionary times: 1) ancient (>40 My) and currently inactive human elements, 2) younger (<40 My) bat elements, and 3) ex vivo integrations of piggyBat and Sleeping Beauty elements in HeLa cells. Although the distribution of ex vivo elements reflected integration preferences, the distribution of human and (to a lesser extent) bat elements was also affected by selection. We used regression techniques (linear, negative binomial, and logistic regression models with multiple predictors) applied to 20-kb and 1-Mb windows to investigate how the genomic landscape in the vicinity of DNA transposons contributes to their integration and fixation. Our models indicate that genomic landscape explains 16-79% of variability in DNA transposon genome-wide distribution. Importantly, we not only confirmed previously identified predictors (e.g., DNA conformation and recombination hotspots) but also identified several novel predictors (e.g., signatures of double-strand breaks and telomere hexamer). Ex vivo integrations showed a bias toward actively transcribed regions. Older DNA transposons were located in genomic regions scarce in most conserved elements, likely reflecting purifying selection. Our study highlights how DNA transposons are integral to the evolution of bat and human genomes, and has implications for the development of DNA transposon assays for gene therapy and mutagenesis applications.
Affiliation(s)
- Rebeca Campos-Sánchez
- Genetics Program, The Huck Institutes of the Life Sciences, Penn State University, University Park, PA
- Aurélie Kapusta
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT
- Cédric Feschotte
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT
- Francesca Chiaromonte
- Center for Medical Genomics, The Huck Institutes of the Life Sciences, Penn State University, University Park, PA
- Department of Statistics, Penn State University, University Park, PA
- Kateryna D Makova
- Center for Medical Genomics, The Huck Institutes of the Life Sciences, Penn State University, University Park, PA
- Department of Biology, Penn State University, University Park, PA
|
27
|
Xue B, He L. An expanding universe of the non-coding genome in cancer biology. Carcinogenesis 2014; 35:1209-16. [PMID: 24747961 DOI: 10.1093/carcin/bgu099] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Indexed: 12/19/2022]
Abstract
Neoplastic transformation is caused by accumulation of genetic and epigenetic alterations that ultimately convert normal cells into tumor cells with uncontrolled proliferation and survival, unlimited replicative potential and invasive growth [Hanahan,D. et al. (2011) Hallmarks of cancer: the next generation. Cell, 144, 646-674]. Although the majority of cancer studies have focused on the functions of protein-coding genes, emerging evidence has started to reveal the importance of the vast non-coding genome, which constitutes more than 98% of the human genome. A number of non-coding RNAs (ncRNAs) derived from the 'dark matter' of the human genome exhibit cancer-specific differential expression and/or genomic alterations, and it is increasingly clear that ncRNAs, including small ncRNAs and long ncRNAs (lncRNAs), play an important role in cancer development by regulating protein-coding gene expression through diverse mechanisms. In addition to ncRNAs, nearly half of the mammalian genome consists of transposable elements, particularly retrotransposons. Once depicted as selfish genomic parasites that propagate at the expense of host fitness, retrotransposon elements could also confer regulatory complexity to the host genomes during development and disease. Reactivation of retrotransposons in cancer, while capable of causing insertional mutagenesis and genome rearrangements to promote oncogenesis, could also alter host gene expression networks to favor tumor development. Taken together, the functional significance of the non-coding genome in tumorigenesis has been previously underestimated, and diverse transcripts derived from the non-coding genome could act as integral functional components of the oncogene and tumor suppressor network.
Affiliation(s)
- Bin Xue
- Department of Molecular and Cell Biology, Division of Cellular and Developmental Biology, University of California at Berkeley, Berkeley, CA 94720, USA
- Lin He
- Department of Molecular and Cell Biology, Division of Cellular and Developmental Biology, University of California at Berkeley, Berkeley, CA 94720, USA
|
28
|
Kim YJ, Jung YD, Kim TO, Kim HS. Alu-related transcript of TJP2 gene as a marker for colorectal cancer. Gene 2013; 524:268-74. [DOI: 10.1016/j.gene.2013.04.006] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 07/07/2012] [Revised: 04/08/2013] [Accepted: 04/10/2013] [Indexed: 12/30/2022]
|
29
|
Wagstaff BJ, Hedges DJ, Derbes RS, Campos Sanchez R, Chiaromonte F, Makova KD, Roy-Engel AM. Rescuing Alu: recovery of new inserts shows LINE-1 preserves Alu activity through A-tail expansion. PLoS Genet 2012; 8:e1002842. [PMID: 22912586 PMCID: PMC3415434 DOI: 10.1371/journal.pgen.1002842] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Received: 12/05/2011] [Accepted: 05/30/2012] [Indexed: 12/15/2022]
Abstract
Alu elements are trans-mobilized by the autonomous non-LTR retroelement, LINE-1 (L1). Alu-induced insertion mutagenesis contributes to about 0.1% of human genetic disease and is responsible for the majority of the documented instances of human retroelement insertion-induced disease. Here we introduce a SINE recovery method that provides a complementary approach for comprehensive analysis of the impact and biological mechanisms of Alu retrotransposition. Using this approach, we recovered 226 de novo tagged Alu inserts in HeLa cells. Our analysis reveals that in human cells marked Alu inserts driven by either exogenously supplied full length L1 or ORF2 protein are indistinguishable. Four percent of de novo Alu inserts were associated with genomic deletions and rearrangements and lacked the hallmarks of retrotransposition. In contrast to L1 inserts, 5′ truncations of Alu inserts are rare, as most of the recovered inserts (96.5%) are full length. De novo Alus show a random pattern of insertion across chromosomes, but further characterization revealed an Alu insertion bias favoring insertion near other SINEs and highly conserved elements, with almost 60% landing within genes. De novo Alu inserts show no evidence of RNA editing. Priming for reverse transcription rarely occurred within the first 20 bp (most 5′) of the A-tail. The A-tails of recovered inserts show significant expansion, with many at least doubling in length. Sequence manipulation of the construct led to the demonstration that the A-tail expansion likely occurs during insertion due to slippage by the L1 ORF2 protein. We postulate that the A-tail expansion directly impacts Alu evolution by reintroducing new active source elements to counteract the natural loss of active Alus and minimizing Alu extinction.
SINEs are mobile elements that are found ubiquitously throughout a large diversity of genomes from plants to mammals. The human SINE, Alu, is among the most successful mobile elements, with more than one million copies in the genome. Due to its high activity and ability to insert throughout the genome, Alu retrotransposition is responsible for the majority of diseases reported to be caused by mobile element activity. To further evaluate the genomic impact of SINEs, we recovered and characterized over 200 de novo Alu inserts under controlled conditions. Our data reinforce observations on the mutagenic potential of Alu, with newly retrotransposed Alu elements favoring insertion into genic and highly conserved elements. Alu-mediated deletions and rearrangements are infrequent and lack the typical hallmarks of TPRT retrotransposition, suggesting the use of an alternate method for resolving retrotransposition intermediates or an atypical insertion mechanism. Our data also provide novel insights into SINE retrotransposition biology. We found that slippage of L1 ORF2 protein during reverse transcription expands the A-tails of de novo insertions. We propose that the L1 ORF2 protein plays a major role in minimizing Alu extinction by reintroducing active Alu elements to counter the natural loss of Alu source elements.
Affiliation(s)
- Bradley J. Wagstaff
- Tulane Cancer Center, Department of Epidemiology, Tulane University, New Orleans, Louisiana, United States of America
- Dale J. Hedges
- Hussman Institute for Human Genomics, Dr. John T. Macdonald Foundation Department of Human Genetics, Miller School of Medicine, University of Miami, Miami, Florida, United States of America
- Rebecca S. Derbes
- Tulane Cancer Center, Department of Epidemiology, Tulane University, New Orleans, Louisiana, United States of America
- Rebeca Campos Sanchez
- Department of Biology, Center for Medical Genomics, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Francesca Chiaromonte
- Department of Biology, Center for Medical Genomics, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Kateryna D. Makova
- Department of Biology, Center for Medical Genomics, Pennsylvania State University, University Park, Pennsylvania, United States of America
- Astrid M. Roy-Engel
- Tulane Cancer Center, Department of Epidemiology, Tulane University, New Orleans, Louisiana, United States of America
|
30
|
Nellåker C, Keane TM, Yalcin B, Wong K, Agam A, Belgard TG, Flint J, Adams DJ, Frankel WN, Ponting CP. The genomic landscape shaped by selection on transposable elements across 18 mouse strains. Genome Biol 2012; 13:R45. [PMID: 22703977 PMCID: PMC3446317 DOI: 10.1186/gb-2012-13-6-r45] [Citation(s) in RCA: 127] [Impact Index Per Article: 9.8] [Received: 04/04/2012] [Revised: 05/25/2012] [Accepted: 06/15/2012] [Indexed: 12/20/2022]
Abstract
Background: Transposable element (TE)-derived sequence dominates the landscape of mammalian genomes and can modulate gene function by dysregulating transcription and translation. Our current knowledge of TEs in laboratory mouse strains is limited primarily to those present in the C57BL/6J reference genome, with most mouse TEs being drawn from three distinct classes, namely short interspersed nuclear elements (SINEs), long interspersed nuclear elements (LINEs) and the endogenous retrovirus (ERV) superfamily. Despite their high prevalence, the different genomic and gene properties controlling whether TEs are preferentially purged from, or are retained by, genetic drift or positive selection in mammalian genomes remain poorly defined. Results: Using whole genome sequencing data from 13 classical laboratory and 4 wild-derived mouse inbred strains, we developed a comprehensive catalogue of 103,798 polymorphic TE variants. We employ this extensive data set to characterize TE variants across the Mus lineage, and to infer neutral and selective processes that have acted over 2 million years. Our results indicate that the majority of TE variants are introduced through the male germline and that only a minority of TE variants exert detectable changes in gene expression. However, among genes with differential expression across the strains there are twice as many TE variants identified as being putative causal variants as expected. Conclusions: Most TE variants that cause gene expression changes appear to be purged rapidly by purifying selection. Our findings demonstrate that past TE insertions have often been highly deleterious, and help to prioritize TE variants according to their likely contribution to gene expression or phenotype variation.
Affiliation(s)
- Christoffer Nellåker
- MRC Functional Genomics Unit, Department of Physiology, Anatomy and Genetics, University of Oxford, Oxford, UK.
|
31
|
Ashida H, Asai K, Hamada M. Shape-based alignment of genomic landscapes in multi-scale resolution. Nucleic Acids Res 2012; 40:6435-48. [PMID: 22561376 PMCID: PMC3413149 DOI: 10.1093/nar/gks354] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/03/2023]
Abstract
Due to dramatic advances in DNA technology, quantitative measures of annotation data can now be obtained in continuous coordinates across the entire genome, allowing various heterogeneous ‘genomic landscapes’ to emerge. Although much effort has been devoted to comparing DNA sequences, not much attention has been given to comparing these large quantities of data comprehensively. In this article, we introduce a method for rapidly detecting local regions that show high correlations between genomic landscapes. We overcame the size problem for genome-wide data by converting the data into series of symbols and then carrying out sequence alignment. We also decomposed the oscillation of the landscape data into different frequency bands before analysis, since the real genomic landscape is a mixture of embedded and confounded biological processes working at different scales in the cell nucleus. To verify the usefulness and generality of our method, we applied our approach to well investigated landscapes from the human genome, including several histone modifications. Furthermore, by applying our method to over 20 genomic landscapes in human and 12 in mouse, we found that DNA replication timing and the density of Alu insertions are highly correlated genome-wide in both species, even though the Alu elements have amplified independently in the two genomes. To our knowledge, this is the first method to align genomic landscapes at multiple scales according to their shape.
Affiliation(s)
- Hiroki Ashida
- Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa-shi, Chiba 277-8561, Japan.
|
32
|
Popa A, Samollow P, Gautier C, Mouchiroud D. The sex-specific impact of meiotic recombination on nucleotide composition. Genome Biol Evol 2012; 4:412-22. [PMID: 22417915 PMCID: PMC3318449 DOI: 10.1093/gbe/evs023] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 12/12/2022]
Abstract
Meiotic recombination is an important evolutionary force shaping the nucleotide landscape of genomes. For most vertebrates, the frequency of recombination varies slightly or considerably between the sexes (heterochiasmy). In humans, male, rather than female, recombination rate has been found to be more highly correlated with the guanine and cytosine (GC) content across the genome. In the present study, we review the results in human and extend the examination of the evolutionary impact of heterochiasmy beyond primates to include four additional eutherian mammals (mouse, dog, pig, and sheep), a metatherian mammal (opossum), and a bird (chicken). Specifically, we compared sex-specific recombination rates (RRs) with nucleotide substitution patterns evaluated in transposable elements. Our results, based on a comparative approach, reveal a great diversity in the relationship between heterochiasmy and nucleotide composition. We find that the stronger male impact on this relationship is a conserved feature of human, mouse, dog, and sheep. In contrast, variation in genomic GC content in pig and opossum is more strongly correlated with female, rather than male, RR. Moreover, we show that the sex-differential impact of recombination is mainly driven by the chromosomal localization of recombination events. Independent of sex, the higher the RR in a genomic region and the longer this recombination activity is conserved in time, the stronger the bias in nucleotide substitution pattern, through such mechanisms as biased gene conversion. Over time, this bias will increase the local GC content of the region.
|
33
|
Klimopoulos A, Sellis D, Almirantis Y. Widespread occurrence of power-law distributions in inter-repeat distances shaped by genome dynamics. Gene 2012; 499:88-98. [PMID: 22370293 DOI: 10.1016/j.gene.2012.02.005] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 11/02/2011] [Revised: 02/05/2012] [Accepted: 02/06/2012] [Indexed: 11/25/2022]
Abstract
Repetitive DNA sequences derived from transposable elements (TE) are distributed in a non-random way, co-clustering with other classes of repeat elements, genes and other genomic components. In a previous work we reported power-law-like size distributions (linearity in log-log scale) in the spatial arrangement of Alu and LINE1 elements in the human genome. Here we investigate the large-scale features of the spatial arrangement of all principal classes of TEs in 14 genomes from phylogenetically distant organisms by studying the size distribution of inter-repeat distances. Power-law-like size distributions are found to be widespread, extending up to several orders of magnitude. In order to understand the emergence of this distributional pattern, we introduce an evolutionary scenario, which includes (i) Insertions of DNA segments (e.g., more recent repeats) into the considered sequence and (ii) Eliminations of members of the studied TE family. In the proposed model we also incorporate the potential for transposition events (characteristic of the DNA transposons' life-cycle) and segmental duplications. Simulations reproduce the main features of the observed size distributions. Furthermore, we investigate the effects of various genomic features on the presence and extent of power-law size distributions including TE class and age, mode of parental TE transmission, GC content, deletion and recombination rates in the studied genomic region, etc. Our observations corroborate the hypothesis that insertions of genomic material and eliminations of repeats are at the basis of power-laws in inter-repeat distances. The existence of these power-laws could facilitate the formation of the recently proposed "fractal globule" for the confined chromatin organization.
Affiliation(s)
- Alexandros Klimopoulos
- National Center for Scientific Research "Demokritos," Institute of Biology, 153 10 Athens, Greece.
|
34
|
Abstract
In many species the mutation rate is higher in males than in females, a phenomenon denoted as male mutation bias. This is often observed in animals where males produce many more sperm than females produce eggs, and is thought to result from differences in the number of replication-associated mutations accumulated in each sex. Thus, studies of male mutation bias have the capacity to reveal information about the replication-dependent or replication-independent nature of different mutations. The availability of whole genome sequences for many species, as well as for multiple individuals within a species, has opened the door to studying factors, both sequence-specific and those acting on the genome globally, that affect differences in mutation rates between males and females. Here, we assess the advantages that genomic sequences provide for studies of male mutation bias and general mutation mechanisms, discuss major challenges left unresolved, and speculate about the direction of future studies.
Affiliation(s)
- Melissa A. Wilson Sayres
- Center for Comparative Genomics and Bioinformatics, The Pennsylvania State University, University Park, PA, USA
- Kateryna D. Makova
- Center for Comparative Genomics and Bioinformatics, The Pennsylvania State University, University Park, PA, USA
|
35
|
Farré M, Bosch M, López-Giráldez F, Ponsà M, Ruiz-Herrera A. Assessing the role of tandem repeats in shaping the genomic architecture of great apes. PLoS One 2011; 6:e27239. [PMID: 22076140 PMCID: PMC3208591 DOI: 10.1371/journal.pone.0027239] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.1] [Received: 04/06/2011] [Accepted: 10/12/2011] [Indexed: 11/18/2022]
Abstract
Background: Ancestral reconstructions of mammalian genomes have revealed that evolutionary breakpoint regions are clustered in regions that are more prone to break and reorganize. What is still unclear to evolutionary biologists is whether these regions are physically unstable due solely to sequence composition and/or genome organization, or whether they represent genomic areas where the selection against breakpoints is minimal. Methodology and Principal Findings: Here we present a comprehensive study of the distribution of tandem repeats in great apes. We analyzed the distribution of tandem repeats in relation to the localization of evolutionary breakpoint regions in the human, chimpanzee, orangutan and macaque genomes. We observed an accumulation of tandem repeats in the genomic regions implicated in chromosomal reorganizations. In the case of the human genome, our analyses revealed that evolutionary breakpoint regions contained more base pairs implicated in tandem repeats compared to synteny blocks, with the AAAT motif being the most frequently involved in evolutionary regions. We found that those AAAT repeats located in evolutionary regions were preferentially associated with Alu elements. Significance: Our observations provide evidence for the role of tandem repeats in shaping mammalian genome architecture. We hypothesize that an accumulation of specific tandem repeats in evolutionary regions can promote genome instability by altering the state of the chromatin conformation or by promoting the insertion of transposable elements.
Affiliation(s)
- Marta Farré
- Departament de Biologia Cel·lular, Fisiologia i Immunologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- Francesc López-Giráldez
- Department of Ecology and Evolutionary Biology, Yale University, New Haven, Connecticut, United States of America
- Montserrat Ponsà
- Departament de Biologia Cel·lular, Fisiologia i Immunologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- Aurora Ruiz-Herrera
- Departament de Biologia Cel·lular, Fisiologia i Immunologia, Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
- Institut de Biotecnologia i Biomedicina (IBB), Universitat Autònoma de Barcelona, Cerdanyola del Vallès, Spain
|
36
|
Jurka J, Bao W, Kojima KK. Families of transposable elements, population structure and the origin of species. Biol Direct 2011; 6:44. [PMID: 21929767 PMCID: PMC3183009 DOI: 10.1186/1745-6150-6-44] [Citation(s) in RCA: 91] [Impact Index Per Article: 6.5] [Received: 06/15/2011] [Accepted: 09/19/2011] [Indexed: 11/23/2022]
Abstract
Background: Eukaryotic genomes harbor diverse families of repetitive DNA derived from transposable elements (TEs) that are able to replicate and insert into genomic DNA. The biological role of TEs remains unclear, although they have profound mutagenic impact on eukaryotic genomes and the origin of repetitive families often correlates with speciation events. We present a new hypothesis to explain the observed correlations based on classical concepts of population genetics. Presentation of the hypothesis: The main thesis presented in this paper is that the TE-derived repetitive families originate primarily by genetic drift in small populations derived mostly by subdivisions of large populations into subpopulations. We outline the potential impact of the emerging repetitive families on genetic diversification of different subpopulations, and discuss implications of such diversification for the origin of new species. Testing the hypothesis: Several testable predictions of the hypothesis are examined. First, we focus on the prediction that the number of diverse families of TEs fixed in a representative genome of a particular species positively correlates with the cumulative number of subpopulations (demes) in the historical metapopulation from which the species has emerged. Furthermore, we present evidence indicating that human AluYa5 and AluYb8 families might have originated in separate proto-human subpopulations. We also revisit prior evidence linking the origin of repetitive families to mammalian phylogeny and present additional evidence linking repetitive families to speciation based on mammalian taxonomy. Finally, we discuss evidence that mammalian orders represented by the largest numbers of species may be subject to relatively recent population subdivisions and speciation events.
Implications of the hypothesis: The hypothesis implies that subdivision of a population into small subpopulations is the major step in the origin of new families of TEs as well as of new species. The origin of new subpopulations is likely to be driven by the availability of new biological niches, consistent with the hypothesis of punctuated equilibria. The hypothesis also has implications for the ongoing debate on the role of genetic drift in genome evolution. Reviewers: This article was reviewed by Eugene Koonin, Juergen Brosius and I. King Jordan.
Affiliation(s)
- Jerzy Jurka
- Genetic Information Research Institute, 1925 Landings Drive, Mountain View, CA 94043, USA.
|
37
|
Castañeda J, Genzor P, Bortvin A. piRNAs, transposon silencing, and germline genome integrity. Mutat Res 2011; 714:95-104. [PMID: 21600904 DOI: 10.1016/j.mrfmmm.2011.05.002] [Citation(s) in RCA: 74] [Impact Index Per Article: 5.3] [Received: 12/16/2010] [Accepted: 05/04/2011] [Indexed: 12/17/2022]
Abstract
Integrity of the germline genome is essential for the production of viable gametes and successful reproduction. In mammals, the generation of gametes involves extensive epigenetic changes (DNA methylation and histone modification) in conjunction with changes in chromosome structure to ensure flawless progression through meiotic recombination and packaging of the genome into mature gametes. Although epigenetic reprogramming is essential for mammalian reproduction, reprogramming also provides a permissive window for exploitation by transposable elements (TEs), autonomously replicating endogenous elements. Expression and propagation of TEs during the reprogramming period can result in insertional mutagenesis that compromises genome integrity leading to reproductive problems and sporadic inherited diseases in offspring. Recent work has identified the germ cell associated PIWI Interacting RNA (piRNA) pathway in conjunction with the DNA methylation and histone modification machinery in silencing TEs. In this review we will highlight these recent advances in piRNA mediated regulation of TEs in the mouse germline, as well as mention the repercussions of failure to properly regulate TEs.
Affiliation(s)
- Julio Castañeda
- Biology Department, Johns Hopkins University, Baltimore, MD 21218, USA
|
38
|
Zhang W, Edwards A, Fan W, Deininger P, Zhang K. Alu distribution and mutation types of cancer genes. BMC Genomics 2011; 12:157. [PMID: 21429208 PMCID: PMC3074553 DOI: 10.1186/1471-2164-12-157] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.4] [Received: 11/03/2010] [Accepted: 03/23/2011] [Indexed: 12/24/2022]
Abstract
Background: Alu elements are the most abundant retrotransposable elements, comprising ~11% of the human genome. Many studies have highlighted the role that Alu elements have in genetic instability and how their contribution to the assortment of mutagenic events can lead to cancer. As yet, little has been done to quantitatively assess the association between Alu distribution and genes that are causally implicated in oncogenesis. Results: We have investigated the effect of various Alu densities on the mutation type based classifications of cancer genes. In order to establish the direct relationship between Alus and the cancer genes of interest, genome wide Alu-related densities were measured using genes rather than the sliding windows of fixed length as the units. Several novel genomic features, such as the density of the adjacent Alu pairs and the number of Alu-Exon-Alu triplets, were developed in order to extend the investigation via the multivariate statistical analysis toward more advanced biological insight. In addition, we characterized the genome-wide intron Alu distribution with a mixture model that distinguished genes containing Alu elements from those with no Alus, and evaluated the gene-level effect of the 5'-TTAAAA motif associated with Alu insertion sites using a two-step regression analysis method. Conclusions: The study resulted in several novel findings worthy of further investigation.
They include: (1) Recessive cancer genes (tumor suppressor genes) are enriched with Alu elements (p < 0.01) compared to dominant cancer genes (oncogenes) and the entire set of genes in the human genome; (2) Alu-related genomic features can be used to cluster cancer genes into biologically meaningful groups; (3) The retention of exon Alus has been restricted in the human genome development, and an upper limit to the chromosome-level exon Alu densities is suggested by the distribution profile; (4) For the genes with at least one intron Alu repeat in individual chromosomes, the intron Alu densities can be well fitted by a Gamma distribution; (5) The effect of the 5'-TTAAAA motif on Alu densities varies across different chromosomes.
Affiliation(s)
- Wensheng Zhang
- Department of Computer Science, Xavier University of Louisiana, 1 Drexel Drive, New Orleans, LA 70125, USA
|
39
|
Ananda G, Chiaromonte F, Makova KD. A genome-wide view of mutation rate co-variation using multivariate analyses. Genome Biol 2011; 12:R27. [PMID: 21426544 PMCID: PMC3129677 DOI: 10.1186/gb-2011-12-3-r27] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Received: 12/07/2010] [Revised: 02/21/2011] [Accepted: 03/22/2011] [Indexed: 01/03/2023]
Abstract
Background: While the abundance of available sequenced genomes has led to many studies of regional heterogeneity in mutation rates, the co-variation among rates of different mutation types remains largely unexplored, hindering a deeper understanding of mutagenesis and genome dynamics. Here, utilizing primate and rodent genomic alignments, we apply two multivariate analysis techniques (principal components and canonical correlations) to investigate the structure of rate co-variation for four mutation types and simultaneously explore the associations with multiple genomic features at different genomic scales and phylogenetic distances. Results: We observe a consistent, largely linear co-variation among rates of nucleotide substitutions, small insertions and small deletions, with some non-linear associations detected among these rates on chromosome X and near autosomal telomeres. This co-variation appears to be shaped by a common set of genomic features, some previously investigated and some novel to this study (nuclear lamina binding sites, methylated non-CpG sites and nucleosome-free regions). Strong non-linear relationships are also detected among genomic features near the centromeres of large chromosomes. Microsatellite mutability co-varies with other mutation rates at finer scales, but not at 1 Mb, and shows varying degrees of association with genomic features at different scales. Conclusions: Our results allow us to speculate about the role of different molecular mechanisms, such as replication, recombination, repair and local chromatin environment, in mutagenesis. The software tools developed for our analyses are available through Galaxy, an open-source genomics portal, to facilitate the use of multivariate techniques in future large-scale genomics studies.
Affiliation(s)
- Guruprasad Ananda
- Center for Medical Genomics, Penn State University, University Park, PA 16802, USA
|