1
Loreto ELS, Melo ESD, Wallau GL, Gomes TMFF. The good, the bad and the ugly of transposable elements annotation tools. Genet Mol Biol 2024; 46:e20230138. [PMID: 38373163 PMCID: PMC10876081 DOI: 10.1590/1678-4685-gmb-2023-0138] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2023] [Accepted: 11/26/2023] [Indexed: 02/21/2024]
Abstract
Transposable elements (TEs) are repetitive, mobile DNA segments found in virtually all organisms investigated to date. Their complex structure and variable nature make them particularly challenging to annotate in genomes. Many software tools have been developed to automate and facilitate TE annotation at the genomic level, but they are highly heterogeneous in documentation, usability, and methods. In this review, we revisit the existing software for TE genomic annotation, concentrating on the most frequently used tools, the methodologies they apply, and their usability. Building on the state of the art in TE annotation software, we propose best practices and highlight the strengths and weaknesses of the available solutions.
Affiliation(s)
- Elgion L S Loreto
- Universidade Federal do Rio Grande do Sul, Programa de Pós-Graduação em Genética e Biologia Molecular, Porto Alegre, RS, Brazil
- Universidade Federal de Santa Maria, Departamento de Bioquímica e Biologia Molecular, Santa Maria, RS, Brazil
- Elverson S de Melo
- Fundação Oswaldo Cruz, Instituto Aggeu Magalhães, Departamento de Entomologia, Recife, PE, Brazil
- Gabriel L Wallau
- Fundação Oswaldo Cruz, Instituto Aggeu Magalhães, Departamento de Entomologia, Recife, PE, Brazil
- Tiago M F F Gomes
- Universidade Federal do Rio Grande do Sul, Programa de Pós-Graduação em Genética e Biologia Molecular, Porto Alegre, RS, Brazil
2
Flynn JM, Ahmed-Braimah YH, Long M, Wing RA, Clark AG. High-Quality Genome Assemblies Reveal Evolutionary Dynamics of Repetitive DNA and Structural Rearrangements in the Drosophila virilis Subgroup. Genome Biol Evol 2024; 16:evad238. [PMID: 38159044 PMCID: PMC10783647 DOI: 10.1093/gbe/evad238] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Revised: 12/18/2023] [Accepted: 12/23/2023] [Indexed: 01/03/2024]
Abstract
High-quality genome assemblies across a range of nontraditional model organisms can accelerate the discovery of novel aspects of genome evolution. The Drosophila virilis group has several attributes that distinguish it from more highly studied species in the Drosophila genus, such as an unusual abundance of repetitive elements and extensive karyotype evolution, in addition to being an attractive model for speciation genetics. Here, we used long-read sequencing to assemble five genomes of three virilis group species and characterized sequence and structural divergence and repetitive DNA evolution. We find that our contiguous genome assemblies allow characterization of chromosomal arrangements with ease and can facilitate analysis of inversion breakpoints. We also leverage a small panel of resequenced strains to explore the genomic pattern of divergence and polymorphism in this species and show that known demographic histories largely predict the extent of genome-wide segregating polymorphism. We further find that a neo-X chromosome in Drosophila americana displays X-like levels of nucleotide diversity. We also found that unusual repetitive elements were responsible for much of the divergence in genome composition among species. Helitron-derived tandem repeats tripled in abundance on the Y chromosome in D. americana compared to Drosophila novamexicana, accounting for most of the difference in repeat content between these sister species. Repeats with characteristics of both transposable elements and satellite DNAs expanded by 3-fold, mostly in euchromatin, in both D. americana and D. novamexicana compared to D. virilis. Our results represent a major advance in our understanding of genome biology in this emerging model clade.
Affiliation(s)
- Jullien M Flynn
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Whitehead Institute for Biomedical Research, Cambridge, MA, USA
- Manyuan Long
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
- Rod A Wing
- School of Plant Sciences, Arizona Genomics Institute, University of Arizona, Tucson, AZ, USA
- Andrew G Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
3
Xu MRX, Liao ZY, Brock JR, Du K, Li GY, Chen ZQ, Wang YH, Gao ZN, Agarwal G, Wei KHC, Shao F, Pang S, Platts AE, van de Velde J, Lin HM, Teresi SJ, Bird K, Niederhuth CE, Xu JG, Yu GH, Yang JY, Dai SF, Nelson A, Braasch I, Zhang XG, Schartl M, Edger PP, Han MJ, Zhang HH. Maternal dominance contributes to subgenome differentiation in allopolyploid fishes. Nat Commun 2023; 14:8357. [PMID: 38102128 PMCID: PMC10724154 DOI: 10.1038/s41467-023-43740-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2023] [Accepted: 11/17/2023] [Indexed: 12/17/2023]
Abstract
Teleost fishes, which are the largest and most diverse group of living vertebrates, have a rich history of ancient and recent polyploidy. Previous studies of allotetraploid common carp and goldfish (cyprinids) reported a dominant subgenome, which is more highly expressed and exhibits biased gene retention. However, the underlying mechanisms contributing to the observed 'subgenome dominance' remain poorly understood. Here we report high-quality genomes of twenty-one cyprinids to investigate the origin and subsequent subgenome evolution patterns following three independent allopolyploidy events. We identify the closest extant relatives of the diploid progenitor species, investigate genetic and epigenetic differences among subgenomes, and conclude that observed subgenome dominance patterns are likely due to a combination of maternal dominance and transposable element densities in each polyploid. These findings provide an important foundation for understanding subgenome dominance patterns observed in teleost fishes, and ultimately the role of polyploidy in contributing to evolutionary innovations.
Affiliation(s)
- Min-Rui-Xuan Xu
- College of Pharmacy and Life Science, Jiujiang University, Jiujiang, China
- Zhen-Yang Liao
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Jordan R Brock
- Department of Horticulture, Michigan State University, East Lansing, MI, USA
- Kang Du
- The Xiphophorus Genetic Stock Center, Texas State University, San Marcos, TX, USA
- Guo-Yin Li
- College of Life Science and Agronomy, Zhoukou Normal University, Zhoukou, Henan, China
- Ying-Hao Wang
- Shenzhen Branch, Guangdong Laboratory for Lingnan Modern Agriculture, Genome Analysis Laboratory of the Ministry of Agriculture, Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China
- Zhong-Nan Gao
- College of Pharmacy and Life Science, Jiujiang University, Jiujiang, China
- Gaurav Agarwal
- Department of Plant Biology, Michigan State University, East Lansing, MI, USA
- Kevin H-C Wei
- Department of Integrative Biology, University of California Berkeley, Berkeley, CA, USA
- Department of Zoology, University of British Columbia, Vancouver, British Columbia, Canada
- Feng Shao
- Key Laboratory of Freshwater Fish Reproduction and Development (Ministry of Education), Southwest University, School of Life Sciences, Chongqing, China
- Adrian E Platts
- Department of Horticulture, Michigan State University, East Lansing, MI, USA
- Jozefien van de Velde
- Department of Chromosome Biology, Max Planck Institute for Plant Breeding Research, Cologne, Germany
- Hong-Min Lin
- College of Pharmacy and Life Science, Jiujiang University, Jiujiang, China
- Scott J Teresi
- Department of Horticulture, Michigan State University, East Lansing, MI, USA
- Kevin Bird
- Department of Horticulture, Michigan State University, East Lansing, MI, USA
- Chad E Niederhuth
- Department of Plant Biology, Michigan State University, East Lansing, MI, USA
- Jin-Gen Xu
- Jiujiang Academy of Agricultural Sciences, Jiujiang, China
- Guo-Hua Yu
- College of Pharmacy and Life Science, Jiujiang University, Jiujiang, China
- Jian-Yuan Yang
- College of Pharmacy and Life Science, Jiujiang University, Jiujiang, China
- Si-Fa Dai
- College of Pharmacy and Life Science, Jiujiang University, Jiujiang, China
- Ingo Braasch
- Department of Integrative Biology, Michigan State University, East Lansing, MI, USA
- Xiao-Gu Zhang
- College of Pharmacy and Life Science, Jiujiang University, Jiujiang, China.
- Manfred Schartl
- The Xiphophorus Genetic Stock Center, Texas State University, San Marcos, TX, USA.
- Developmental Biochemistry, Biocenter, University of Würzburg, Würzburg, Bayern, Germany.
- Patrick P Edger
- Department of Horticulture, Michigan State University, East Lansing, MI, USA.
- Min-Jin Han
- State Key Laboratory of Resource Insects, Key Laboratory for Sericulture Functional Genomics and Biotechnology of Agricultural Ministry, Southwest University, Chongqing, China.
- Hua-Hao Zhang
- College of Pharmacy and Life Science, Jiujiang University, Jiujiang, China.
4
Mokhtar MM, El Allali A. MegaLTR: a web server and standalone pipeline for detecting and annotating LTR-retrotransposons in plant genomes. Front Plant Sci 2023; 14:1237426. [PMID: 37810401 PMCID: PMC10552921 DOI: 10.3389/fpls.2023.1237426] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2023] [Accepted: 08/21/2023] [Indexed: 10/10/2023]
Abstract
LTR-retrotransposons (LTR-RTs) are a class of RNA-replicating transposable elements (TEs) that can alter genome structure and function by moving positions, repositioning genes, shifting exons, and causing chromosomal rearrangements. LTR-RTs are widespread in many plant genomes and constitute a significant portion of the genome. Their movement and activity in eukaryotic genomes can provide insight into genome evolution and gene function, especially when LTR-RTs are located near or within genes. Building the redundant and non-redundant LTR-RT libraries and their annotations for species lacking this resource requires extensive bioinformatics pipelines and expensive computing power to analyze large amounts of genomic data. This increases the need for online services that provide computational resources with minimal overhead and maximum efficiency. Here, we present MegaLTR as a web server and standalone pipeline that detects intact LTR-RTs at the whole-genome level and integrates multiple tools for structure-based, homology-based, and de novo identification, classification, annotation, insertion time determination, and LTR-RT gene chimera analysis. MegaLTR also provides statistical analysis and visualization with multiple tools and can be used to accelerate plant species discovery and assist breeding programs in their efforts to improve genomic resources. We hope that the development of online services such as MegaLTR, which can analyze large amounts of genomic data, will become increasingly important for the automated detection and annotation of LTR-RT elements.
Affiliation(s)
- Morad M. Mokhtar
- African Genome Center, Mohammed VI Polytechnic University, Benguerir, Morocco
- Achraf El Allali
- African Genome Center, Mohammed VI Polytechnic University, Benguerir, Morocco
5
Liao X, Zhu W, Zhou J, Li H, Xu X, Zhang B, Gao X. Repetitive DNA sequence detection and its role in the human genome. Commun Biol 2023; 6:954. [PMID: 37726397 PMCID: PMC10509279 DOI: 10.1038/s42003-023-05322-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 03/29/2023] [Accepted: 09/04/2023] [Indexed: 09/21/2023]
Abstract
Repetitive DNA sequences play critical roles in driving evolution, inducing variation, and regulating gene expression. In this review, we summarize the definition, arrangement, and structural characteristics of repeats. We also introduce the diverse biological functions of repeats and review existing methods for automatic repeat detection, classification, and masking. Finally, we analyze the type, structure, and regulation of repeats in the human genome and their role in the induction of complex diseases. We believe that this review will facilitate a comprehensive understanding of repeats and provide guidance for repeat annotation and in-depth exploration of their association with human diseases.
Affiliation(s)
- Xingyu Liao
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Wufei Zhu
- Department of Endocrinology, Yichang Central People's Hospital, The First College of Clinical Medical Science, China Three Gorges University, 443000, Yichang, P.R. China
- Juexiao Zhou
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Haoyang Li
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Xiaopeng Xu
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Bin Zhang
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
- Xin Gao
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia.
6
Flynn JM, Ahmed-Braimah YH, Long M, Wing RA, Clark AG. High quality genome assemblies reveal evolutionary dynamics of repetitive DNA and structural rearrangements in the Drosophila virilis sub-group. bioRxiv 2023:2023.08.13.553086. [PMID: 37645834 PMCID: PMC10462019 DOI: 10.1101/2023.08.13.553086] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
High-quality genome assemblies across a range of non-traditional model organisms can accelerate the discovery of novel aspects of genome evolution. The Drosophila virilis group has several attributes that distinguish it from more highly studied species in the Drosophila genus, such as an unusual abundance of repetitive elements and extensive karyotype evolution, in addition to being an attractive model for speciation genetics. Here we used long-read sequencing to assemble five genomes of three virilis group species and characterized sequence and structural divergence and repetitive DNA evolution. We find that our contiguous genome assemblies allow characterization of chromosomal arrangements with ease and can facilitate analysis of inversion breakpoints. We also leverage a small panel of resequenced strains to explore the genomic pattern of divergence and polymorphism in this species and show that known demographic histories largely predict the extent of genome-wide segregating polymorphism. We further find that a neo-X chromosome in D. americana displays X-like levels of nucleotide diversity. We also found that unusual repetitive elements were responsible for much of the divergence in genome composition among species. Helitron-derived tandem repeats tripled in abundance on the Y chromosome in D. americana compared to D. novamexicana, accounting for most of the difference in repeat content between these sister species. Repeats with characteristics of both transposable elements and satellite DNAs expanded by three-fold, mostly in euchromatin, in both D. americana and D. novamexicana compared to D. virilis. Our results represent a major advance in our understanding of genome biology in this emerging model clade.
Affiliation(s)
- Jullien M. Flynn
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Whitehead Institute for Biomedical Research, Cambridge, MA, USA
- Manyuan Long
- Department of Ecology and Evolution, University of Chicago, Chicago, IL, USA
- Rod A. Wing
- School of Plant Sciences, Arizona Genomics Institute, University of Arizona, Tucson, AZ
- Andrew G. Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
7
Identification of Daboia siamensis venome using integrated multi-omics data. Sci Rep 2022; 12:13140. [PMID: 35907887 PMCID: PMC9338987 DOI: 10.1038/s41598-022-17300-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/10/2022] [Accepted: 07/22/2022] [Indexed: 11/08/2022]
Abstract
Snakebite, classified by the World Health Organization as a neglected tropical disease, causes more than 100,000 deaths and 2 million injuries per year. Currently available antivenoms do not bind with strong specificity to target toxins, which means that severe complications can still occur despite treatment. Moreover, antivenom is expensive. Knowledge of venom composition is fundamental for producing a specific antivenom that has high effectiveness, few side effects, and ease of manufacture. With advances in mass spectrometry techniques, venom proteomes can now be analyzed in great depth at high efficiency. However, these techniques require genomic and transcriptomic data for interpreting mass spectrometry data. This study aims to establish and incorporate genomics, transcriptomics, and proteomics data to study the venomics of a venomous snake, Daboia siamensis. Multiple proteins that have not been reported as venom components of this snake, such as hyaluronidase-1, phospholipase B, and waprin, were discovered. Thus, multi-omics data are advantageous for venomics studies. These findings will be valuable not only for antivenom production but also for the development of novel therapeutics.
8
Riehl K, Riccio C, Miska EA, Hemberg M. TransposonUltimate: software for transposon classification, annotation and detection. Nucleic Acids Res 2022; 50:e64. [PMID: 35234904 PMCID: PMC9226531 DOI: 10.1093/nar/gkac136] [Citation(s) in RCA: 30] [Impact Index Per Article: 15.0] [Received: 06/09/2021] [Revised: 02/09/2022] [Accepted: 02/14/2022] [Indexed: 12/17/2022]
Abstract
Most genomes harbor a large number of transposons, and they play an important role in evolution and gene regulation. They are also of interest to clinicians as they are involved in several diseases, including cancer and neurodegeneration. Although several methods for transposon identification are available, they are often highly specialised towards specific tasks or classes of transposons, and they lack common standards such as a unified taxonomy scheme and output file format. We present TransposonUltimate, a powerful bundle of three modules for transposon classification, annotation, and detection of transposition events. TransposonUltimate comes as a Conda package under the GPL-3.0 licence, is well documented, and is easy to install through https://github.com/DerKevinRiehl/TransposonUltimate. We benchmark the classification module on the large TransposonDB covering 891,051 sequences to demonstrate that it outperforms the currently best existing solutions. The annotation and detection modules combine sixteen existing software tools, and we illustrate their use by annotating Caenorhabditis elegans, Rhizophagus irregularis and Oryza sativa subsp. japonica genomes. Finally, we use the detection module to discover 29 554 transposition events in the genomes of 20 wild type strains of C. elegans. Databases, assemblies, annotations and further findings can be downloaded from https://doi.org/10.5281/zenodo.5518085.
Affiliation(s)
- Kevin Riehl
- Gurdon Institute, University of Cambridge, Cambridge CB2 1QN, UK
- Cristian Riccio
- Gurdon Institute, University of Cambridge, Cambridge CB2 1QN, UK
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Eric A Miska
- Gurdon Institute, University of Cambridge, Cambridge CB2 1QN, UK
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Department of Genetics, University of Cambridge, Downing Street, Cambridge CB2 3EH, UK
- Martin Hemberg
- Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton CB10 1SA, UK
- Evergrande Center for Immunologic Diseases, Harvard Medical School and Brigham and Women’s Hospital, 75 Francis Street, Boston, MA 02215, USA
9
Zhang RG, Li GY, Wang XL, Dainat J, Wang ZX, Ou S, Ma Y. TEsorter: an accurate and fast method to classify LTR-retrotransposons in plant genomes. Hortic Res 2022; 9:uhac017. [PMID: 35184178 PMCID: PMC9002660 DOI: 10.1093/hr/uhac017] [Citation(s) in RCA: 57] [Impact Index Per Article: 28.5] [Received: 10/07/2021] [Revised: 10/17/2021] [Accepted: 12/23/2021] [Indexed: 05/04/2023]
Affiliation(s)
- Ren-Gang Zhang
- Yunnan Key Laboratory for Integrative Conservation of Plant Species with Extremely Small Populations, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
- Department of Bioinformatics, Ori (Shandong) Gene Science and Technology Co., Ltd., Weifang, Shandong 261322, China
- Guang-Yuan Li
- Department of Bioinformatics, Ori (Shandong) Gene Science and Technology Co., Ltd., Weifang, Shandong 261322, China
- Jacques Dainat
- Department of Medical Biochemistry and Microbiology, National Bioinformatics Infrastructure Sweden, Science for Life Laboratory, Uppsala University, Uppsala, Sweden
- Zhao-Xuan Wang
- Shijiazhuang People’s Medical College, Shijiazhuang, Hebei 050091, China
- Shujun Ou
- Department of Ecology, Evolution, and Organismal Biology (EEOB), Iowa State University, Ames, IA 50010, USA
- Yongpeng Ma
- Yunnan Key Laboratory for Integrative Conservation of Plant Species with Extremely Small Populations, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming 650201, China
10
Oliveira LS, Patera AC, Domingues DS, Sanches DS, Lopes FM, Bugatti PH, Saito PTM, Maracaja-Coutinho V, Durham AM, Paschoal AR. Computational Analysis of Transposable Elements and CircRNAs in Plants. Methods Mol Biol 2022; 2362:147-172. [PMID: 34195962 DOI: 10.1007/978-1-0716-1645-1_9] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 12/13/2022]
Abstract
This chapter provides two main contributions: (1) a description of computational tools and databases used to identify and analyze transposable elements (TEs) and circRNAs in plants; and (2) data analysis on public TE and circRNA data. Our goal is to highlight the primary information available in the literature on circular noncoding RNAs and transposable elements in plants. The exploratory analysis performed on publicly available circRNA and TE data helps to discuss four sequence features. Finally, we investigate the association between circRNAs and TEs in the model plant Arabidopsis thaliana.
Affiliation(s)
- Liliane Santana Oliveira
- Department of Computer Science, Federal University of Technology-Paraná (UTFPR), Cornélio Procópio, PR, Brazil; Embrapa Soja, Londrina, Paraná, Brazil
- Andressa Caroline Patera
- Department of Computer Science, Federal University of Technology-Paraná (UTFPR), Cornélio Procópio, PR, Brazil
- Douglas Silva Domingues
- Department of Computer Science, Federal University of Technology-Paraná (UTFPR), Cornélio Procópio, PR, Brazil; Group of Genomics and Transcriptomes in Plants, Instituto de Biociências de Rio Claro, Universidade Estadual Paulista (UNESP), Rio Claro, SP, Brazil
- Danilo Sipoli Sanches
- Department of Computer Science, Federal University of Technology-Paraná (UTFPR), Cornélio Procópio, PR, Brazil
- Fabricio Martins Lopes
- Department of Computer Science, Federal University of Technology-Paraná (UTFPR), Cornélio Procópio, PR, Brazil
- Pedro Henrique Bugatti
- Department of Computer Science, Federal University of Technology-Paraná (UTFPR), Cornélio Procópio, PR, Brazil
- Priscila Tiemi Maeda Saito
- Department of Computer Science, Federal University of Technology-Paraná (UTFPR), Cornélio Procópio, PR, Brazil
- Vinicius Maracaja-Coutinho
- Centro de Modelamiento Molecular, Biofísica y Bioinformática-CM2B2, Facultad de Ciencias Quimicas y Farmaceuticas, Universidad de Chile, Santiago, Chile
- Alan Mitchell Durham
- Department of Computer Science, Instituto de Matemática e Estatística, Universidade de São Paulo (USP), Cidade Universitária, SP, Brazil
- Alexandre Rossi Paschoal
- Department of Computer Science, Federal University of Technology-Paraná (UTFPR), Cornélio Procópio, PR, Brazil.
11
Zeng C, Takeda A, Sekine K, Osato N, Fukunaga T, Hamada M. Bioinformatics Approaches for Determining the Functional Impact of Repetitive Elements on Non-coding RNAs. Methods Mol Biol 2022; 2509:315-340. [PMID: 35796972 DOI: 10.1007/978-1-0716-2380-0_19] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
With a large number of annotated non-coding RNAs (ncRNAs), repetitive sequences are found to constitute functional components (termed as repetitive elements) in ncRNAs that perform specific biological functions. Bioinformatics analysis is a powerful tool for improving our understanding of the role of repetitive elements in ncRNAs. This chapter summarizes recent findings that reveal the role of repetitive elements in ncRNAs. Furthermore, relevant bioinformatics approaches are systematically reviewed, which promises to provide valuable resources for studying the functional impact of repetitive elements on ncRNAs.
Affiliation(s)
- Chao Zeng
- Faculty of Science and Engineering, Waseda University, Tokyo, Japan.
- AIST-Waseda University Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), Tokyo, Japan.
- Atsushi Takeda
- Faculty of Science and Engineering, Waseda University, Tokyo, Japan
- Kotaro Sekine
- Faculty of Science and Engineering, Waseda University, Tokyo, Japan
- Naoki Osato
- Faculty of Science and Engineering, Waseda University, Tokyo, Japan
- Tsukasa Fukunaga
- Waseda Institute for Advanced Study, Waseda University, Tokyo, Japan
- Michiaki Hamada
- Faculty of Science and Engineering, Waseda University, Tokyo, Japan.
- AIST-Waseda University Computational Bio Big-Data Open Innovation Laboratory (CBBD-OIL), Tokyo, Japan.
12
Ma T, Wei X, Zhang Y, Li J, Wu F, Yan Q, Yan Z, Zhang Z, Kanzana G, Zhao Y, Yang Y, Zhang J. Development of molecular markers based on LTR retrotransposon in the Cleistogenes songorica genome. J Appl Genet 2021; 63:61-72. [PMID: 34554437 DOI: 10.1007/s13353-021-00658-9] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 10/25/2020] [Revised: 08/09/2021] [Accepted: 08/23/2021] [Indexed: 11/26/2022]
Abstract
Long terminal repeat retrotransposons (LTR-RTs) contribute a large fraction of many sequenced plant genomes and play important roles in genomic diversity and phenotypic variations. LTR-RTs are abundantly distributed in plant genomes, facilitating the development of markers based on LTR-RTs for a variety of genotyping purposes. Whole-genome analysis of LTR-RTs was performed in Cleistogenes songorica. A total of 299,079 LTR-RTs were identified and classified as Gypsy type, Copia type, or other type. LTR-RTs were widely distributed in the genome, enriched in the heterochromatic region of the chromosome, and negatively correlated with gene distribution. However, approximately one-fifth of genes were still interrupted by LTR-RTs, and these genes are annotated. Furthermore, four types of primer pairs (PPs) were designed, namely, retrotransposon-based insertion polymorphisms, inter-retrotransposon amplified polymorphisms, insertion site-based polymorphisms, and retrotransposon-microsatellite amplified polymorphisms. A total of 350 PPs were screened in 23 accessions of the genus Cleistogenes, of which 80 PPs showed polymorphism, and 72 PPs showed transferability among Gramineae and non-Gramineae species. In addition, a comparative analysis of homologous LTR-RTs was performed with other related grasses. Taken together, the study will serve as a valuable resource for genotyping applications for C. songorica and related grasses.
Affiliation(s)
- Tiantian Ma
- State Key Laboratory of Grassland Agro-ecosystems, Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, Engineering Research Center of Grassland Industry, Ministry of Education, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730020, China
- Xingyi Wei
- State Key Laboratory of Grassland Agro-ecosystems, Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, Engineering Research Center of Grassland Industry, Ministry of Education, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730020, China
- Yufei Zhang
- State Key Laboratory of Grassland Agro-ecosystems, Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, Engineering Research Center of Grassland Industry, Ministry of Education, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730020, China
- Jie Li
- State Key Laboratory of Grassland Agro-ecosystems, Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, Engineering Research Center of Grassland Industry, Ministry of Education, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730020, China
| | - Fan Wu
- State Key Laboratory of Grassland Agro-ecosystems, Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, Engineering Research Center of Grassland Industry, Ministry of Education, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730020, China
| | - Qi Yan
- State Key Laboratory of Grassland Agro-ecosystems, Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, Engineering Research Center of Grassland Industry, Ministry of Education, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730020, China
| | - Zhuanzhuan Yan
- State Key Laboratory of Grassland Agro-ecosystems, Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, Engineering Research Center of Grassland Industry, Ministry of Education, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730020, China
| | - Zhengshe Zhang
- State Key Laboratory of Grassland Agro-ecosystems, Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, Engineering Research Center of Grassland Industry, Ministry of Education, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730020, China
| | - Gisele Kanzana
- State Key Laboratory of Grassland Agro-ecosystems, Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, Engineering Research Center of Grassland Industry, Ministry of Education, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730020, China
| | - Yufeng Zhao
- State Key Laboratory of Grassland Agro-ecosystems, Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, Engineering Research Center of Grassland Industry, Ministry of Education, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730020, China
| | - Yingbo Yang
- State Key Laboratory of Grassland Agro-ecosystems, Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, Engineering Research Center of Grassland Industry, Ministry of Education, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730020, China
| | - Jiyu Zhang
- State Key Laboratory of Grassland Agro-ecosystems, Key Laboratory of Grassland Livestock Industry Innovation, Ministry of Agriculture and Rural Affairs, Engineering Research Center of Grassland Industry, Ministry of Education, College of Pastoral Agriculture Science and Technology, Lanzhou University, Lanzhou, 730020, China.
| |
|
13
|
Panta M, Mishra A, Hoque MT, Atallah J. ClassifyTE: A stacking based prediction of hierarchical classification of transposable elements. Bioinformatics 2021; 37:2529-2536. [PMID: 33682878 DOI: 10.1093/bioinformatics/btab146] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 05/26/2020] [Revised: 02/10/2021] [Accepted: 03/01/2021] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Transposable Elements (TEs) or jumping genes are DNA sequences that have an intrinsic capability to move within a host genome from one genomic location to another. Studies show that the presence of a TE within or adjacent to a functional gene may alter its expression. TEs can also cause an increase in the rate of mutation and can even mediate duplications and large insertions and deletions in the genome, promoting gross genetic rearrangements. The proper classification of identified jumping genes is important for analyzing their genetic and evolutionary effects. An effective classifier, which can explain the role of TEs in germline and somatic evolution more accurately, is needed. In this study, we examine the performance of a variety of machine learning (ML) techniques and propose a robust method, ClassifyTE, for the hierarchical classification of TEs with high accuracy, using a stacking-based ML method. RESULTS: We propose a stacking-based approach for the hierarchical classification of TEs. When trained on three different benchmark datasets, our proposed system achieved 4%, 10.68%, and 10.13% average percentage improvement (using the hF measure) compared to several state-of-the-art methods. We developed an end-to-end automated hierarchical classification tool based on the proposed approach, ClassifyTE, to classify TEs up to the super-family level. We further evaluated our method on a new TE library generated by a homology-based classification method and found relatively high concordance at higher taxonomic levels. Thus, ClassifyTE paves the way for a more accurate analysis of the role of TEs. AVAILABILITY: The source code and data are available at https://github.com/manisa/ClassifyTE. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
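The abstract above reports improvements in the hF measure without defining it. The sketch below assumes the hierarchical F-measure commonly used in hierarchical classification, where each true and predicted label is extended with its ancestors before precision and recall are computed. The abstract does not confirm that ClassifyTE uses exactly this form. The `PARENT` hierarchy and both function names are hypothetical and serve only as illustration.

```python
# Hypothetical toy TE hierarchy (child -> parent); None marks a top-level class.
# This is NOT the hierarchy used by ClassifyTE, only an illustration.
PARENT = {
    "ClassI": None,
    "ClassII": None,
    "LTR": "ClassI",
    "LINE": "ClassI",
    "Gypsy": "LTR",
    "Copia": "LTR",
    "TIR": "ClassII",
}


def with_ancestors(label):
    """Return the label together with all of its ancestors in PARENT."""
    nodes = set()
    while label is not None:
        nodes.add(label)
        label = PARENT[label]
    return nodes


def hierarchical_f(true_labels, predicted_labels):
    """Return (hP, hR, hF) over paired true and predicted labels.

    Assumed definition: hP = sum|T & P| / sum|P|, hR = sum|T & P| / sum|T|,
    where T and P are the ancestor-extended true and predicted label sets.
    """
    overlap = pred_total = true_total = 0
    for t, p in zip(true_labels, predicted_labels):
        t_set, p_set = with_ancestors(t), with_ancestors(p)
        overlap += len(t_set & p_set)
        pred_total += len(p_set)
        true_total += len(t_set)
    if overlap == 0:
        return 0.0, 0.0, 0.0
    hp = overlap / pred_total
    hr = overlap / true_total
    return hp, hr, 2 * hp * hr / (hp + hr)
```

Under this definition a Copia element predicted as Gypsy still earns partial credit, because both share the LTR and ClassI ancestors, whereas a TIR element predicted as LINE earns none.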
Affiliation(s)
- Manisha Panta
- Department of Computer Science, University of New Orleans, New Orleans, LA 70148, USA
| | - Avdesh Mishra
- Department of Electrical Engineering and Computer Science, Texas A&M University-Kingsville, Kingsville, TX, 78363, USA
| | - Md Tamjidul Hoque
- Department of Computer Science, University of New Orleans, New Orleans, LA 70148, USA
| | - Joel Atallah
- Department of Biological Sciences, University of New Orleans, New Orleans, LA 70148, USA
| |
|
14
|
Storer J, Hubley R, Rosen J, Wheeler TJ, Smit AF. The Dfam community resource of transposable element families, sequence models, and genome annotations. Mob DNA 2021; 12:2. [PMID: 33436076 PMCID: PMC7805219 DOI: 10.1186/s13100-020-00230-y] [Citation(s) in RCA: 243] [Impact Index Per Article: 81.0] [Received: 09/10/2020] [Accepted: 12/28/2020] [Indexed: 02/02/2023] Open
Abstract
Dfam is an open access database of repetitive DNA families, sequence models, and genome annotations. The 3.0-3.3 releases of Dfam ( https://dfam.org ) represent an evolution from a proof-of-principle collection of transposable element families in model organisms into a community resource for a broad range of species, and for both curated and uncurated datasets. In addition, releases since Dfam 3.0 provide auxiliary consensus sequence models, transposable element protein alignments, and a formalized classification system to support the growing diversity of organisms represented in the resource. The latest release includes 266,740 new de novo generated transposable element families from 336 species contributed by the EBI. This expansion demonstrates the utility of many of Dfam's new features and provides insight into the long term challenges ahead for improving de novo generated transposable element datasets.
Affiliation(s)
| | - Robert Hubley
- Institute for Systems Biology, Seattle, WA, 98109, USA.
| | - Jeb Rosen
- Institute for Systems Biology, Seattle, WA, 98109, USA
| | | | - Arian F Smit
- Institute for Systems Biology, Seattle, WA, 98109, USA.
| |
|
15
|
Greenhalgh R, Dermauw W, Glas JJ, Rombauts S, Wybouw N, Thomas J, Alba JM, Pritham EJ, Legarrea S, Feyereisen R, Van de Peer Y, Van Leeuwen T, Clark RM, Kant MR. Genome streamlining in a minute herbivore that manipulates its host plant. eLife 2020; 9:56689. [PMID: 33095158 PMCID: PMC7738191 DOI: 10.7554/elife.56689] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 06/04/2020] [Accepted: 10/22/2020] [Indexed: 12/12/2022] Open
Abstract
The tomato russet mite, Aculops lycopersici, is among the smallest animals on earth. It is a worldwide pest on tomato and can potently suppress the host's natural resistance. We sequenced its genome, the first of an eriophyoid, and explored whether there are genomic features associated with the mite's minute size and lifestyle. At only 32.5 Mb, the genome is the smallest yet reported for any arthropod and, reminiscent of microbial eukaryotes, exceptionally streamlined. It has few transposable elements, tiny intergenic regions, and is remarkably intron-poor, as more than 80% of coding genes are intronless. Furthermore, in accordance with ecological specialization theory, this defense-suppressing herbivore has extremely reduced environmental response gene families such as those involved in chemoreception and detoxification. Other losses associate with this species' highly derived body plan. Our findings accelerate the understanding of evolutionary forces underpinning metazoan life at the limits of small physical and genome size.
Affiliation(s)
- Robert Greenhalgh
- School of Biological Sciences, University of Utah, Salt Lake City, United States
| | - Wannes Dermauw
- Laboratory of Agrozoology, Department of Plants and Crops, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium
| | - Joris J Glas
- Department of Evolutionary and Population Biology, Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, Netherlands
| | - Stephane Rombauts
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium; Center for Plant Systems Biology, VIB, Ghent, Belgium
| | - Nicky Wybouw
- Laboratory of Agrozoology, Department of Plants and Crops, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium
| | - Jainy Thomas
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, United States
| | - Juan M Alba
- Department of Evolutionary and Population Biology, Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, Netherlands
| | - Ellen J Pritham
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, United States
| | - Saioa Legarrea
- Department of Evolutionary and Population Biology, Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, Netherlands
| | - René Feyereisen
- Laboratory of Agrozoology, Department of Plants and Crops, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium; Department of Plant and Environmental Sciences, University of Copenhagen, Copenhagen, Denmark
| | - Yves Van de Peer
- Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium; Center for Plant Systems Biology, VIB, Ghent, Belgium; Centre for Microbial Ecology and Genomics, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, Pretoria, South Africa
| | - Thomas Van Leeuwen
- Laboratory of Agrozoology, Department of Plants and Crops, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium
| | - Richard M Clark
- School of Biological Sciences, University of Utah, Salt Lake City, United States; Henry Eyring Center for Cell and Genome Science, University of Utah, Salt Lake City, United States
| | - Merijn R Kant
- Department of Evolutionary and Population Biology, Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Amsterdam, Netherlands
| |
|
16
|
da Cruz MHP, Domingues DS, Saito PTM, Paschoal AR, Bugatti PH. TERL: classification of transposable elements by convolutional neural networks. Brief Bioinform 2020; 22:5900933. [PMID: 34020551 DOI: 10.1093/bib/bbaa185] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 05/19/2020] [Revised: 07/07/2020] [Accepted: 07/20/2020] [Indexed: 11/12/2022] Open
Abstract
Transposable elements (TEs) are the most represented sequences occurring in eukaryotic genomes. Few methods provide the classification of these sequences into deeper levels, such as superfamily level, which could provide useful and detailed information about these sequences. Most methods that classify TE sequences use handcrafted features such as k-mers and homology-based search, which could be inefficient for classifying non-homologous sequences. Here we propose an approach, called transposable elements representation learner (TERL), that preprocesses and transforms one-dimensional sequences into two-dimensional space data (i.e., image-like data of the sequences) and feeds them to deep convolutional neural networks. This classification method tries to learn the best representation of the input data to classify it correctly. We have conducted six experiments to test the performance of TERL against other methods. Our approach obtained macro mean accuracies and F1-score of 96.4% and 85.8% for superfamilies and 95.7% and 91.5% for the order sequences from RepBase, respectively. We have also obtained macro mean accuracies and F1-score of 95.0% and 70.6% for sequences from seven databases into superfamily level and 89.3% and 73.9% for the order level, respectively. We surpassed the accuracy, recall and specificity obtained by other methods on the experiment with the classification of order level sequences from seven databases, and ran far faster than any other method in all experiments. Therefore, TERL can learn how to predict any hierarchical level of the TEs classification system and is about 20 times and three orders of magnitude faster than TEclass and PASTEC, respectively. Availability: https://github.com/muriloHoracio/TERL. Contact: murilocruz@alunos.utfpr.edu.br.
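The abstract above describes transforming one-dimensional sequences into two-dimensional, image-like data but does not spell out the encoding. A minimal sketch, assuming a plain one-hot encoding with one row per nucleotide and a fixed number of columns, is given below. The function name `one_hot_2d`, the row order in `ALPHABET`, and the zero-padding rule are illustrative assumptions, not details confirmed by the abstract.

```python
ALPHABET = "ACGT"  # assumed row order; ambiguous bases such as N map to no row


def one_hot_2d(sequence, width):
    """Encode a DNA string as a len(ALPHABET) x width matrix of 0/1 integers.

    Sequences longer than `width` are truncated; shorter ones are padded
    with all-zero columns so every encoded sequence has the same shape.
    """
    seq = sequence.upper()[:width]
    matrix = [[0] * width for _ in ALPHABET]
    for col, base in enumerate(seq):
        row = ALPHABET.find(base)
        if row >= 0:  # unknown symbols leave the column all zeros
            matrix[row][col] = 1
    return matrix


if __name__ == "__main__":
    for row in one_hot_2d("ACGTN", 6):
        print(row)
```

A fixed width matters because convolutional networks generally expect inputs of uniform shape; how a real pipeline chooses that width is a separate design decision not covered here.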
Affiliation(s)
- Murilo Horacio Pereira da Cruz
- Federal University of Technology - Parana (UTFPR), Brazil; Bioinformatics Graduation Program (PPGBIOINFO), Department of Computer Science, Federal University of Technology - Parana (UTFPR), Brazil
| | - Douglas Silva Domingues
- São Paulo State University at Botucatu, Brazil; University of São Paulo, Brazil; Department of Biodiversity, São Paulo State University at Rio Claro, Brazil
| | - Priscila Tiemi Maeda Saito
- Euripides Soares da Rocha University of Marilia, Brazil; University of São Paulo (ICMC-USP), Brazil; University of Campinas (IC-UNICAMP), Brazil; Department of Computing, Federal University of Technology - Parana (UTFPR), Brazil
| | | | - Pedro Henrique Bugatti
- Euripides Soares da Rocha University of Marilia, Brazil; University of São Paulo (ICMC-USP), Brazil; Department of Computing, Federal University of Technology - Parana (UTFPR), Brazil
| |
|
17
|
Flynn JM, Hubley R, Goubert C, Rosen J, Clark AG, Feschotte C, Smit AF. RepeatModeler2 for automated genomic discovery of transposable element families. Proc Natl Acad Sci U S A 2020; 117:9451-9457. [PMID: 32300014 PMCID: PMC7196820 DOI: 10.1073/pnas.1921046117] [Citation(s) in RCA: 1265] [Impact Index Per Article: 316.3] [Indexed: 12/13/2022] Open
Abstract
The accelerating pace of genome sequencing throughout the tree of life is driving the need for improved unsupervised annotation of genome components such as transposable elements (TEs). Because the types and sequences of TEs are highly variable across species, automated TE discovery and annotation are challenging and time-consuming tasks. A critical first step is the de novo identification and accurate compilation of sequence models representing all of the unique TE families dispersed in the genome. Here we introduce RepeatModeler2, a pipeline that greatly facilitates this process. This program brings substantial improvements over the original version of RepeatModeler, one of the most widely used tools for TE discovery. In particular, this version incorporates a module for structural discovery of complete long terminal repeat (LTR) retroelements, which are widespread in eukaryotic genomes but recalcitrant to automated identification because of their size and sequence complexity. We benchmarked RepeatModeler2 on three model species with diverse TE landscapes and high-quality, manually curated TE libraries: Drosophila melanogaster (fruit fly), Danio rerio (zebrafish), and Oryza sativa (rice). In these three species, RepeatModeler2 identified approximately 3 times more consensus sequences matching with >95% sequence identity and sequence coverage to the manually curated sequences than the original RepeatModeler. As expected, the greatest improvement is for LTR retroelements. Thus, RepeatModeler2 represents a valuable addition to the genome annotation toolkit that will enhance the identification and study of TEs in eukaryotic genome sequences. RepeatModeler2 is available as source code or a containerized package under an open license (https://github.com/Dfam-consortium/RepeatModeler, http://www.repeatmasker.org/RepeatModeler/).
Affiliation(s)
- Jullien M Flynn
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853
| | | | - Clément Goubert
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853
| | - Jeb Rosen
- Institute for Systems Biology, Seattle, WA 98109
| | - Andrew G Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853;
| | - Cédric Feschotte
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY 14853;
| | - Arian F Smit
- Institute for Systems Biology, Seattle, WA 98109
| |
|
18
|
Touati R, Messaoudi I, Oueslati AE, Lachiri Z, Kharrat M. Classification of intra-genomic helitrons based on features extracted from different orders of FCGS. Informatics in Medicine Unlocked 2020. [DOI: 10.1016/j.imu.2019.100271] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/16/2022] Open
|
19
|
Orozco-Arias S, Isaza G, Guyot R, Tabares-Soto R. A systematic review of the application of machine learning in the detection and classification of transposable elements. PeerJ 2019; 7:e8311. [PMID: 31976169 PMCID: PMC6967008 DOI: 10.7717/peerj.8311] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 08/05/2019] [Accepted: 11/28/2019] [Indexed: 12/16/2022] Open
Abstract
Background: Transposable elements (TEs) constitute the most common repeated sequences in eukaryotic genomes. Recent studies demonstrated their deep impact on species diversity, adaptation to the environment and diseases. Although there are many conventional bioinformatics algorithms for detecting and classifying TEs, none have achieved reliable results on different types of TEs. Machine learning (ML) techniques can automatically extract hidden patterns and novel information from labeled or non-labeled data and have been applied to solving several scientific problems. Methodology: We followed the Systematic Literature Review (SLR) process, applying the six stages of the review protocol from it, but added a previous stage, which aims to detect the need for a review. Then search equations were formulated and executed in several literature databases. Relevant publications were scanned and used to extract evidence to answer research questions. Results: Several ML approaches have already been tested on other bioinformatics problems with promising results, yet there are few algorithms and architectures available in literature focused specifically on TEs, despite representing the majority of the nuclear DNA of many organisms. Only 35 articles were found and categorized as relevant in TE or related fields. Conclusions: ML is a powerful tool that can be used to address many problems. Although ML techniques have been used widely in other biological tasks, their utilization in TE analyses is still limited. Following the SLR, it was possible to notice that the use of ML for TE analyses (detection and classification) is an open problem, and this new field of research is growing in interest.
Affiliation(s)
- Simon Orozco-Arias
- Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia; Department of Systems and Informatics, Universidad de Caldas, Manizales, Caldas, Colombia
| | - Gustavo Isaza
- Department of Systems and Informatics, Universidad de Caldas, Manizales, Caldas, Colombia
| | - Romain Guyot
- Institut de Recherche pour le Développement, CIRAD, University of Montpellier, Montpellier, France; Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
| | - Reinel Tabares-Soto
- Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Caldas, Colombia
| |
|
20
|
Wu S, Wang X, Reddy U, Sun H, Bao K, Gao L, Mao L, Patel T, Ortiz C, Abburi VL, Nimmakayala P, Branham S, Wechter P, Massey L, Ling K, Kousik C, Hammar SA, Tadmor Y, Portnoy V, Gur A, Katzir N, Guner N, Davis A, Hernandez AG, Wright CL, McGregor C, Jarret R, Zhang X, Xu Y, Wehner TC, Grumet R, Levi A, Fei Z. Genome of 'Charleston Gray', the principal American watermelon cultivar, and genetic characterization of 1,365 accessions in the U.S. National Plant Germplasm System watermelon collection. Plant Biotechnology Journal 2019; 17:2246-2258. [PMID: 31022325 PMCID: PMC6835170 DOI: 10.1111/pbi.13136] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 02/17/2019] [Revised: 04/16/2019] [Accepted: 04/18/2019] [Indexed: 05/14/2023]
Abstract
Years of selection for desirable fruit quality traits in dessert watermelon (Citrullus lanatus) have resulted in a narrow genetic base in modern cultivars. Development of novel genomic and genetic resources offers great potential to expand genetic diversity and improve important traits in watermelon. Here, we report a high-quality genome sequence of watermelon cultivar 'Charleston Gray', a principal American dessert watermelon, to complement the existing reference genome from '97103', an East Asian cultivar. Comparative analyses between genomes of 'Charleston Gray' and '97103' revealed genomic variants that may underlie phenotypic differences between the two cultivars. We then genotyped 1,365 watermelon plant introduction (PI) lines maintained at the U.S. National Plant Germplasm System using genotyping-by-sequencing (GBS). These PI lines were collected throughout the world and belong to three Citrullus species, C. lanatus, C. mucosospermus and C. amarus. Approximately 25,000 high-quality single nucleotide polymorphisms (SNPs) were derived from the GBS data using the 'Charleston Gray' genome as the reference. Population genomic analyses using these SNPs discovered a close relationship between C. lanatus and C. mucosospermus and identified four major groups in these two species correlated to their geographic locations. Citrullus amarus was found to have a distinct genetic makeup compared to C. lanatus and C. mucosospermus. The SNPs also enabled identification of genomic regions associated with important fruit quality and disease resistance traits through genome-wide association studies. The high-quality 'Charleston Gray' genome and the genotyping data of this large collection of watermelon accessions provide valuable resources for facilitating watermelon research, breeding and improvement.
Affiliation(s)
- Shan Wu
- Boyce Thompson Institute, Cornell University, Ithaca, NY, USA
| | - Xin Wang
- Boyce Thompson Institute, Cornell University, Ithaca, NY, USA
| | - Umesh Reddy
- Department of Biology, West Virginia State University, Institute, WV, USA
| | - Honghe Sun
- Boyce Thompson Institute, Cornell University, Ithaca, NY, USA
- National Engineering Research Center for Vegetables, Beijing Academy of Agriculture and Forestry Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Beijing, China
| | - Kan Bao
- Boyce Thompson Institute, Cornell University, Ithaca, NY, USA
| | - Lei Gao
- Boyce Thompson Institute, Cornell University, Ithaca, NY, USA
| | - Linyong Mao
- Boyce Thompson Institute, Cornell University, Ithaca, NY, USA
- Department of Biochemistry and Molecular Biology, Howard University College of Medicine, Washington, DC, USA
| | - Takshay Patel
- Horticultural Science Department, North Carolina State University, Raleigh, NC, USA
| | - Carlos Ortiz
- Department of Biology, West Virginia State University, Institute, WV, USA
| | | | | | - Sandra Branham
- U.S. Department of Agriculture‐Agricultural Research Service, U.S. Vegetable Laboratory, Charleston, SC, USA
| | - Pat Wechter
- U.S. Department of Agriculture‐Agricultural Research Service, U.S. Vegetable Laboratory, Charleston, SC, USA
| | - Laura Massey
- U.S. Department of Agriculture‐Agricultural Research Service, U.S. Vegetable Laboratory, Charleston, SC, USA
| | - Kai‐Shu Ling
- U.S. Department of Agriculture‐Agricultural Research Service, U.S. Vegetable Laboratory, Charleston, SC, USA
| | - Chandrasekar Kousik
- U.S. Department of Agriculture‐Agricultural Research Service, U.S. Vegetable Laboratory, Charleston, SC, USA
| | - Sue A. Hammar
- Department of Horticulture, Michigan State University, East Lansing, MI, USA
| | - Yaakov Tadmor
- Department of Vegetable Research, Agricultural Research Organization, Newe Ya'ar Research Center, Ramat Yishay, Israel
| | - Vitaly Portnoy
- Department of Vegetable Research, Agricultural Research Organization, Newe Ya'ar Research Center, Ramat Yishay, Israel
| | - Amit Gur
- Department of Vegetable Research, Agricultural Research Organization, Newe Ya'ar Research Center, Ramat Yishay, Israel
| | - Nurit Katzir
- Department of Vegetable Research, Agricultural Research Organization, Newe Ya'ar Research Center, Ramat Yishay, Israel
| | | | - Angela Davis
- Sakata Seed America, Woodland Research Station, Woodland, CA, USA
| | - Alvaro G. Hernandez
- Roy J. Carver Biotechnology Center, University of Illinois at Urbana‐Champaign, Urbana, IL, USA
| | - Chris L. Wright
- Roy J. Carver Biotechnology Center, University of Illinois at Urbana‐Champaign, Urbana, IL, USA
| | | | - Robert Jarret
- U.S. Department of Agriculture‐Agricultural Research Service, Plant Genetic Resources Conservation Unit, Griffin, GA, USA
| | | | - Yong Xu
- National Engineering Research Center for Vegetables, Beijing Academy of Agriculture and Forestry Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Beijing, China
| | - Todd C. Wehner
- Horticultural Science Department, North Carolina State University, Raleigh, NC, USA
| | - Rebecca Grumet
- Department of Horticulture, Michigan State University, East Lansing, MI, USA
| | - Amnon Levi
- U.S. Department of Agriculture‐Agricultural Research Service, U.S. Vegetable Laboratory, Charleston, SC, USA
| | - Zhangjun Fei
- Boyce Thompson Institute, Cornell University, Ithaca, NY, USA
- U.S. Department of Agriculture‐Agricultural Research Service, Robert W. Holley Center for Agriculture and Health, Ithaca, NY, USA
| |
|
21
|
Orozco-Arias S, Isaza G, Guyot R. Retrotransposons in Plant Genomes: Structure, Identification, and Classification through Bioinformatics and Machine Learning. Int J Mol Sci 2019; 20:E3837. [PMID: 31390781 PMCID: PMC6696364 DOI: 10.3390/ijms20153837] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 06/21/2019] [Revised: 07/31/2019] [Accepted: 08/02/2019] [Indexed: 01/26/2023] Open
Abstract
Transposable elements (TEs) are genomic units able to move within the genome of virtually all organisms. Because of their repetitive nature and high structural diversity, the identification and classification of TEs remain a challenge in sequenced genomes. Although TEs were initially regarded as "junk DNA", it has been demonstrated that they play key roles in chromosome structures, gene expression, and regulation, as well as adaptation and evolution. A highly reliable annotation of these elements is, therefore, crucial to better understand genome functions and their evolution. To date, much bioinformatics software has been developed to address TE detection and classification processes, but many problematic aspects remain, such as the reliability, precision, and speed of the analyses. Machine learning and deep learning algorithms can make automatic predictions and decisions in a wide variety of scientific applications. They have been tested in bioinformatics and, more specifically, for TE classification, with encouraging results. In this review, we will discuss important aspects of TEs, such as their structure, importance in the evolution and architecture of the host, and their current classifications and nomenclatures. We will also address current methods and their limitations in identifying and classifying TEs.
Affiliation(s)
- Simon Orozco-Arias
- Department of Computer Science, Universidad Autónoma de Manizales, Manizales 170001, Colombia
- Department of Systems and Informatics, Universidad de Caldas, Manizales 170001, Colombia
| | - Gustavo Isaza
- Department of Systems and Informatics, Universidad de Caldas, Manizales 170001, Colombia
| | - Romain Guyot
- Department of Electronics and Automatization, Universidad Autónoma de Manizales, Manizales 170001, Colombia.
- Institut de Recherche pour le Développement, CIRAD, University Montpellier, 34000 Montpellier, France.
| |
|
22
|
da Cruz MHP, Saito PTM, Paschoal AR, Bugatti PH. Classification of Transposable Elements by Convolutional Neural Networks. Artificial Intelligence and Soft Computing 2019. [DOI: 10.1007/978-3-030-20915-5_15] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 12/23/2022]
|
23
|
Transposable Elements: Classification, Identification, and Their Use As a Tool For Comparative Genomics. Methods Mol Biol 2019; 1910:177-207. [PMID: 31278665 DOI: 10.1007/978-1-4939-9074-0_6] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Indexed: 12/16/2022]
Abstract
Most genomes are populated by hundreds of thousands of sequences originating from mobile elements. On the one hand, these sequences present a real challenge in the process of genome analysis and annotation. On the other hand, they are very interesting biological subjects involved in many cellular processes. Here we present an overview of transposable element biodiversity, and we discuss different approaches to transposable element detection and analysis.
|
24
|
Genome-Wide Survey and Comparative Analysis of Long Terminal Repeat (LTR) Retrotransposon Families in Four Gossypium Species. Sci Rep 2018; 8:9399. [PMID: 29925876 PMCID: PMC6010443 DOI: 10.1038/s41598-018-27589-6] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 12/06/2017] [Accepted: 06/06/2018] [Indexed: 11/08/2022] Open
Abstract
Long terminal repeat (LTR) retrotransposons are the most abundant DNA component and are largely responsible for plant genome size variation. Although they have been studied in plant species, very limited data are available for cotton, the most important fiber and texture crop. In this study, we performed a comprehensive analysis of LTR retrotransposon families across four cotton species. In tetraploid Gossypium species, LTR retrotransposon families from the progenitor D genome had more copies in the D-subgenome, and families from the progenitor A genome had more copies in the A-subgenome. Some LTR retrotransposon families that inserted after polyploid formation may still have the majority of their copies in one of the subgenomes. The data also show that families of 10-200 copies are abundant and have a great influence on Gossypium genome size; in contrast, the small number of high-copy LTR retrotransposon families contribute less to genome size. The Kimura distance distribution indicates that high-copy-number families did not arise from a recent outbreak, and there is no obvious relationship between family copy number and the period of evolution. Further analysis reveals that each LTR retrotransposon family may have its own distribution characteristics in cotton.
|
25
|
Inpactor, Integrated and Parallel Analyzer and Classifier of LTR Retrotransposons and Its Application for Pineapple LTR Retrotransposons Diversity and Dynamics. BIOLOGY 2018; 7:biology7020032. [PMID: 29799487 PMCID: PMC6022998 DOI: 10.3390/biology7020032] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 05/03/2018] [Revised: 05/16/2018] [Accepted: 05/22/2018] [Indexed: 12/22/2022]
Abstract
One particular class of Transposable Elements (TEs), called Long Terminal Repeat (LTR) retrotransposons, comprises the most abundant mobile elements in plant genomes. Their copy number can vary from several hundred to a few million copies per genome, deeply affecting genome organization and function. The detailed classification of LTR retrotransposons is an essential step to precisely understand their effect at the genome level, but remains challenging in large-sized genomes, requiring the use of optimized bioinformatics tools that can take advantage of supercomputers. Here, we propose a new tool: Inpactor, a parallel and scalable pipeline designed to classify LTR retrotransposons, to identify autonomous and non-autonomous elements, to build RT-based phylogenetic trees and to analyze their insertion times using High Performance Computing (HPC) techniques. Inpactor was tested on the classification and annotation of LTR retrotransposons in pineapple, a recently sequenced genome. The pineapple genome assembly comprises 44% of transposable elements, of which 23% were classified as LTR retrotransposons. Exceptionally, 16.4% of the pineapple genome assembly corresponded to only one lineage of the Gypsy superfamily: Del, suggesting that this particular lineage has undergone a significant increase in its copy numbers. As demonstrated for the pineapple genome, Inpactor provides comprehensive data of LTR retrotransposons’ classification and dynamics, allowing a fine understanding of their contribution to genome structure and evolution. Inpactor is available at https://github.com/simonorozcoarias/Inpactor.
|
26
|
Holt C, Campbell M, Keays DA, Edelman N, Kapusta A, Maclary E, Domyan ET, Suh A, Warren WC, Yandell M, Gilbert MTP, Shapiro MD. Improved Genome Assembly and Annotation for the Rock Pigeon (Columba livia). G3 (BETHESDA, MD.) 2018; 8:1391-1398. [PMID: 29519939 PMCID: PMC5940132 DOI: 10.1534/g3.117.300443] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 11/16/2017] [Accepted: 03/06/2018] [Indexed: 01/25/2023]
Abstract
The domestic rock pigeon (Columba livia) is among the most widely distributed and phenotypically diverse avian species. C. livia is broadly studied in ecology, genetics, physiology, behavior, and evolutionary biology, and has recently emerged as a model for understanding the molecular basis of anatomical diversity, the magnetic sense, and other key aspects of avian biology. Here we report an update to the C. livia genome reference assembly and gene annotation dataset. Greatly increased scaffold lengths in the updated reference assembly, along with an updated annotation set, provide improved tools for evolutionary and functional genetic studies of the pigeon, and for comparative avian genomics in general.
Affiliation(s)
- Carson Holt
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA
| | - Michael Campbell
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
| | - David A Keays
- Research Institute of Molecular Pathology, Vienna, Austria
| | | | - Aurélie Kapusta
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA
| | - Emily Maclary
- Department of Biology, University of Utah, Salt Lake City, UT, USA
| | - Eric T Domyan
- Department of Biology, University of Utah, Salt Lake City, UT, USA
- Department of Biology, Utah Valley University, Orem, UT, USA
| | - Alexander Suh
- Department of Evolutionary Biology (EBC), University of Uppsala, Uppsala, Sweden
| | - Wesley C Warren
- Genome Institute at Washington University, St. Louis, MO, USA
| | - Mark Yandell
- Department of Human Genetics, University of Utah, Salt Lake City, UT, USA
- USTAR Center for Genetic Discovery, University of Utah, Salt Lake City, UT, USA
| | - M Thomas P Gilbert
- Natural History Museum of Denmark, University of Copenhagen, Copenhagen, Denmark
- Norwegian University of Science and Technology, University Museum, 7491 Trondheim, Norway
| | | |
|
27
|
Schietgat L, Vens C, Cerri R, Fischer CN, Costa E, Ramon J, Carareto CMA, Blockeel H. A machine learning based framework to identify and classify long terminal repeat retrotransposons. PLoS Comput Biol 2018; 14:e1006097. [PMID: 29684010 PMCID: PMC5933816 DOI: 10.1371/journal.pcbi.1006097] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 01/26/2017] [Revised: 05/03/2018] [Accepted: 03/19/2018] [Indexed: 12/03/2022] Open
Abstract
Transposable elements (TEs) are repetitive nucleotide sequences that make up a large portion of eukaryotic genomes. They can move and duplicate within a genome, increasing genome size and contributing to genetic diversity within and across species. Accurate identification and classification of TEs present in a genome is an important step towards understanding their effects on genes and their role in genome evolution. We introduce TE-Learner, a framework based on machine learning that automatically identifies TEs in a given genome and assigns a classification to them. We present an implementation of our framework towards LTR retrotransposons, a particular type of TEs characterized by having long terminal repeats (LTRs) at their boundaries. We evaluate the predictive performance of our framework on the well-annotated genomes of Drosophila melanogaster and Arabidopsis thaliana and we compare our results for three LTR retrotransposon superfamilies with the results of three widely used methods for TE identification or classification: RepeatMasker, Censor and LtrDigest. In contrast to these methods, TE-Learner is the first to incorporate machine learning techniques, outperforming these methods in terms of predictive performance, while being able to learn models and make predictions efficiently. Moreover, we show that our method was able to identify TEs that none of the above methods could find, and we investigated TE-Learner’s predictions which did not correspond to an official annotation. It turns out that many of these predictions are in fact strongly homologous to a known TE.
Over the years, with the increase of the acquisition of biological data, the extraction of knowledge from this data is getting more important. Understanding how biology works is very important to increase the quality of the products and services which use biological data. This directly influences companies and governments, which need to remain at the knowledge frontier of an increasingly competitive economy. Transposable Elements (TEs) are an example of very important biological data, and understanding their role in the genomes of organisms is very important for the development of products based on biological data. As an example, we can cite the production of biofuels such as the sugar-cane-based ones. Many studies have revealed the presence of active TEs in this plant, which has gained economic importance in many countries. Understanding how TEs influence the plant should help researchers to develop more resistant varieties of sugar-cane, increasing production. Thus, the development of computational methods able to help biologists in the correct identification and classification of TEs is very important from both theoretical and practical perspectives.
Affiliation(s)
| | - Celine Vens
- Department of Computer Science, KU Leuven, Leuven, Belgium
- Department of Public Health and Primary Care, KU Leuven Kulak, Kortrijk, Belgium
- Department of Respiratory Medicine, Ghent University, and VIB Inflammation Research Center, Ghent, Belgium
| | - Ricardo Cerri
- Department of Computer Science, UFSCar Federal University of São Carlos, São Carlos, São Paulo, Brazil
| | - Carlos N. Fischer
- Department of Statistics, Applied Mathematics, and Computer Science, UNESP São Paulo State University, Rio Claro, São Paulo, Brazil
| | - Eduardo Costa
- Department of Computer Science, KU Leuven, Leuven, Belgium
- Instituto de Ciências Matemáticas e de Computação, Universidade de São Paulo, São Carlos, São Paulo, Brazil
| | - Jan Ramon
- Department of Computer Science, KU Leuven, Leuven, Belgium
- INRIA Lille Nord Europe, 40 avenue Halley, 59650 Villeneuve d’Ascq, France
| | - Claudia M. A. Carareto
- Department of Biology, UNESP São Paulo State University, São José do Rio Preto, São Paulo, Brazil
| | | |
|
28
|
Zeng Z, Sun H, Vainio EJ, Raffaello T, Kovalchuk A, Morin E, Duplessis S, Asiegbu FO. Intraspecific comparative genomics of isolates of the Norway spruce pathogen (Heterobasidion parviporum) and identification of its potential virulence factors. BMC Genomics 2018; 19:220. [PMID: 29580224 PMCID: PMC5870257 DOI: 10.1186/s12864-018-4610-4] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 11/07/2017] [Accepted: 03/20/2018] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Heterobasidion parviporum is among the most economically important fungal forest pathogens in northern Europe, causing root and butt rot disease of Norway spruce (Picea abies (L.) Karst.). The mechanisms underlying the pathogenesis and virulence of this species remain elusive. No reference genome to facilitate functional analysis is available for this species. RESULTS To better understand the virulence factors at both the phenotypic and genomic level, we characterized 15 H. parviporum isolates originating from different locations across Finland for virulence, vegetative growth, sporulation and saprotrophic wood decay. Wood decay capability and latitude of fungal origins exerted interactive effects on their virulence and appeared important for H. parviporum virulence. We sequenced the most virulent isolate, yielding the first full genome sequence of H. parviporum as a reference genome, and re-sequenced the remaining 14 H. parviporum isolates. Genome-wide alignments and intrinsic polymorphism analysis showed that these isolates exhibited overall high genomic similarity with an average of at least 96% nucleotide identity when compared to the reference, yet had a remarkable intra-specific level of polymorphism with a bias for CpG to TpG mutations. Read-mapping coverage analysis enabled the classification of all predicted genes into five groups and uncovered two genomic regions exclusively present in the reference with putative contribution to its higher virulence. Genes enriched for copy number variations (deletions and duplications) and nucleotide polymorphism were involved in oxidation-reduction processes and encoded domains relevant to transcription factors. Some secreted-protein-coding genes were proposed as potential virulence candidates based on genome-wide selection pressure or the presence of variants. CONCLUSION Our study reports the first reference genome sequence for this Norway spruce pathogen (H. parviporum). Comparative genomics analysis gave insight into the overall genomic variation within this fungal species and also facilitated the identification of several secreted-protein-coding genes as putative virulence factors for further functional analysis. We also analyzed and identified phenotypic traits potentially linked to its virulence.
Affiliation(s)
- Zhen Zeng
- Department of Forest Sciences, University of Helsinki, Helsinki, Finland
| | - Hui Sun
- Department of Forest Sciences, University of Helsinki, Helsinki, Finland
- Collaborative Innovation Center of Sustainable Forestry in Southern China, College of Forestry, Nanjing Forestry University, Nanjing, China
| | - Eeva J. Vainio
- Natural Resources Institute Finland (Luke), Helsinki, Finland
| | - Tommaso Raffaello
- Department of Forest Sciences, University of Helsinki, Helsinki, Finland
| | - Andriy Kovalchuk
- Department of Forest Sciences, University of Helsinki, Helsinki, Finland
| | - Emmanuelle Morin
- INRA UMR 1136 Interactions Arbres Micro-organismes, INRA Centre Grand Est Nancy, Champenoux, France
| | - Sébastien Duplessis
- INRA UMR 1136 Interactions Arbres Micro-organismes, INRA Centre Grand Est Nancy, Champenoux, France
- UMR 1136 Interactions Arbres/Microorganismes, Faculté des Sciences et Technologies, Université de Lorraine, Vandoeuvre-lès-Nancy, France
| | - Fred O. Asiegbu
- Department of Forest Sciences, University of Helsinki, Helsinki, Finland
| |
|
29
|
Arkhipova IR. Using bioinformatic and phylogenetic approaches to classify transposable elements and understand their complex evolutionary histories. Mob DNA 2017; 8:19. [PMID: 29225705 PMCID: PMC5718144 DOI: 10.1186/s13100-017-0103-2] [Citation(s) in RCA: 60] [Impact Index Per Article: 8.6] [Received: 10/12/2017] [Accepted: 11/28/2017] [Indexed: 12/11/2022] Open
Abstract
In recent years, much attention has been paid to comparative genomic studies of transposable elements (TEs) and the ensuing problems of their identification, classification, and annotation. Different approaches and diverse automated pipelines are being used to catalogue and categorize mobile genetic elements in the ever-increasing number of prokaryotic and eukaryotic genomes, with little or no connectivity between different domains of life. Here, an overview of the current picture of TE classification and evolutionary relationships is presented, updating the diversity of TE types uncovered in sequenced genomes. A tripartite TE classification scheme is proposed to account for their replicative, integrative, and structural components, and the need to expand in vitro and in vivo studies of their structural and biological properties is emphasized. Bioinformatic studies have now become front and center of novel TE discovery, and experimental pursuits of these discoveries hold great promise for both basic and applied science.
Affiliation(s)
- Irina R Arkhipova
- Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, MA 02543 USA
| |
|
30
|
Wu S, Shamimuzzaman M, Sun H, Salse J, Sui X, Wilder A, Wu Z, Levi A, Xu Y, Ling KS, Fei Z. The bottle gourd genome provides insights into Cucurbitaceae evolution and facilitates mapping of a Papaya ring-spot virus resistance locus. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2017; 92:963-975. [PMID: 28940759 DOI: 10.1111/tpj.13722] [Citation(s) in RCA: 58] [Impact Index Per Article: 8.3] [Received: 07/13/2017] [Revised: 09/01/2017] [Accepted: 09/07/2017] [Indexed: 05/20/2023]
Abstract
Bottle gourd (Lagenaria siceraria) is an important vegetable crop as well as a rootstock for other cucurbit crops. In this study, we report a high-quality 313.4-Mb genome sequence of a bottle gourd inbred line, USVL1VR-Ls, with a scaffold N50 of 8.7 Mb and a longest scaffold of 19.0 Mb. About 98.3% of the assembled scaffolds are anchored to the 11 pseudomolecules. Our comparative genomic analysis identifies chromosome-level syntenic relationships between bottle gourd and other cucurbits, as well as lineage-specific gene family expansions in bottle gourd. We reconstructed the genome of the most recent common ancestor of Cucurbitaceae, which revealed that the ancestral Cucurbitaceae karyotype consisted of 12 protochromosomes with 18 534 protogenes. The 12 protochromosomes are largely retained in the modern melon genome, but have undergone different degrees of shuffling in the other investigated cucurbit genomes. The 11 bottle gourd chromosomes derive from the ancestral Cucurbitaceae karyotype through 19 chromosomal fissions and 20 fusions. The bottle gourd genome sequence has facilitated the mapping of a dominant monogenic locus, Prs, conferring Papaya ring-spot virus (PRSV) resistance in bottle gourd, to a 317.8-kb region on chromosome 1. We have developed a cleaved amplified polymorphic sequence (CAPS) marker tightly linked to the Prs locus and demonstrated its potential application in marker-assisted selection of PRSV resistance in bottle gourd. This study provides insights into the paleohistory of Cucurbitaceae genome evolution, and the high-quality genome sequence of bottle gourd provides a useful resource for plant comparative genomics studies and cucurbit improvement.
Affiliation(s)
- Shan Wu
- Boyce Thompson Institute, Cornell University, Ithaca, NY, 14853, USA
| | - Md Shamimuzzaman
- US Vegetable Laboratory, USDA-Agriculture Research Service, Charleston, SC, USA
| | - Honghe Sun
- Boyce Thompson Institute, Cornell University, Ithaca, NY, 14853, USA
- Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Beijing Key Laboratory of Vegetable Germplasm Improvement, National Engineering Research Center for Vegetables, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
| | - Jerome Salse
- Institut National de la Recherche Agronomique, Unités Mixtes de Recherche 1095, Genetics, Diversity and Ecophysiology of Cereals, Paleogenomics & Evolution (PaleoEvo) Group, Clermont-Ferrand, France
| | - Xuelian Sui
- US Vegetable Laboratory, USDA-Agriculture Research Service, Charleston, SC, USA
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fujian Province Key Laboratory of Plant Virology, Institute of Plant Virology, Fujian Agriculture and Forestry University, Fuzhou, Fujian, China
- Department of Plant Protection, Fujian Agriculture and Forest University, Fuzhou, China
| | - Alan Wilder
- US Vegetable Laboratory, USDA-Agriculture Research Service, Charleston, SC, USA
| | - Zujian Wu
- State Key Laboratory of Ecological Pest Control for Fujian and Taiwan Crops, Fujian Province Key Laboratory of Plant Virology, Institute of Plant Virology, Fujian Agriculture and Forestry University, Fuzhou, Fujian, China
- Department of Plant Protection, Fujian Agriculture and Forest University, Fuzhou, China
| | - Amnon Levi
- US Vegetable Laboratory, USDA-Agriculture Research Service, Charleston, SC, USA
| | - Yong Xu
- Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Beijing Key Laboratory of Vegetable Germplasm Improvement, National Engineering Research Center for Vegetables, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
| | - Kai-Shu Ling
- US Vegetable Laboratory, USDA-Agriculture Research Service, Charleston, SC, USA
| | - Zhangjun Fei
- Boyce Thompson Institute, Cornell University, Ithaca, NY, 14853, USA
- USDA-Agricultural Research Service, Robert W. Holley Center for Agriculture and Health, Ithaca, NY, USA
| |
|
31
|
Sun H, Wu S, Zhang G, Jiao C, Guo S, Ren Y, Zhang J, Zhang H, Gong G, Jia Z, Zhang F, Tian J, Lucas WJ, Doyle JJ, Li H, Fei Z, Xu Y. Karyotype Stability and Unbiased Fractionation in the Paleo-Allotetraploid Cucurbita Genomes. MOLECULAR PLANT 2017; 10:1293-1306. [PMID: 28917590 DOI: 10.1016/j.molp.2017.09.003] [Citation(s) in RCA: 162] [Impact Index Per Article: 23.1] [Received: 08/22/2017] [Revised: 09/06/2017] [Accepted: 09/06/2017] [Indexed: 05/18/2023]
Abstract
The Cucurbita genus contains several economically important species in the Cucurbitaceae family. Here, we report high-quality genome sequences of C. maxima and C. moschata and provide evidence supporting an allotetraploidization event in Cucurbita. We are able to partition the genome into two homoeologous subgenomes based on different genetic distances to melon, cucumber, and watermelon in the Benincaseae tribe. We estimate that the two diploid progenitors successively diverged from Benincaseae around 31 and 26 million years ago (Mya), respectively, and the allotetraploidization happened at some point between 26 Mya and 3 Mya, the estimated date when C. maxima and C. moschata diverged. The subgenomes have largely maintained the chromosome structures of their diploid progenitors. Such long-term karyotype stability after polyploidization has not been commonly observed in plant polyploids. The two subgenomes have retained similar numbers of genes, and neither subgenome is globally dominant in gene expression. Allele-specific expression analysis in the C. maxima × C. moschata interspecific F1 hybrid and their two parents indicates the predominance of trans-regulatory effects underlying expression divergence of the parents, and detects transgressive gene expression changes in the hybrid correlated with heterosis in important agronomic traits. Our study provides insights into polyploid genome evolution and valuable resources for genetic improvement of cucurbit crops.
Affiliation(s)
- Honghe Sun
- National Engineering Research Center for Vegetables, Beijing Academy of Agriculture and Forestry Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing 100097, China; Boyce Thompson Institute, Cornell University, Ithaca, NY 14853, USA
| | - Shan Wu
- Boyce Thompson Institute, Cornell University, Ithaca, NY 14853, USA.
| | - Guoyu Zhang
- National Engineering Research Center for Vegetables, Beijing Academy of Agriculture and Forestry Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing 100097, China
| | - Chen Jiao
- Boyce Thompson Institute, Cornell University, Ithaca, NY 14853, USA
| | - Shaogui Guo
- National Engineering Research Center for Vegetables, Beijing Academy of Agriculture and Forestry Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing 100097, China
| | - Yi Ren
- National Engineering Research Center for Vegetables, Beijing Academy of Agriculture and Forestry Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing 100097, China
| | - Jie Zhang
- National Engineering Research Center for Vegetables, Beijing Academy of Agriculture and Forestry Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing 100097, China
| | - Haiying Zhang
- National Engineering Research Center for Vegetables, Beijing Academy of Agriculture and Forestry Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing 100097, China
| | - Guoyi Gong
- National Engineering Research Center for Vegetables, Beijing Academy of Agriculture and Forestry Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing 100097, China
| | - Zhangcai Jia
- National Engineering Research Center for Vegetables, Beijing Academy of Agriculture and Forestry Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing 100097, China
| | - Fan Zhang
- National Engineering Research Center for Vegetables, Beijing Academy of Agriculture and Forestry Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing 100097, China
| | - Jiaxing Tian
- National Engineering Research Center for Vegetables, Beijing Academy of Agriculture and Forestry Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing 100097, China
| | - William J Lucas
- Department of Plant Biology, College of Biological Sciences, University of California, Davis, CA 95616, USA
| | - Jeff J Doyle
- Section of Plant Breeding & Genetics, School of Integrated Plant Sciences, Cornell University, Ithaca, NY 14853, USA
| | - Haizhen Li
- National Engineering Research Center for Vegetables, Beijing Academy of Agriculture and Forestry Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing 100097, China
| | - Zhangjun Fei
- Boyce Thompson Institute, Cornell University, Ithaca, NY 14853, USA; USDA-ARS Robert W. Holley Center for Agriculture and Health, Ithaca, NY 14853, USA.
| | - Yong Xu
- National Engineering Research Center for Vegetables, Beijing Academy of Agriculture and Forestry Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops (North China), Beijing Key Laboratory of Vegetable Germplasm Improvement, Beijing 100097, China.
| |
|
32
|
Hiraki H, Kagoshima H, Kraus C, Schiffer PH, Ueta Y, Kroiher M, Schierenberg E, Kohara Y. Genome analysis of Diploscapter coronatus: insights into molecular peculiarities of a nematode with parthenogenetic reproduction. BMC Genomics 2017; 18:478. [PMID: 28646875 PMCID: PMC5483258 DOI: 10.1186/s12864-017-3860-x] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 12/09/2016] [Accepted: 06/13/2017] [Indexed: 12/03/2022] Open
Abstract
BACKGROUND Sexual reproduction involving the fusion of egg and sperm prevails among eukaryotes. In contrast, the nematode Diploscapter coronatus, a close relative of the model Caenorhabditis elegans, reproduces parthenogenetically. Neither males nor sperm have been observed and some steps of meiosis are apparently skipped in this species. To uncover the genomic changes associated with the evolution of parthenogenesis in this nematode, we carried out a genome analysis. RESULTS We obtained a 170 Mbp draft genome in only 511 scaffolds with an N50 length of 1 Mbp. Nearly 90% of these scaffolds constitute homologous pairs with 5.7% heterozygosity on average, along with inversions and translocations, meaning that the 170 Mbp of sequence corresponds to the diploid genome. Fluorescent staining shows that the D. coronatus genome consists of two chromosomes (2n = 2). In our genome annotation, we found orthologs of 59% of the C. elegans genes. However, a number of genes were missing or very divergent. These include genes involved in sex determination (e.g. xol-1, tra-2) and meiosis (e.g. the kleisins rec-8 and coh-3/4), giving a possible explanation for the absence of males and of the second meiotic division. The high degree of heterozygosity allowed us to analyze the expression level of individual alleles. Most of the homologous pairs show very similar expression levels but others exhibit a 2-5-fold difference. CONCLUSIONS Our high-quality draft genome of D. coronatus reveals the peculiarities of the genome of parthenogenesis and provides some clues to the genetic basis for parthenogenetic reproduction. This draft genome should be the basis to elucidate fundamental questions related to parthenogenesis such as its origin and mechanisms through comparative analyses with other nematodes. Furthermore, being the closest outgroup to the genus Caenorhabditis, the draft genome will help to disclose many idiosyncrasies of the model C. elegans and its congeners in future studies.
Affiliation(s)
- Hideaki Hiraki
- Genome Biology Laboratory, National Institute of Genetics, Mishima, Japan
| | - Hiroshi Kagoshima
- Genome Biology Laboratory, National Institute of Genetics, Mishima, Japan
- Transdisciplinary Research Integration Center, Research Organization of Information and Systems, Tokyo, Japan
| | | | | | - Yumiko Ueta
- Genome Biology Laboratory, National Institute of Genetics, Mishima, Japan
| | - Michael Kroiher
- Zoologisches Institut, Universität zu Köln, Cologne, NRW Germany
| | | | - Yuji Kohara
- Genome Biology Laboratory, National Institute of Genetics, Mishima, Japan
| |
|
33
|
Xu C, Jiao C, Sun H, Cai X, Wang X, Ge C, Zheng Y, Liu W, Sun X, Xu Y, Deng J, Zhang Z, Huang S, Dai S, Mou B, Wang Q, Fei Z, Wang Q. Draft genome of spinach and transcriptome diversity of 120 Spinacia accessions. Nat Commun 2017; 8:15275. [PMID: 28537264 PMCID: PMC5458060 DOI: 10.1038/ncomms15275] [Citation(s) in RCA: 102] [Impact Index Per Article: 14.6] [Received: 06/20/2016] [Accepted: 03/07/2017] [Indexed: 01/21/2023] Open
Abstract
Spinach is an important leafy vegetable enriched with multiple necessary nutrients. Here we report the draft genome sequence of spinach (Spinacia oleracea, 2n=12), which contains 25,495 protein-coding genes. The spinach genome is highly repetitive with 74.4% of its content in the form of transposable elements. No recent whole genome duplication events are observed in spinach. Genome syntenic analysis between spinach and sugar beet suggests substantial inter- and intra-chromosome rearrangements during the Caryophyllales genome evolution. Transcriptome sequencing of 120 cultivated and wild spinach accessions reveals more than 420 K variants. Our data suggest that S. turkestanica is likely the direct progenitor of cultivated spinach and that spinach domestication had a weak bottleneck. We identify 93 domestication sweeps in the spinach genome, some of which are associated with important agronomic traits including bolting, flowering and leaf numbers. This study offers insights into spinach evolution and domestication and provides resources for spinach research and improvement.
Spinach is an economically important vegetable crop but previous genomic resources were of limited use for comparative and functional analyses. Here, Xu et al. present a high quality draft spinach genome and transcriptome data for multiple Spinacia accessions providing insight into Caryophyllales genome evolution.
Affiliation(s)
- Chenxi Xu
- Development and Collaborative Innovation Center of Plant Germplasm Resources, College of Life and Environmental Sciences, Shanghai Normal University, Shanghai 200234, China
| | - Chen Jiao
- Boyce Thompson Institute, Cornell University, Ithaca, New York 14853, USA
| | - Honghe Sun
- Boyce Thompson Institute, Cornell University, Ithaca, New York 14853, USA
| | - Xiaofeng Cai
- Development and Collaborative Innovation Center of Plant Germplasm Resources, College of Life and Environmental Sciences, Shanghai Normal University, Shanghai 200234, China
| | - Xiaoli Wang
- Development and Collaborative Innovation Center of Plant Germplasm Resources, College of Life and Environmental Sciences, Shanghai Normal University, Shanghai 200234, China
| | - Chenhui Ge
- Development and Collaborative Innovation Center of Plant Germplasm Resources, College of Life and Environmental Sciences, Shanghai Normal University, Shanghai 200234, China
| | - Yi Zheng
- Boyce Thompson Institute, Cornell University, Ithaca, New York 14853, USA
| | - Wenli Liu
- Boyce Thompson Institute, Cornell University, Ithaca, New York 14853, USA
| | - Xuepeng Sun
- Boyce Thompson Institute, Cornell University, Ithaca, New York 14853, USA
| | - Yimin Xu
- Boyce Thompson Institute, Cornell University, Ithaca, New York 14853, USA
| | - Jie Deng
- Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Zhonghua Zhang
- Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Sanwen Huang
- Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Beijing 100081, China
| | - Shaojun Dai
- Development and Collaborative Innovation Center of Plant Germplasm Resources, College of Life and Environmental Sciences, Shanghai Normal University, Shanghai 200234, China
| | - Beiquan Mou
- USDA-Agricultural Research Service, Crop Improvement and Protection Research Unit, Salinas, California 93905, USA
| | - Quanxi Wang
- Development and Collaborative Innovation Center of Plant Germplasm Resources, College of Life and Environmental Sciences, Shanghai Normal University, Shanghai 200234, China
| | - Zhangjun Fei
- Development and Collaborative Innovation Center of Plant Germplasm Resources, College of Life and Environmental Sciences, Shanghai Normal University, Shanghai 200234, China; Boyce Thompson Institute, Cornell University, Ithaca, New York 14853, USA; USDA-Agricultural Research Service, Robert W. Holley Center for Agriculture and Health, Ithaca, New York 14853, USA
| | - Quanhua Wang
- Development and Collaborative Innovation Center of Plant Germplasm Resources, College of Life and Environmental Sciences, Shanghai Normal University, Shanghai 200234, China
| |
|
34
|
Hamilton EP, Kapusta A, Huvos PE, Bidwell SL, Zafar N, Tang H, Hadjithomas M, Krishnakumar V, Badger JH, Caler EV, Russ C, Zeng Q, Fan L, Levin JZ, Shea T, Young SK, Hegarty R, Daza R, Gujja S, Wortman JR, Birren BW, Nusbaum C, Thomas J, Carey CM, Pritham EJ, Feschotte C, Noto T, Mochizuki K, Papazyan R, Taverna SD, Dear PH, Cassidy-Hanley DM, Xiong J, Miao W, Orias E, Coyne RS. Structure of the germline genome of Tetrahymena thermophila and relationship to the massively rearranged somatic genome. eLife 2016; 5. [PMID: 27892853 PMCID: PMC5182062 DOI: 10.7554/elife.19090] [Citation(s) in RCA: 109] [Impact Index Per Article: 13.6] [Received: 06/24/2016] [Accepted: 11/14/2016] [Indexed: 12/30/2022]
Abstract
The germline genome of the binucleated ciliate Tetrahymena thermophila undergoes programmed chromosome breakage and massive DNA elimination to generate the somatic genome. Here, we present a complete sequence assembly of the germline genome and analyze multiple features of its structure and its relationship to the somatic genome, shedding light on the mechanisms of genome rearrangement as well as the evolutionary history of this remarkable germline/soma differentiation. Our results strengthen the notion that a complex, dynamic, and ongoing interplay between mobile DNA elements and the host genome has shaped Tetrahymena chromosome structure, locally and globally. Non-standard outcomes of rearrangement events, including the generation of short-lived somatic chromosomes and excision of DNA interrupting protein-coding regions, may represent novel forms of developmental gene regulation. We also compare Tetrahymena's germline/soma differentiation to that of other characterized ciliates, illustrating the wide diversity of adaptations that have occurred within this phylum.
Affiliation(s)
- Eileen P Hamilton
- Department of Molecular, Cellular, and Developmental Biology, University of California, Santa Barbara, Santa Barbara, United States
| | - Aurélie Kapusta
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, United States
| | - Piroska E Huvos
- Biochemistry and Molecular Biology, Southern Illinois University, Carbondale, United States
| | | | - Nikhat Zafar
- J. Craig Venter Institute, Rockville, United States
| | - Haibao Tang
- J. Craig Venter Institute, Rockville, United States
| | | | | | | | | | - Carsten Russ
- Eli and Edythe L. Broad Institute of Harvard and MIT, Cambridge, United States
| | - Qiandong Zeng
- Eli and Edythe L. Broad Institute of Harvard and MIT, Cambridge, United States
| | - Lin Fan
- Eli and Edythe L. Broad Institute of Harvard and MIT, Cambridge, United States
| | - Joshua Z Levin
- Eli and Edythe L. Broad Institute of Harvard and MIT, Cambridge, United States
| | - Terrance Shea
- Eli and Edythe L. Broad Institute of Harvard and MIT, Cambridge, United States
| | - Sarah K Young
- Eli and Edythe L. Broad Institute of Harvard and MIT, Cambridge, United States
| | - Ryan Hegarty
- Eli and Edythe L. Broad Institute of Harvard and MIT, Cambridge, United States
| | - Riza Daza
- Eli and Edythe L. Broad Institute of Harvard and MIT, Cambridge, United States
| | - Sharvari Gujja
- Eli and Edythe L. Broad Institute of Harvard and MIT, Cambridge, United States
| | - Jennifer R Wortman
- Eli and Edythe L. Broad Institute of Harvard and MIT, Cambridge, United States
| | - Bruce W Birren
- Eli and Edythe L. Broad Institute of Harvard and MIT, Cambridge, United States
| | - Chad Nusbaum
- Eli and Edythe L. Broad Institute of Harvard and MIT, Cambridge, United States
| | - Jainy Thomas
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, United States
| | - Clayton M Carey
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, United States
| | - Ellen J Pritham
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, United States
| | - Cédric Feschotte
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, United States
| | - Tomoko Noto
- Institute of Molecular Biotechnology, Vienna, Austria
| | | | - Romeo Papazyan
- Department of Pharmacology and Molecular Sciences, The Johns Hopkins University School of Medicine, Baltimore, United States
| | - Sean D Taverna
- Department of Pharmacology and Molecular Sciences, The Johns Hopkins University School of Medicine, Baltimore, United States
| | - Paul H Dear
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
| | | | - Jie Xiong
- Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
| | - Wei Miao
- Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China
| | - Eduardo Orias
- Department of Molecular, Cellular, and Developmental Biology, University of California, Santa Barbara, Santa Barbara, United States
| | | |
|
35
|
Monat C, Tando N, Tranchant-Dubreuil C, Sabot F. LTRclassifier: A website for fast structural LTR retrotransposons classification in plants. Mob Genet Elements 2016; 6:e1241050. [PMID: 28090381 DOI: 10.1080/2159256x.2016.1241050] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 05/23/2016] [Revised: 09/20/2016] [Accepted: 09/20/2016] [Indexed: 10/20/2022]
Abstract
Automatic classification of LTR retrotransposons is a major challenge in large-scale genomics. Many tools have been developed to detect these elements, but classifying them automatically remains difficult. Here we propose a simple approach, LTRclassifier, based on HMM recognition followed by BLAST analyses (i) to classify plant LTR retrotransposons into their respective superfamilies, and (ii) to automatically provide a basic functional annotation of these elements. The method was tested on various TE databases and shown to be robust and fast. This tool is available as a web service implemented at the IRD bioinformatics facility, http://LTRclassifier.ird.fr/.
Affiliation(s)
- Cecile Monat
- UMR DIADE IRD/UM, IRD, Montpellier Cedex 5, France
| | | | | | | |
|
36
|
Rius N, Guillén Y, Delprat A, Kapusta A, Feschotte C, Ruiz A. Exploration of the Drosophila buzzatii transposable element content suggests underestimation of repeats in Drosophila genomes. BMC Genomics 2016; 17:344. [PMID: 27164953 PMCID: PMC4862133 DOI: 10.1186/s12864-016-2648-8] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.6] [Received: 10/22/2015] [Accepted: 04/22/2016] [Indexed: 11/10/2022]
Abstract
Background: Many new Drosophila genomes have been sequenced in recent years using new-generation sequencing platforms and assembly methods. Transposable elements (TEs), being repetitive sequences, are often misassembled, especially in the genomes sequenced with short reads. Consequently, the mobile fraction of many of the new genomes has not been analyzed in detail or compared with that of other genomes sequenced with different methods, which could shed light on genome and TE evolution. Here we compare the TE content of three genomes: D. buzzatii st-1, j-19, and D. mojavensis. Results: We have sequenced a new D. buzzatii genome (j-19) that complements the D. buzzatii reference genome (st-1) already published, and compared their TE contents with that of D. mojavensis. We found an underestimation of TE sequences in Drosophila genus NGS-genomes when compared to Sanger-genomes. To be able to compare genomes sequenced with different technologies, we developed a coverage-based method and applied it to the D. buzzatii st-1 and j-19 genomes. Between 10.85 and 11.16 % of the D. buzzatii st-1 genome is made up of TEs, between 7 and 7.5 % of the D. buzzatii j-19 genome, while TEs represent 15.35 % of the D. mojavensis genome. Helitrons are the most abundant order in the three genomes. Conclusions: TEs in D. buzzatii are less abundant than in D. mojavensis, as expected according to the genome size and TE content positive correlation. However, TEs alone do not explain the genome size difference. TEs accumulate in the dot chromosomes and proximal regions of D. buzzatii and D. mojavensis chromosomes. We also report a significantly higher TE density in D. buzzatii and D. mojavensis X chromosomes, which is not expected under the current models. Our easy-to-use correction method allowed us to identify recently active families in D. buzzatii st-1 belonging to the LTR-retrotransposon superfamily Gypsy.
Affiliation(s)
- Nuria Rius
- Department de Genética i Microbiologia, Universitat Autònoma de Barcelona, Bellaterra (Barcelona), Spain.
| | - Yolanda Guillén
- Department de Genética i Microbiologia, Universitat Autònoma de Barcelona, Bellaterra (Barcelona), Spain
| | - Alejandra Delprat
- Department de Genética i Microbiologia, Universitat Autònoma de Barcelona, Bellaterra (Barcelona), Spain
| | - Aurélie Kapusta
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Cédric Feschotte
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT, USA
| | - Alfredo Ruiz
- Department de Genética i Microbiologia, Universitat Autònoma de Barcelona, Bellaterra (Barcelona), Spain
| |
|
37
|
Abstract
Helitrons, the eukaryotic rolling-circle transposable elements, are widespread but most prevalent among plant and animal genomes. Recent studies have identified three additional coding and structural variants of Helitrons called Helentrons, Proto-Helentron, and Helitron2. Helitrons and Helentrons make up a substantial fraction of many genomes where nonautonomous elements frequently outnumber the putative autonomous partner. This includes the previously ambiguously classified DINE-1-like repeats, which are highly abundant in Drosophila and many other animal genomes. The purpose of this review is to summarize what we have learned about Helitrons in the decade since their discovery. First, we describe the history of autonomous Helitrons, and their variants. Second, we explain the common coding features and difference in structure of canonical Helitrons versus the endonuclease-encoding Helentrons. Third, we review how Helitrons and Helentrons are classified and discuss why the system used for other transposable element families is not applicable. We also touch upon how genome-wide identification of candidate Helitrons is carried out and how to validate candidate Helitrons. We then shift our focus to a model of transposition and the report of an excision event. We discuss the different proposed models for the mechanism of gene capture. Finally, we will talk about where Helitrons are found, including discussions of vertical versus horizontal transfer, the propensity of Helitrons and Helentrons to capture and shuffle genes and how they impact the genome. We will end the review with a summary of open questions concerning the biology of this intriguing group of transposable elements.
|
38
|
Platt RN, Blanco-Berdugo L, Ray DA. Accurate Transposable Element Annotation Is Vital When Analyzing New Genome Assemblies. Genome Biol Evol 2016; 8:403-10. [PMID: 26802115 PMCID: PMC4779615 DOI: 10.1093/gbe/evw009] [Citation(s) in RCA: 64] [Impact Index Per Article: 8.0] [Indexed: 12/12/2022]
Abstract
Transposable elements (TEs) are mobile genetic elements with the ability to replicate themselves throughout the host genome. In some taxa TEs reach copy numbers in the hundreds of thousands and can occupy more than half of the genome. The increasing number of reference genomes from nonmodel species has begun to outpace efforts to identify and annotate TE content, and the methods used vary significantly between projects. Here, we demonstrate the variation that arises in TE annotations when less than optimal methods are used. We found that across a variety of taxa, the ability to accurately identify TEs based solely on homology decreased as the phylogenetic distance between the queried genome and a reference increased. Next we annotated repeats using homology alone, as is often the case in new genome analyses, and a combination of homology and de novo methods as well as an additional manual curation step. Reannotation using these methods identified a substantial number of new TE subfamilies in previously characterized genomes, recognized a higher proportion of the genome as repetitive, and decreased the average genetic distance within TE families, implying recent TE accumulation. Finally, these findings (increased recognition of younger TEs) were confirmed via an analysis of the postman butterfly (Heliconius melpomene). These observations imply that complete TE annotation relies on a combination of homology and de novo-based repeat identification, manual curation, and classification and that relying on simple, homology-based methods is insufficient to accurately describe the TE landscape of a newly sequenced genome.
Affiliation(s)
- Roy N Platt
- Department of Biological Sciences, Texas Tech University
| | | | - David A Ray
- Department of Biological Sciences, Texas Tech University
| |
|
39
|
The genome sequence of Sea-Island cotton (Gossypium barbadense) provides insights into the allopolyploidization and development of superior spinnable fibres. Sci Rep 2015; 5:17662. [PMID: 26634818 PMCID: PMC4669482 DOI: 10.1038/srep17662] [Citation(s) in RCA: 181] [Impact Index Per Article: 20.1] [Received: 09/27/2015] [Accepted: 10/30/2015] [Indexed: 01/24/2023]
Abstract
Gossypium hirsutum accounts for most cotton fibre production, but G. barbadense is valued for its better comprehensive resistance and superior fibre properties. However, the allotetraploid genome of G. barbadense has not been comprehensively analysed. Here we present a high-quality assembly of the 2.57 gigabase genome of G. barbadense, including 80,876 protein-coding genes. The double-sized genome of the A (or At) (1.50 Gb) against D (or Dt) (853 Mb) primarily resulted from the expansion of Gypsy elements, including Peabody and Retrosat2 subclades in the Del clade, and the Athila subclade in the Athila/Tat clade. Substantial gene expansion and contraction were observed and rich homoeologous gene pairs with biased expression patterns were identified, suggesting abundant gene sub-functionalization occurred by allopolyploidization. More specifically, the CesA gene family has adopted differential temporal expression patterns, suggesting an integrated regulatory mechanism of CesA genes from At and Dt subgenomes for the primary and secondary cellulose biosynthesis of cotton fibre in a “relay race”-like fashion. We anticipate that the G. barbadense genome sequence will advance our understanding of the mechanism of genome polyploidization and underpin genome-wide comparison research in this genus.
|
40
|
Bast J, Schaefer I, Schwander T, Maraun M, Scheu S, Kraaijeveld K. No Accumulation of Transposable Elements in Asexual Arthropods. Mol Biol Evol 2015; 33:697-706. [PMID: 26560353 PMCID: PMC4760076 DOI: 10.1093/molbev/msv261] [Citation(s) in RCA: 53] [Impact Index Per Article: 5.9] [Indexed: 12/15/2022]
Abstract
Transposable elements (TEs) and other repetitive DNA can accumulate in the absence of recombination, a process contributing to the degeneration of Y-chromosomes and other nonrecombining genome portions. A similar accumulation of repetitive DNA is expected for asexually reproducing species, given their entire genome is effectively nonrecombining. We tested this expectation by comparing the whole-genome TE loads of five asexual arthropod lineages and their sexual relatives, including asexual and sexual lineages of crustaceans (Daphnia water fleas), insects (Leptopilina wasps), and mites (Oribatida). Surprisingly, there was no evidence for increased TE load in genomes of asexual as compared to sexual lineages, neither for all classes of repetitive elements combined nor for specific TE families. Our study therefore suggests that nonrecombining genomes do not accumulate TEs like nonrecombining genomic regions of sexual lineages. Even if a slight but undetected increase of TEs were caused by asexual reproduction, it appears to be negligible compared to variance between species caused by processes unrelated to reproductive mode. It remains to be determined if molecular mechanisms underlying genome regulation in asexuals hamper TE activity. Alternatively, the differences in TE dynamics between nonrecombining genomes in asexual lineages versus nonrecombining genome portions in sexual species might stem from selection for benign TEs in asexual lineages because of the lack of genetic conflict between TEs and their hosts and/or because asexual lineages may only arise from sexual ancestors with particularly low TE loads.
Affiliation(s)
- Jens Bast
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
| | - Ina Schaefer
- J.F. Blumenbach Institute of Zoology and Anthropology, Georg August University Goettingen, Goettingen, Germany
| | - Tanja Schwander
- Department of Ecology and Evolution, University of Lausanne, Lausanne, Switzerland
| | - Mark Maraun
- J.F. Blumenbach Institute of Zoology and Anthropology, Georg August University Goettingen, Goettingen, Germany
| | - Stefan Scheu
- J.F. Blumenbach Institute of Zoology and Anthropology, Georg August University Goettingen, Goettingen, Germany
| | - Ken Kraaijeveld
- Department of Ecological Science, VU University Amsterdam, Amsterdam, The Netherlands; Leiden Genome Technology Center, Department of Human Genetics, Leiden University Medical Center, Leiden, The Netherlands
| |
|
41
|
Hoen DR, Hickey G, Bourque G, Casacuberta J, Cordaux R, Feschotte C, Fiston-Lavier AS, Hua-Van A, Hubley R, Kapusta A, Lerat E, Maumus F, Pollock DD, Quesneville H, Smit A, Wheeler TJ, Bureau TE, Blanchette M. A call for benchmarking transposable element annotation methods. Mob DNA 2015; 6:13. [PMID: 26244060 PMCID: PMC4524446 DOI: 10.1186/s13100-015-0044-6] [Citation(s) in RCA: 65] [Impact Index Per Article: 7.2] [Received: 06/25/2015] [Accepted: 07/22/2015] [Indexed: 12/31/2022]
Abstract
DNA derived from transposable elements (TEs) constitutes large parts of the genomes of complex eukaryotes, with major impacts not only on genomic research but also on how organisms evolve and function. Although a variety of methods and tools have been developed to detect and annotate TEs, there are as yet no standard benchmarks, that is, no standard way to measure or compare their accuracy. This lack of accuracy assessment calls into question conclusions from a wide range of research that depends explicitly or implicitly on TE annotation. In the absence of standard benchmarks, toolmakers are impeded in improving their tools, annotators cannot properly assess which tools might best suit their needs, and downstream researchers cannot judge how accuracy limitations might impact their studies. We therefore propose that the TE research community create and adopt standard TE annotation benchmarks, and we call for other researchers to join the authors in making this long-overdue effort a success.
Affiliation(s)
- Douglas R Hoen
- School of Computer Science, McGill University, McConnell Engineering Bldg., Rm. 318, 3480 Rue University, Montréal, Québec H3A 0E9 Canada; Department of Biology, McGill University, Stewart Biology Bldg., 1205 Ave. du Docteur-Penfield, Montréal, Québec H3A 1B1 Canada
| | - Glenn Hickey
- School of Computer Science, McGill University, McConnell Engineering Bldg., Rm. 318, 3480 Rue University, Montréal, Québec H3A 0E9 Canada; McGill Centre for Bioinformatics, McGill University, Montréal, Québec Canada
| | - Guillaume Bourque
- Department of Human Genetics, McGill University, Montréal, Québec Canada; McGill University and Génome Québec Innovation Center, Montréal, Québec Canada
| | - Josep Casacuberta
- Centre for Research in Agricultural Genomics CSIC-IRTA-UAB-UB, 08193 Barcelona, Spain
| | - Richard Cordaux
- Université de Poitiers, UMR CNRS 7267 Ecologie et Biologie des Interactions, Equipe Ecologie Evolution Symbiose, 5 Rue Albert Turpin, 86073 Poitiers Cedex 9, France
| | - Cédric Feschotte
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112 USA
| | - Anna-Sophie Fiston-Lavier
- Institut des Sciences de l'Evolution de Montpellier (ISE-M), Equipe Evolution, Vecteurs, Adaptation et Symbiose, UMR5554 CNRS-Université Montpellier, Montpellier, 34090 cedex 05 France
| | - Aurélie Hua-Van
- Laboratoire Evolution, Génomes, Comportement Ecologie, CNRS-Université Paris-Sud (UMR 9191)-IRD (UMR 247)-Université Paris-Saclay, F-91198 Gif-sur-Yvette, France
| | - Robert Hubley
- Institute for Systems Biology, 401 Terry Ave. N, Seattle, WA 98109 USA
| | - Aurélie Kapusta
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112 USA
| | - Emmanuelle Lerat
- Laboratoire Biometrie et Biologie Evolutive, Universite Claude Bernard-Lyon 1, UMR-CNRS 5558-Bat. Mendel, 43 bd du 11 novembre 1918, 69622 Villeurbanne cedex, France
| | - Florian Maumus
- INRA, UR1164 URGI-Research Unit in Genomics-Info, INRA de Versailles-Grignon, Route de Saint-Cyr, Versailles, 78026 France
| | - David D Pollock
- University of Colorado School of Medicine, Aurora, CO 80045 USA
| | - Hadi Quesneville
- INRA, UR1164 URGI-Research Unit in Genomics-Info, INRA de Versailles-Grignon, Route de Saint-Cyr, Versailles, 78026 France
| | - Arian Smit
- Institute for Systems Biology, 401 Terry Ave. N, Seattle, WA 98109 USA
| | - Travis J Wheeler
- Department of Computer Science, University of Montana, Missoula, MT 59812 USA
| | - Thomas E Bureau
- Department of Biology, McGill University, Stewart Biology Bldg., 1205 Ave. du Docteur-Penfield, Montréal, Québec H3A 1B1 Canada
| | - Mathieu Blanchette
- School of Computer Science, McGill University, McConnell Engineering Bldg., Rm. 318, 3480 Rue University, Montréal, Québec H3A 0E9 Canada; McGill Centre for Bioinformatics, McGill University, Montréal, Québec Canada
| |
|
42
|
Bao W, Kojima KK, Kohany O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mob DNA 2015; 6:11. [PMID: 26045719 PMCID: PMC4455052 DOI: 10.1186/s13100-015-0041-9] [Citation(s) in RCA: 1699] [Impact Index Per Article: 188.8] [Received: 02/24/2015] [Accepted: 04/17/2015] [Indexed: 02/08/2023]
Abstract
Repbase Update (RU) is a database of representative repeat sequences in eukaryotic genomes. Since its first development as a database of human repetitive sequences in 1992, RU has been serving as a well-curated reference database fundamental for almost all eukaryotic genome sequence analyses. Here, we introduce recent updates of RU, focusing on technical issues concerning the submission and updating of Repbase entries, and give short examples of using RU data. RU sincerely invites a broader submission of repeat sequences from the research community.
Affiliation(s)
- Weidong Bao
- Genetic Information Research Institute, 5150 El Camino Real, Ste B-30, Los Altos, CA 94022 USA
| | - Kenji K Kojima
- Genetic Information Research Institute, 5150 El Camino Real, Ste B-30, Los Altos, CA 94022 USA; Department of Computational Biology and Medical Sciences, Graduate School of Frontier Sciences, University of Tokyo, Minato-ku, Tokyo Japan; Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai Minato-ku, Tokyo, 108-8639 Japan
| | - Oleksiy Kohany
- Genetic Information Research Institute, 5150 El Camino Real, Ste B-30, Los Altos, CA 94022 USA
| |
|
43
|
Ji Y, Marra NJ, DeWoody JA. Comparative analysis of active retrotransposons in the transcriptomes of three species of heteromyid rodents. Gene 2015; 562:95-106. [DOI: 10.1016/j.gene.2015.02.058] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/17/2014] [Revised: 02/16/2015] [Accepted: 02/17/2015] [Indexed: 10/24/2022]
|
44
|
Maumus F, Fiston-Lavier AS, Quesneville H. Impact of transposable elements on insect genomes and biology. Curr Opin Insect Sci 2015; 7:30-36. [PMID: 32846669 DOI: 10.1016/j.cois.2015.01.001] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 12/03/2014] [Revised: 12/30/2014] [Accepted: 01/06/2015] [Indexed: 06/11/2023]
Affiliation(s)
- Florian Maumus
- Unité de recherche en Génomique-Info (URGI), UR1164, INRA, RD10 route de Saint Cyr, 78026 Versailles, France.
| | - Anna-Sophie Fiston-Lavier
- Institut des Sciences de l'Evolution de Montpellier (ISEM), UMR5554 CNRS-Université Montpellier II, 2 place Eugene Bataillon, bat. 22, CC065 34095 Montpellier Cedex 05, France
| | - Hadi Quesneville
- Unité de recherche en Génomique-Info (URGI), UR1164, INRA, RD10 route de Saint Cyr, 78026 Versailles, France
| |
|
45
|
Green RE, Braun EL, Armstrong J, Earl D, Nguyen N, Hickey G, Vandewege MW, St John JA, Capella-Gutiérrez S, Castoe TA, Kern C, Fujita MK, Opazo JC, Jurka J, Kojima KK, Caballero J, Hubley RM, Smit AF, Platt RN, Lavoie CA, Ramakodi MP, Finger JW, Suh A, Isberg SR, Miles L, Chong AY, Jaratlerdsiri W, Gongora J, Moran C, Iriarte A, McCormack J, Burgess SC, Edwards SV, Lyons E, Williams C, Breen M, Howard JT, Gresham CR, Peterson DG, Schmitz J, Pollock DD, Haussler D, Triplett EW, Zhang G, Irie N, Jarvis ED, Brochu CA, Schmidt CJ, McCarthy FM, Faircloth BC, Hoffmann FG, Glenn TC, Gabaldón T, Paten B, Ray DA. Three crocodilian genomes reveal ancestral patterns of evolution among archosaurs. Science 2014; 346:1254449. [PMID: 25504731 PMCID: PMC4386873 DOI: 10.1126/science.1254449] [Citation(s) in RCA: 230] [Impact Index Per Article: 23.0] [Indexed: 12/28/2022]
Abstract
To provide context for the diversification of archosaurs (the group that includes crocodilians, dinosaurs, and birds), we generated draft genomes of three crocodilians: Alligator mississippiensis (the American alligator), Crocodylus porosus (the saltwater crocodile), and Gavialis gangeticus (the Indian gharial). We observed an exceptionally slow rate of genome evolution within crocodilians at all levels, including nucleotide substitutions, indels, transposable element content and movement, gene family evolution, and chromosomal synteny. When placed within the context of related taxa including birds and turtles, this suggests that the common ancestor of all of these taxa also exhibited slow genome evolution and that the comparatively rapid evolution is derived in birds. The data also provided the opportunity to analyze heterozygosity in crocodilians, which indicates a likely reduction in population size for all three taxa through the Pleistocene. Finally, these data combined with newly published bird genomes allowed us to reconstruct the partial genome of the common ancestor of archosaurs, thereby providing a tool to investigate the genetic starting material of crocodilians, birds, and dinosaurs.
Affiliation(s)
- Richard E Green
  - Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA.
- Edward L Braun
  - Department of Biology and Genetics Institute, University of Florida, Gainesville, FL 32611, USA
- Joel Armstrong
  - Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA. Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA 95064, USA
- Dent Earl
  - Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA. Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA 95064, USA
- Ngan Nguyen
  - Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA. Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA 95064, USA
- Glenn Hickey
  - Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA. Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA 95064, USA
- Michael W Vandewege
  - Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA
- John A St John
  - Department of Biomolecular Engineering, University of California, Santa Cruz, CA 95064, USA
- Salvador Capella-Gutiérrez
  - Bioinformatics and Genomics Programme, Centre for Genomic Regulation, 08003 Barcelona, Spain. Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Todd A Castoe
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO 80045, USA. Department of Biology, University of Texas, Arlington, TX 76019, USA
- Colin Kern
  - Department of Computer and Information Sciences, University of Delaware, Newark, DE 19717, USA
- Matthew K Fujita
  - Department of Biology, University of Texas, Arlington, TX 76019, USA
- Juan C Opazo
  - Instituto de Ciencias Ambientales y Evolutivas, Facultad de Ciencias, Universidad Austral de Chile, Valdivia, Chile
- Jerzy Jurka
  - Genetic Information Research Institute, Mountain View, CA 94043, USA
- Kenji K Kojima
  - Genetic Information Research Institute, Mountain View, CA 94043, USA
- Arian F Smit
  - Institute for Systems Biology, Seattle, WA 98109, USA
- Roy N Platt
  - Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA. Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA
- Christine A Lavoie
  - Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA
- Meganathan P Ramakodi
  - Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA. Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA
- John W Finger
  - Department of Environmental Health Science, University of Georgia, Athens, GA 30602, USA
- Alexander Suh
  - Institute of Experimental Pathology (ZMBE), University of Münster, D-48149 Münster, Germany. Department of Evolutionary Biology (EBC), Uppsala University, SE-752 36 Uppsala, Sweden
- Sally R Isberg
  - Porosus Pty. Ltd., Palmerston, NT 0831, Australia. Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia. Centre for Crocodile Research, Noonamah, NT 0837, Australia
- Lee Miles
  - Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia
- Amanda Y Chong
  - Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia
- Jaime Gongora
  - Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia
- Christopher Moran
  - Faculty of Veterinary Science, University of Sydney, Sydney, NSW 2006, Australia
- Andrés Iriarte
  - Departamento de Desarrollo Biotecnológico, Instituto de Higiene, Facultad de Medicina, Universidad de la República, Montevideo, Uruguay
- John McCormack
  - Moore Laboratory of Zoology, Occidental College, Los Angeles, CA 90041, USA
- Shane C Burgess
  - College of Agriculture and Life Sciences, University of Arizona, Tucson, AZ 85721, USA
- Scott V Edwards
  - Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA
- Eric Lyons
  - School of Plant Sciences, University of Arizona, Tucson, AZ 85721, USA
- Christina Williams
  - Department of Molecular Biomedical Sciences, North Carolina State University, Raleigh, NC 27607, USA
- Matthew Breen
  - Department of Molecular Biomedical Sciences, North Carolina State University, Raleigh, NC 27607, USA
- Jason T Howard
  - Howard Hughes Medical Institute, Department of Neurobiology, Duke University Medical Center, Durham, NC 27710, USA
- Cathy R Gresham
  - Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA
- Daniel G Peterson
  - Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA. Department of Plant and Soil Sciences, Mississippi State University, Mississippi State, MS 39762, USA
- Jürgen Schmitz
  - Institute of Experimental Pathology (ZMBE), University of Münster, D-48149 Münster, Germany
- David D Pollock
  - Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, CO 80045, USA
- David Haussler
  - Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA 95064, USA. Howard Hughes Medical Institute, Bethesda, MD 20814, USA
- Eric W Triplett
  - Department of Microbiology and Cell Science, University of Florida, Gainesville, FL 32611, USA
- Guojie Zhang
  - China National GeneBank, BGI-Shenzhen, Shenzhen, China. Center for Social Evolution, Department of Biology, University of Copenhagen, Copenhagen, Denmark
- Naoki Irie
  - Department of Biological Sciences, Graduate School of Science, University of Tokyo, Tokyo, Japan
- Erich D Jarvis
  - Howard Hughes Medical Institute, Department of Neurobiology, Duke University Medical Center, Durham, NC 27710, USA
- Christopher A Brochu
  - Department of Earth and Environmental Sciences, University of Iowa, Iowa City, IA 52242, USA
- Carl J Schmidt
  - Department of Animal and Food Sciences, University of Delaware, Newark, DE 19717, USA
- Fiona M McCarthy
  - School of Animal and Comparative Biomedical Sciences, University of Arizona, Tucson, AZ 85721, USA
- Brant C Faircloth
  - Department of Ecology and Evolutionary Biology, University of California, Los Angeles, CA 90019, USA. Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
- Federico G Hoffmann
  - Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA. Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA
- Travis C Glenn
  - Department of Environmental Health Science, University of Georgia, Athens, GA 30602, USA
- Toni Gabaldón
  - Bioinformatics and Genomics Programme, Centre for Genomic Regulation, 08003 Barcelona, Spain. Universitat Pompeu Fabra, 08003 Barcelona, Spain. Institució Catalana de Recerca i Estudis Avançats, 08010 Barcelona, Spain
- Benedict Paten
  - Center for Biomolecular Science and Engineering, University of California, Santa Cruz, CA 95064, USA
- David A Ray
  - Department of Biochemistry, Molecular Biology, Entomology and Plant Pathology, Mississippi State University, Mississippi State, MS 39762, USA. Institute for Genomics, Biocomputing and Biotechnology, Mississippi State University, Mississippi State, MS 39762, USA. Department of Biological Sciences, Texas Tech University, Lubbock, TX 79409, USA.
46. Castanera R, Pérez G, López L, Sancho R, Santoyo F, Alfaro M, Gabaldón T, Pisabarro AG, Oguiza JA, Ramírez L. Highly expressed captured genes and cross-kingdom domains present in Helitrons create novel diversity in Pleurotus ostreatus and other fungi. BMC Genomics 2014; 15:1071. [PMID: 25480150 PMCID: PMC4289320 DOI: 10.1186/1471-2164-15-1071] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 07/25/2014] [Accepted: 11/14/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Helitrons are class-II eukaryotic transposons that transpose via a rolling circle mechanism. Due to their ability to capture and mobilize gene fragments, they play an important role in the evolution of their host genomes. We have used a bioinformatics approach for the identification of helitrons in two Pleurotus ostreatus genomes using de novo detection and homology-based searching. We have analyzed the presence of helitron-captured genes as well as the expansion of helitron-specific helicases in fungi and performed a phylogenetic analysis of their conserved domains with other representative eukaryotic species. RESULTS: Our results show the presence of two helitron families in P. ostreatus that disrupt gene colinearity and cause a lack of synteny between their genomes. Both putative autonomous and non-autonomous helitrons were transcriptionally active, and some of them carried highly expressed captured genes of unknown origin and function. In addition, both families contained eukaryotic, bacterial and viral domains within the helitron's boundaries. A phylogenetic reconstruction of RepHel helicases using the Helitron-like and PIF1-like helicase conserved domains revealed a polyphyletic origin for eukaryotic helitrons. CONCLUSION: P. ostreatus helitrons display features similar to other eukaryotic helitrons and do not tend to capture host genes or gene fragments. The occurrence of genes probably captured from other hosts inside the helitron boundaries poses the hypothesis that an ancient horizontal transfer mechanism could have taken place. The viral domains found in some of these genes and the polyphyletic origin of RepHel helicases in the eukaryotic kingdom suggest that viruses could have played a role in a putative lateral transfer of helitrons within the eukaryotic kingdom. The high similarity of some helitrons, along with the transcriptional activity of their RepHel helicases, indicates that these elements are still active in the genome of P. ostreatus.
Affiliation(s)
- Lucía Ramírez
  - Department of Agrarian Production, Genetics and Microbiology Research Group, Public University of Navarre, 31006 Pamplona, Navarre, Spain.
47. Thomas J, Phillips CD, Baker RJ, Pritham EJ. Rolling-circle transposons catalyze genomic innovation in a mammalian lineage. Genome Biol Evol 2014; 6:2595-610. [PMID: 25223768 PMCID: PMC4224331 DOI: 10.1093/gbe/evu204] [Citation(s) in RCA: 40] [Impact Index Per Article: 4.0] [Indexed: 12/14/2022] Open
Abstract
Rolling-circle transposons (Helitrons) are a newly discovered group of mobile DNA widespread in plant and invertebrate genomes but limited to the bat family Vespertilionidae among mammals. Little is known about the long-term impact of Helitron activity because the genomes where Helitron activity has been extensively studied are predominated by young families. Here, we report a comprehensive catalog of vetted Helitrons from the 7× Myotis lucifugus genome assembly. To estimate the timing of transposition, we scored presence/absence across related vespertilionid genome sequences with estimated divergence times. This analysis revealed that the Helibat family has been a persistent source of genomic innovation throughout the vespertilionid diversification from approximately 30–36 Ma to as recently as approximately 1.8–6 Ma. This is the first report of persistent Helitron transposition over an extended evolutionary timeframe. These findings illustrate that the pattern of Helitron activity is akin to the vertical persistence of LINE retrotransposons in primates and other mammalian lineages. Like retrotransposition in primates, rolling-circle transposition has generated lineage-specific variation and accounts for approximately 110 Mb, approximately 6% of the genome of M. lucifugus. The Helitrons carry a heterogeneous assortment of host sequence including retroposed messenger RNAs, retrotransposons, DNA transposons, as well as introns, exons and regulatory regions (promoters, 5′-untranslated regions [UTRs], and 3′-UTRs) of which some are evolving in a pattern suggestive of purifying selection. Evidence that Helitrons have contributed putative promoters, exons, splice sites, polyadenylation sites, and microRNA-binding sites to transcripts otherwise conserved across mammals is presented, and the implication of Helitron activity to innovation in these unique mammals is discussed.
Affiliation(s)
- Jainy Thomas
  - Department of Human Genetics, University of Utah
- Caleb D Phillips
  - Department of Biological Sciences and Museum, Texas Tech University
- Robert J Baker
  - Department of Biological Sciences and Museum, Texas Tech University
48. Ahmed S, Cock JM, Pessia E, Luthringer R, Cormier A, Robuchon M, Sterck L, Peters AF, Dittami SM, Corre E, Valero M, Aury JM, Roze D, Van de Peer Y, Bothwell J, Marais GAB, Coelho SM. A haploid system of sex determination in the brown alga Ectocarpus sp. Curr Biol 2014; 24:1945-57. [PMID: 25176635 DOI: 10.1016/j.cub.2014.07.042] [Citation(s) in RCA: 88] [Impact Index Per Article: 8.8] [Received: 10/11/2013] [Revised: 02/11/2014] [Accepted: 07/15/2014] [Indexed: 11/15/2022]
Abstract
BACKGROUND: A common feature of most genetic sex-determination systems studied so far is that sex is determined by nonrecombining genomic regions, which can be of various sizes depending on the species. These regions have evolved independently and repeatedly across diverse groups. A number of such sex-determining regions (SDRs) have been studied in animals, plants, and fungi, but very little is known about the evolution of sexes in other eukaryotic lineages. RESULTS: We report here the sequencing and genomic analysis of the SDR of Ectocarpus, a brown alga that has been evolving independently from plants, animals, and fungi for over one giga-annum. In Ectocarpus, sex is expressed during the haploid phase of the life cycle, and both the female (U) and the male (V) sex chromosomes contain nonrecombining regions. The U and V of this species have been diverging for more than 70 mega-annum, yet gene degeneration has been modest, and the SDR is relatively small, with no evidence for evolutionary strata. These features may be explained by the occurrence of strong purifying selection during the haploid phase of the life cycle and the low level of sexual dimorphism. V is dominant over U, suggesting that femaleness may be the default state, adopted when the male haplotype is absent. CONCLUSIONS: The Ectocarpus UV system has clearly had a distinct evolutionary trajectory not only to the well-studied XY and ZW systems but also to the UV systems described so far. Nonetheless, some striking similarities exist, indicating remarkable universality of the underlying processes shaping sex chromosome evolution across distant lineages.
Affiliation(s)
- Sophia Ahmed
  - Integrative Biology of Marine Models, CNRS UMR 8227, Sorbonne Universités, UPMC Université Paris 6, Station Biologique de Roscoff, CS 90074, 29688 Roscoff, France; Medical Biology Centre, Queens University Belfast, Belfast BT9 7BL, Northern Ireland, UK
- J Mark Cock
  - Integrative Biology of Marine Models, CNRS UMR 8227, Sorbonne Universités, UPMC Université Paris 6, Station Biologique de Roscoff, CS 90074, 29688 Roscoff, France
- Eugenie Pessia
  - Laboratoire de Biométrie et Biologie Évolutive, UMR 5558, Centre National de la Recherche Scientifique, Université Lyon 1, 69622 Villeurbanne, France
- Remy Luthringer
  - Integrative Biology of Marine Models, CNRS UMR 8227, Sorbonne Universités, UPMC Université Paris 6, Station Biologique de Roscoff, CS 90074, 29688 Roscoff, France
- Alexandre Cormier
  - Integrative Biology of Marine Models, CNRS UMR 8227, Sorbonne Universités, UPMC Université Paris 6, Station Biologique de Roscoff, CS 90074, 29688 Roscoff, France
- Marine Robuchon
  - Integrative Biology of Marine Models, CNRS UMR 8227, Sorbonne Universités, UPMC Université Paris 6, Station Biologique de Roscoff, CS 90074, 29688 Roscoff, France; Evolutionary Biology and Ecology of Algae, CNRS UMI 3604, Sorbonne Université, UPMC, PUCCh, UACH, Station Biologique de Roscoff, CS 90074, 29688 Roscoff, France
- Lieven Sterck
  - Department of Plant Systems Biology (VIB) and Department of Plant Biotechnology and Bioinformatics (Ghent University), Technologiepark 927, 9052 Gent, Belgium
- Simon M Dittami
  - Integrative Biology of Marine Models, CNRS UMR 8227, Sorbonne Universités, UPMC Université Paris 6, Station Biologique de Roscoff, CS 90074, 29688 Roscoff, France
- Erwan Corre
  - ABiMS Platform, FR2424, Station Biologique de Roscoff, CS 90074, 29688 Roscoff, France
- Myriam Valero
  - Evolutionary Biology and Ecology of Algae, CNRS UMI 3604, Sorbonne Université, UPMC, PUCCh, UACH, Station Biologique de Roscoff, CS 90074, 29688 Roscoff, France
- Jean-Marc Aury
  - Commissariat à l'Energie Atomique (CEA), Institut de Génomique (IG), Genoscope, 91000 Evry, France
- Denis Roze
  - Evolutionary Biology and Ecology of Algae, CNRS UMI 3604, Sorbonne Université, UPMC, PUCCh, UACH, Station Biologique de Roscoff, CS 90074, 29688 Roscoff, France
- Yves Van de Peer
  - Department of Plant Systems Biology (VIB) and Department of Plant Biotechnology and Bioinformatics (Ghent University), Technologiepark 927, 9052 Gent, Belgium; Genomics Research Institute, University of Pretoria, Hatfield Campus, Pretoria 0028, South Africa
- John Bothwell
  - Medical Biology Centre, Queens University Belfast, Belfast BT9 7BL, Northern Ireland, UK
- Gabriel A B Marais
  - Laboratoire de Biométrie et Biologie Évolutive, UMR 5558, Centre National de la Recherche Scientifique, Université Lyon 1, 69622 Villeurbanne, France
- Susana M Coelho
  - Integrative Biology of Marine Models, CNRS UMR 8227, Sorbonne Universités, UPMC Université Paris 6, Station Biologique de Roscoff, CS 90074, 29688 Roscoff, France.
49. Dias GB, Svartman M, Delprat A, Ruiz A, Kuhn GCS. Tetris is a foldback transposon that provided the building blocks for an emerging satellite DNA of Drosophila virilis. Genome Biol Evol 2014; 6:1302-13. [PMID: 24858539 PMCID: PMC4079207 DOI: 10.1093/gbe/evu108] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.2] [Indexed: 12/28/2022] Open
Abstract
Transposable elements (TEs) and satellite DNAs (satDNAs) are abundant components of most eukaryotic genomes studied so far and their impact on evolution has been the focus of several studies. A number of studies linked TEs with satDNAs, but the nature of their evolutionary relationships remains unclear. During in silico analyses of the Drosophila virilis assembled genome, we found a novel DNA transposon we named Tetris based on its modular structure and diversity of rearranged forms. We aimed to characterize Tetris and investigate its role in generating satDNAs. Data mining and sequence analysis showed that Tetris is apparently nonautonomous, with a structure similar to foldback elements, and present in D. virilis and D. americana. Herein, we show that Tetris shares the final portions of its terminal inverted repeats (TIRs) with DAIBAM, a previously described miniature inverted transposable element implicated in the generation of chromosome inversions. Both elements are likely to be mobilized by the same autonomous TE. Tetris TIRs contain approximately 220-bp internal tandem repeats that we have named TIR-220. We also found TIR-220 repeats making up longer (kb-size) satDNA-like arrays. Using bioinformatic, phylogenetic and cytogenomic tools, we demonstrated that Tetris has contributed to shaping the genomes of D. virilis and D. americana, providing internal tandem repeats that served as building blocks for the amplification of satDNA arrays. The β-heterochromatic genomic environment seemed to have favored such amplification. Our results imply for the first time a role for foldback elements in generating satDNAs.
Affiliation(s)
- Guilherme B Dias
  - Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Marta Svartman
  - Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
- Alejandra Delprat
  - Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Bellaterra, Catalunya, Spain
- Alfredo Ruiz
  - Departament de Genètica i de Microbiologia, Universitat Autònoma de Barcelona, Bellaterra, Catalunya, Spain
- Gustavo C S Kuhn
  - Departamento de Biologia Geral, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil
50. Soria-Carrasco V, Gompert Z, Comeault AA, Farkas TE, Parchman TL, Johnston JS, Buerkle CA, Feder JL, Bast J, Schwander T, Egan SP, Crespi BJ, Nosil P. Stick insect genomes reveal natural selection's role in parallel speciation. Science 2014; 344:738-42. [PMID: 24833390 DOI: 10.1126/science.1252136] [Citation(s) in RCA: 294] [Impact Index Per Article: 29.4] [Indexed: 12/17/2022]
Abstract
Natural selection can drive the repeated evolution of reproductive isolation, but the genomic basis of parallel speciation remains poorly understood. We analyzed whole-genome divergence between replicate pairs of stick insect populations that are adapted to different host plants and undergoing parallel speciation. We found thousands of modest-sized genomic regions of accentuated divergence between populations, most of which are unique to individual population pairs. We also detected parallel genomic divergence across population pairs involving an excess of coding genes with specific molecular functions. Regions of parallel genomic divergence in nature exhibited exceptional allele frequency changes between hosts in a field transplant experiment. The results advance understanding of biological diversification by providing convergent observational and experimental evidence for selection's role in driving repeatable genomic divergence.
Affiliation(s)
- Víctor Soria-Carrasco
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, UK
- Aaron A Comeault
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, UK
- Timothy E Farkas
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, UK
- J Spencer Johnston
  - Department of Entomology, Texas A&M University, College Station, TX 77843, USA
- C Alex Buerkle
  - Department of Botany, University of Wyoming, Laramie, WY 82071, USA
- Jeffrey L Feder
  - Department of Biology, Notre Dame University, South Bend, IN 46556, USA
- Jens Bast
  - J. F. Blumenbach Institute of Zoology and Anthropology, University of Göttingen, 37073 Göttingen, Germany
- Tanja Schwander
  - Department of Ecology and Evolution, University of Lausanne, Lausanne CH-1015, Switzerland
- Scott P Egan
  - Department of Ecology and Evolutionary Biology, Rice University, Houston, TX 77005, USA
- Bernard J Crespi
  - Department of Biological Sciences, Simon Fraser University, Burnaby, BC V5A 1S6, Canada
- Patrik Nosil
  - Department of Animal and Plant Sciences, University of Sheffield, Sheffield S10 2TN, UK.