1
Asif-Laidin A, Casier K, Ziriat Z, Boivin A, Viodé E, Delmarre V, Ronsseray S, Carré C, Teysset L. Modeling early germline immunization after horizontal transfer of transposable elements reveals internal piRNA cluster heterogeneity. BMC Biol 2023; 21:117. [PMID: 37226160 DOI: 10.1186/s12915-023-01616-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Accepted: 05/05/2023] [Indexed: 05/26/2023] Open
Abstract
BACKGROUND A fraction of all genomes is composed of transposable elements (TEs) whose mobility needs to be carefully controlled. In gonads, TE activity is repressed by PIWI-interacting RNAs (piRNAs), a class of small RNAs synthesized by heterochromatic loci enriched in TE fragments, called piRNA clusters. Maintenance of active piRNA clusters across generations is secured by maternal piRNA inheritance providing the memory for TE repression. On rare occasions, genomes encounter horizontal transfer (HT) of new TEs with no piRNA targeting them, threatening the host genome integrity. Naïve genomes can eventually start to produce new piRNAs against these genomic invaders, but the timing of their emergence remains elusive. RESULTS Using a set of TE-derived transgenes inserted in different germline piRNA clusters and functional assays, we have modeled a TE HT in Drosophila melanogaster. We have found that the complete co-option of these transgenes by a germline piRNA cluster can occur within four generations associated with the production of new piRNAs all along the transgenes and the germline silencing of piRNA sensors. Synthesis of new transgenic TE piRNAs is linked to piRNA cluster transcription dependent on Moonshiner and heterochromatin mark deposition that propagates more efficiently on short sequences. Moreover, we found that sequences located within piRNA clusters can have different piRNA profiles and can influence transcript accumulation of nearby sequences. CONCLUSIONS Our study reveals that genetic and epigenetic properties, such as transcription, piRNA profiles, heterochromatin, and conversion efficiency along piRNA clusters, could be heterogeneous depending on the sequences that compose them. These findings suggest that the capacity of transcriptional signal erasure induced by the chromatin complex specific of the piRNA cluster can be incomplete through the piRNA cluster loci. 
Finally, these results have revealed an unexpected level of complexity that highlights a new magnitude of piRNA cluster plasticity fundamental for the maintenance of genome integrity.
Affiliation(s)
- Amna Asif-Laidin
  - Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire Biologie du Développement, UMR7622, "Transgenerational Epigenetics & Small RNA Biology", Paris, F-75005, France
- Karine Casier
  - Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire Biologie du Développement, UMR7622, "Transgenerational Epigenetics & Small RNA Biology", Paris, F-75005, France
  - Present Address: CNRS, Institut de Biologie Physico-Chimique, Laboratoire de Biologie Moléculaire et Cellulaire des Eucaryotes, UMR8226, Telomere Biology, Paris, F-75005, France
- Zoheir Ziriat
  - Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire Biologie du Développement, UMR7622, "Transgenerational Epigenetics & Small RNA Biology", Paris, F-75005, France
- Antoine Boivin
  - Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire Biologie du Développement, UMR7622, "Transgenerational Epigenetics & Small RNA Biology", Paris, F-75005, France
- Elise Viodé
  - Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire Biologie du Développement, UMR7622, "Transgenerational Epigenetics & Small RNA Biology", Paris, F-75005, France
- Valérie Delmarre
  - Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire Biologie du Développement, UMR7622, "Transgenerational Epigenetics & Small RNA Biology", Paris, F-75005, France
- Stéphane Ronsseray
  - Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire Biologie du Développement, UMR7622, "Transgenerational Epigenetics & Small RNA Biology", Paris, F-75005, France
- Clément Carré
  - Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire Biologie du Développement, UMR7622, "Transgenerational Epigenetics & Small RNA Biology", Paris, F-75005, France
- Laure Teysset
  - Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Laboratoire Biologie du Développement, UMR7622, "Transgenerational Epigenetics & Small RNA Biology", Paris, F-75005, France.
2
Yushkova E, Moskalev A. Transposable elements and their role in aging. Ageing Res Rev 2023; 86:101881. [PMID: 36773759 DOI: 10.1016/j.arr.2023.101881] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/17/2022] [Revised: 01/16/2023] [Accepted: 02/07/2023] [Indexed: 02/12/2023]
Abstract
Transposable elements (TEs) are an important part of eukaryotic genomes. The role of somatic transposition in aging, carcinogenesis, and other age-related diseases has been determined. This review discusses the fundamental properties of TEs and their complex interactions with cellular processes, which are crucial for understanding the diverse effects of their activity on the genetics and epigenetics of the organism. The interactions of TEs with recombination, replication, repair, and chromosomal regulation; the ability of TEs to maintain a balance between their own activity and repression, the involvement of TEs in the creation of new or alternative genes, the expression of coding/non-coding RNA, and the role in DNA damage and modification of regulatory networks are reviewed. The contribution of the derepressed TEs to age-dependent effects in individual cells/tissues in different organisms was assessed. Conflicting information about TE activity under stress as well as theories of aging mechanisms related to TEs is discussed. On the one hand, transposition activity in response to stressors can lead to organisms acquiring adaptive innovations of great importance for evolution at the population level. On the other hand, the TE expression can cause decreased longevity and stress tolerance at the individual level. The specific features of TE effects on aging processes in germline and soma and the ways of their regulation in cells are highlighted. Recent results considering somatic mutations in normal human and animal tissues are indicated, with the emphasis on their possible functional consequences. In the context of aging, the correlation between somatic TE activation and age-related changes in the number of proteins required for heterochromatin maintenance and longevity regulation was analyzed. 
One of the original features of this review is a discussion of not only effects based on the TEs insertions and the associated consequences for the germline cell dynamics and somatic genome, but also the differences between transposon- and retrotransposon-mediated structural genome changes and possible phenotypic characteristics associated with aging and various age-related pathologies. Based on the analysis of published data, a hypothesis about the influence of the species-specific features of number, composition, and distribution of TEs on aging dynamics of different animal genomes was formulated.
Affiliation(s)
- Elena Yushkova
  - Laboratory of Geroprotective and Radioprotective Technologies, Institute of Biology, Komi Science Center, Ural Branch, Russian Academy of Sciences, 28 Kommunisticheskaya st., 167982 Syktyvkar, Russian Federation
- Alexey Moskalev
  - Laboratory of Geroprotective and Radioprotective Technologies, Institute of Biology, Komi Science Center, Ural Branch, Russian Academy of Sciences, 28 Kommunisticheskaya st., 167982 Syktyvkar, Russian Federation; Laboratory of Genetics and Epigenetics of Aging, Russian Clinical Research Center for Gerontology, Pirogov Russian National Research Medical University, Moscow 129226, Russian Federation; Longaevus Technologies, London, UK.
3
Unravelling complex transposable elements surrounding blaGES-16 in a Pseudomonas aeruginosa ExoU strain. J Glob Antimicrob Resist 2022; 30:143-147. [PMID: 35447384 DOI: 10.1016/j.jgar.2022.04.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2021] [Revised: 03/16/2022] [Accepted: 04/11/2022] [Indexed: 11/22/2022] Open
Abstract
OBJECTIVES We characterised the complex surrounding regions of blaGES-16 in a Pseudomonas aeruginosa exoU+ strain (P-10.226) in Brazil. METHODS Species identification was performed by MALDI-TOF MS, and the antimicrobial susceptibility profile was determined by broth microdilution based on European Committee on Antimicrobial Susceptibility Testing (EUCAST) breakpoints. The whole genome sequencing (WGS) of P-10.226 strain was performed using both short-read paired-end sequencing on the Illumina MiSeq platform as well as the long-read Oxford Nanopore MinION. RESULTS WGS analysis showed that P-10.226 carried blaGES-16, which was found as a gene cassette inserted into a novel class I integron, In1992 (aadB-blaOXA-56-blaGES-16-aadB-aadA6c), whose 3'-CS was truncated by a nested transposable element, IS5564::ISPa157. The structure was even more complex since IS6100-ΔIS6100 structure and a TnAs2-like harbouring the operon merRTPADE was found downstream In1992. Fragments of TnAs3 harbouring 25-bp imperfect inverted repeats were identified bordering the intl1 of In1992 and also flanking IS6100-ΔIS6100, which might be genetic marks of its previous presence in the genome. Interestingly, In1992 also shows a distinct cassette array from In581 (blaGES-16-dfrA22-aacA27-aadA1), which was previously reported in Serratia marcescens strains recovered in Brazil. Finally, exoU gene, which encodes a potent cytotoxin of type III secretion systems (T3SS) effector proteins from P. aeruginosa and is associated to severe infections, was also detected. CONCLUSION We described the novel In1992 carrying blaGES-16 surrounded by complex transposition events in a XDR P. aeruginosa strain. The identification of many sets of direct repeats adjacent to TnAs3 fragments indicates a major past of transposition events that shaped the current genetic environment of In1992.
4
Orozco-Arias S, Candamil-Cortes MS, Jaimes PA, Valencia-Castrillon E, Tabares-Soto R, Isaza G, Guyot R. Automatic curation of LTR retrotransposon libraries from plant genomes through machine learning. J Integr Bioinform 2022; 19:jib-2021-0036. [PMID: 35822734 PMCID: PMC9521825 DOI: 10.1515/jib-2021-0036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2021] [Accepted: 06/10/2022] [Indexed: 11/19/2022] Open
Abstract
Transposable elements are mobile sequences that can move and insert themselves into chromosomes, activating under internal or external stimuli, giving the organism the ability to adapt to the environment. Annotating transposable elements in genomic data is currently considered a crucial task to understand key aspects of organisms such as phenotype variability, species evolution, and genome size, among others. Because of the way they replicate, LTR retrotransposons are the most common transposable elements in plants, accounting in some cases for up to 80% of all DNA information. To annotate these elements, a reference library is usually created, a curation process is performed, eliminating TE fragments and false positives and then annotated in the genome using the homology method. However, the curation process can take weeks, requires extensive manual work and the execution of multiple time-consuming bioinformatics software. Here, we propose a machine learning-based approach to perform this process automatically on plant genomes, obtaining up to 91.18% F1-score. This approach was tested with four plant species, obtaining up to 93.6% F1-score (Oryza granulata) in only 22.61 s, where bioinformatics methods took approximately 6 h. This acceleration demonstrates that the ML-based approach is efficient and could be used in massive sequencing projects.
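For readers unfamiliar with the metric quoted in this abstract, the F1-score is the harmonic mean of precision and recall. The sketch below is illustrative only: the function name and the example counts are assumptions and are not taken from the paper.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Return the F1-score (harmonic mean of precision and recall).

    Hypothetical helper for illustration; the counts would come from
    comparing predicted labels against a curated reference library.
    """
    if true_pos == 0:
        # No correct positive predictions: precision and recall are both 0.
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Made-up counts: 8 correct predictions, 2 false positives, 2 misses.
example = f1_score(8, 2, 2)
```

With these made-up counts, precision and recall are both 0.8, so the F1-score is also 0.8.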
Affiliation(s)
- Simon Orozco-Arias
  - Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Colombia; Department of Systems and Informatics, Universidad de Caldas, Manizales, Colombia
- Paula A Jaimes
  - Department of Computer Science, Universidad Autónoma de Manizales, Manizales, Colombia
- Reinel Tabares-Soto
  - Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Colombia
- Gustavo Isaza
  - Department of Systems and Informatics, Universidad de Caldas, Manizales, Colombia
- Romain Guyot
  - Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Colombia; Institut de Recherche pour le Développement, CIRAD, Univ. Montpellier, Montpellier, France
5
Rodriguez M, Makałowski W. Software evaluation for de novo detection of transposons. Mob DNA 2022; 13:14. [PMID: 35477485 PMCID: PMC9047281 DOI: 10.1186/s13100-022-00266-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 12/18/2021] [Accepted: 03/16/2022] [Indexed: 11/16/2022] Open
Abstract
Transposable elements (TEs) are major genomic components in most eukaryotic genomes and play an important role in genome evolution. However, despite their relevance the identification of TEs is not an easy task and a number of tools were developed to tackle this problem. To better understand how they perform, we tested several widely used tools for de novo TE detection and compared their performance on both simulated data and well curated genomic sequences. As expected, tools that build TE-models performed better than k-mer counting ones, with RepeatModeler beating competitors in most datasets. However, there is a tendency for most tools to identify TE-regions in a fragmented manner and it is also frequent that small TEs or fragmented TEs are not detected. Consequently, the identification of TEs is still a challenging endeavor and it requires a significant manual curation by an experienced expert. The results will be helpful for identifying common issues associated with TE-annotation and for evaluating how comparable are the results obtained with different tools.
Affiliation(s)
- Matias Rodriguez
  - Institute of Bioinformatics, Faculty of Medicine, University of Münster, 48149, Münster, Germany
- Wojciech Makałowski
  - Institute of Bioinformatics, Faculty of Medicine, University of Münster, 48149, Münster, Germany.
6
A large transposable element mediates metal resistance in the fungus Paecilomyces variotii. Curr Biol 2022; 32:937-950.e5. [PMID: 35063120 DOI: 10.1016/j.cub.2021.12.048] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 03/31/2020] [Revised: 08/11/2021] [Accepted: 12/17/2021] [Indexed: 12/19/2022]
Abstract
The horizontal transfer of large gene clusters by mobile elements is a key driver of prokaryotic adaptation in response to environmental stresses. Eukaryotic microbes face similar stresses; however, a parallel role for mobile elements has not been established. A stress faced by many microorganisms is toxic metal ions in their environment. In fungi, identified mechanisms for protection against metals generally rely on genes that are dispersed within an organism's genome. Here, we discover a large (∼85 kb) region that confers tolerance to five metal/metalloid ions (arsenate, cadmium, copper, lead, and zinc) in the genomes of some, but not all, strains of a fungus, Paecilomyces variotii. We name this region HEPHAESTUS (Hφ) and present evidence that it is mobile within the P. variotii genome with features characteristic of a transposable element. HEPHAESTUS contains the greatest complement of host-beneficial genes carried by a transposable element in eukaryotes, suggesting that eukaryotic transposable elements might play a role analogous to bacteria in the horizontal transfer of large regions of host-beneficial DNA. Genes within HEPHAESTUS responsible for individual metal tolerances include those encoding a P-type ATPase transporter-PcaA-required for cadmium and lead tolerance, a transporter-ZrcA-providing tolerance to zinc, and a multicopper oxidase-McoA-conferring tolerance to copper. In addition, a subregion of Hφ confers tolerance to arsenate. The genome sequences of other fungi in the Eurotiales contain further examples of HEPHAESTUS, suggesting that it is responsible for independently assembling tolerance to a diverse array of ions, including chromium, mercury, and sodium.
7
Lexa M, Jedlicka P, Vanat I, Cervenansky M, Kejnovsky E. TE-greedy-nester: structure-based detection of LTR retrotransposons and their nesting. Bioinformatics 2020; 36:4991-4999. [PMID: 32663247 PMCID: PMC7755421 DOI: 10.1093/bioinformatics/btaa632] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/11/2019] [Revised: 06/08/2020] [Accepted: 07/07/2020] [Indexed: 11/23/2022] Open
Abstract
Motivation Transposable elements (TEs) in eukaryotes often get inserted into one another, forming sequences that become a complex mixture of full-length elements and their fragments. The reconstruction of full-length elements and the order in which they have been inserted is important for genome and transposon evolution studies. However, the accumulation of mutations and genome rearrangements over evolutionary time makes this process error-prone and decreases the efficiency of software aiming to recover all nested full-length TEs. Results We created software that uses a greedy recursive algorithm to mine increasingly fragmented copies of full-length LTR retrotransposons in assembled genomes and other sequence data. The software called TE-greedy-nester considers not only sequence similarity but also the structure of elements. This new tool was tested on a set of natural and synthetic sequences and its accuracy was compared to similar software. We found TE-greedy-nester to be superior in a number of parameters, namely computation time and full-length TE recovery in highly nested regions. Availability and implementation http://gitlab.fi.muni.cz/lexa/nested. Supplementary information Supplementary data are available at Bioinformatics online.
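As a rough illustration of the greedy recursive idea mentioned in this abstract (find an intact element, excise it, rejoin the flanks, and search again so that interrupted copies become visible), the toy sketch below uses exact string matching. It is a simplification under stated assumptions, not the TE-greedy-nester implementation, which according to the abstract also considers sequence similarity and element structure. The function name and the example strings are hypothetical.

```python
def excise_nested(seq: str, element: str) -> tuple[list[int], str]:
    """Greedily remove intact copies of `element` from `seq`, recursively.

    Returns the positions at which copies were found (each position is
    relative to the sequence as it stood at that step, youngest insertion
    first) and the sequence that remains once no intact copy is left.
    Toy model: real elements diverge, so exact matching is an assumption.
    """
    pos = seq.find(element)
    if pos == -1:
        # Base case: no intact copy remains.
        return [], seq
    # Excise the intact copy and rejoin the flanks; a copy that had been
    # split by this insertion may now become intact and detectable.
    joined = seq[:pos] + seq[pos + len(element):]
    later_positions, remainder = excise_nested(joined, element)
    return [pos] + later_positions, remainder


# "<abc>" stands in for a full-length element; one copy sits inside another.
positions, remainder = excise_nested("xx<ab<abc>c>yy", "<abc>")
```

In this example the inner copy is found first, and removing it restores the outer copy, which is then found on the next recursive call.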
Affiliation(s)
- Matej Lexa
  - Department of Plant Developmental Genetics, Institute of Biophysics of the Czech Academy of Sciences, 61200 Brno, Czech Republic; Department of Machine Learning and Data Processing, Faculty of Informatics, Masaryk University, 60200 Brno, Czech Republic
- Pavel Jedlicka
  - Department of Plant Developmental Genetics, Institute of Biophysics of the Czech Academy of Sciences, 61200 Brno, Czech Republic
- Ivan Vanat
  - Department of Machine Learning and Data Processing, Faculty of Informatics, Masaryk University, 60200 Brno, Czech Republic
- Michal Cervenansky
  - Department of Plant Developmental Genetics, Institute of Biophysics of the Czech Academy of Sciences, 61200 Brno, Czech Republic; Department of Machine Learning and Data Processing, Faculty of Informatics, Masaryk University, 60200 Brno, Czech Republic
- Eduard Kejnovsky
  - Department of Plant Developmental Genetics, Institute of Biophysics of the Czech Academy of Sciences, 61200 Brno, Czech Republic
8
Co-option of the lineage-specific LAVA retrotransposon in the gibbon genome. Proc Natl Acad Sci U S A 2020; 117:19328-19338. [PMID: 32690705 DOI: 10.1073/pnas.2006038117] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 01/11/2023] Open
Abstract
Co-option of transposable elements (TEs) to become part of existing or new enhancers is an important mechanism for evolution of gene regulation. However, contributions of lineage-specific TE insertions to recent regulatory adaptations remain poorly understood. Gibbons present a suitable model to study these contributions as they have evolved a lineage-specific TE called LAVA (LINE-AluSz-VNTR-Alu LIKE), which is still active in the gibbon genome. The LAVA retrotransposon is thought to have played a role in the emergence of the highly rearranged structure of the gibbon genome by disrupting transcription of cell cycle genes. In this study, we investigated whether LAVA may have also contributed to the evolution of gene regulation by adopting enhancer function. We characterized fixed and polymorphic LAVA insertions across multiple gibbons and found 96 LAVA elements overlapping enhancer chromatin states. Moreover, LAVA was enriched in multiple transcription factor binding motifs, was bound by an important transcription factor (PU.1), and was associated with higher levels of gene expression in cis. We found gibbon-specific signatures of purifying/positive selection at 27 LAVA insertions. Two of these insertions were fixed in the gibbon lineage and overlapped with enhancer chromatin states, representing putative co-opted LAVA enhancers. These putative enhancers were located within genes encoding SETD2 and RAD9A, two proteins that facilitate accurate repair of DNA double-strand breaks and prevent chromosomal rearrangement mutations. Co-option of LAVA in these genes may have influenced regulation of processes that preserve genome integrity. Our findings highlight the importance of considering lineage-specific TEs in studying evolution of gene regulatory elements.
9
Kögler A, Seibt KM, Heitkam T, Morgenstern K, Reiche B, Brückner M, Wolf H, Krabel D, Schmidt T. Divergence of 3' ends as a driver of short interspersed nuclear element (SINE) evolution in the Salicaceae. The Plant Journal: For Cell and Molecular Biology 2020; 103:443-458. [PMID: 32056333 DOI: 10.1111/tpj.14721] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/18/2019] [Revised: 01/13/2020] [Accepted: 01/29/2020] [Indexed: 06/10/2023]
Abstract
Short interspersed nuclear elements (SINEs) are small, non-autonomous and heterogeneous retrotransposons that are widespread in plants. To explore the amplification dynamics and evolutionary history of SINE populations in representative deciduous tree species, we analyzed the genomes of the six following Salicaceae species: Populus deltoides, Populus euphratica, Populus tremula, Populus tremuloides, Populus trichocarpa, and Salix purpurea. We identified 11 Salicaceae SINE families (SaliS-I to SaliS-XI), comprising 27 077 full-length copies. Most of these families harbor segmental similarities, providing evidence for SINE emergence by reshuffling or heterodimerization. We observed two SINE groups, differing in phylogenetic distribution pattern, similarity and 3' end structure. These groups probably emerged during the 'salicoid duplication' (~65 million years ago) in the Salix-Populus progenitor and during the separation of the genus Salix (45-65 million years ago), respectively. In contrast to conserved 5' start motifs across species and SINE families, the 3' ends are highly variable in sequence and length. This extraordinary 3'-end variability results from mutations in the poly(A) tail, which were fixed by subsequent amplificational bursts. We show that the dissemination of newly evolved 3' ends is accomplished by a displacement of older motifs, leading to various 3'-end subpopulations within the SaliS families.
Affiliation(s)
- Anja Kögler
  - Faculty of Biology, Institute of Botany, Technische Universität Dresden, 01062, Dresden, Germany
- Kathrin M Seibt
  - Faculty of Biology, Institute of Botany, Technische Universität Dresden, 01062, Dresden, Germany
- Tony Heitkam
  - Faculty of Biology, Institute of Botany, Technische Universität Dresden, 01062, Dresden, Germany
- Kristin Morgenstern
  - Department of Forest Sciences, Institute of Forest Botany and Forest Zoology, Technische Universität Dresden, 01735, Tharandt, Germany
- Birgit Reiche
  - Department of Forest Sciences, Institute of Forest Botany and Forest Zoology, Technische Universität Dresden, 01735, Tharandt, Germany
- Heino Wolf
  - Staatsbetrieb Sachsenforst, 01796, Pirna, Germany
- Doris Krabel
  - Department of Forest Sciences, Institute of Forest Botany and Forest Zoology, Technische Universität Dresden, 01735, Tharandt, Germany
- Thomas Schmidt
  - Faculty of Biology, Institute of Botany, Technische Universität Dresden, 01062, Dresden, Germany
10
Yan H, Bombarely A, Li S. DeepTE: a computational method for de novo classification of transposons with convolutional neural network. Bioinformatics 2020; 36:4269-4275. [DOI: 10.1093/bioinformatics/btaa519] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 01/09/2020] [Revised: 04/12/2020] [Accepted: 05/12/2020] [Indexed: 01/23/2023] Open
Abstract
Motivation
Transposable elements (TEs) classification is an essential step to decode their roles in genome evolution. With a large number of genomes from non-model species becoming available, accurate and efficient TE classification has emerged as a new challenge in genomic sequence analysis.
Results
We developed a novel tool, DeepTE, which classifies unknown TEs using convolutional neural networks (CNNs). DeepTE transferred sequences into input vectors based on k-mer counts. A tree structured classification process was used where eight models were trained to classify TEs into super families and orders. DeepTE also detected domains inside TEs to correct false classification. An additional model was trained to distinguish between non-TEs and TEs in plants. Given unclassified TEs of different species, DeepTE can classify TEs into seven orders, which include 15, 24 and 16 super families in plants, metazoans and fungi, respectively. In several benchmarking tests, DeepTE outperformed other existing tools for TE classification. In conclusion, DeepTE successfully leverages CNN for TE classification, and can be used to precisely classify TEs in newly sequenced eukaryotic genomes.
Availability and implementation
DeepTE is accessible at https://github.com/LiLabAtVT/DeepTE.
Supplementary information
Supplementary data are available at Bioinformatics online.
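The abstract above states that sequences are turned into input vectors based on k-mer counts. A minimal sketch of that general encoding is shown below. It is not DeepTE's actual code: the function name, the default k, and the choice to skip k-mers containing non-ACGT characters are all assumptions made for illustration.

```python
from itertools import product


def kmer_count_vector(seq: str, k: int = 3) -> list[int]:
    """Count every DNA k-mer in `seq` and return a fixed-length vector.

    The vector has 4**k entries, ordered lexicographically over "ACGT".
    K-mers containing other characters (for example N) are skipped,
    which is an assumption of this sketch.
    """
    index = {"".join(p): i for i, p in enumerate(product("ACGT", repeat=k))}
    counts = [0] * len(index)
    seq = seq.upper()
    # Slide a window of width k across the sequence.
    for start in range(len(seq) - k + 1):
        slot = index.get(seq[start:start + k])
        if slot is not None:
            counts[slot] += 1
    return counts


# Dinucleotide counts for a short, made-up sequence.
vector = kmer_count_vector("ACGTAC", k=2)
```

Because the vector length depends only on k, sequences of different lengths map to inputs of the same size, which is what a fixed-input classifier needs.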
Affiliation(s)
- Haidong Yan
  - School of Plant and Environmental Sciences (SPES), Virginia Tech, Blacksburg, VA 24061, USA
- Aureliano Bombarely
  - School of Plant and Environmental Sciences (SPES), Virginia Tech, Blacksburg, VA 24061, USA
  - Department of Life Sciences, University of Milan, Milan 20122, Italy
- Song Li
  - School of Plant and Environmental Sciences (SPES), Virginia Tech, Blacksburg, VA 24061, USA
  - Graduate Program in Genetics, Bioinformatics and Computational Biology (GBCB), Virginia Tech, Blacksburg, VA 24061, USA
11
Jedlicka P, Lexa M, Vanat I, Hobza R, Kejnovsky E. Nested plant LTR retrotransposons target specific regions of other elements, while all LTR retrotransposons often target palindromes and nucleosome-occupied regions: in silico study. Mob DNA 2019; 10:50. [PMID: 31871489 PMCID: PMC6911290 DOI: 10.1186/s13100-019-0186-z] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Received: 07/31/2019] [Accepted: 10/31/2019] [Indexed: 01/08/2023] Open
Abstract
Background Nesting is common in LTR retrotransposons, especially in large genomes containing a high number of elements. Results We analyzed 12 plant genomes and obtained 1491 pairs of nested and original (pre-existing) LTR retrotransposons. We systematically analyzed mutual nesting of individual LTR retrotransposons and found that certain families, more often belonging to the Ty3/gypsy than Ty1/copia superfamilies, showed a higher nesting frequency as well as a higher preference for older copies of the same family ("autoinsertions"). Nested LTR retrotransposons were preferentially located in the 3'UTR of other LTR retrotransposons, while coding and regulatory regions (LTRs) are not commonly targeted. Insertions displayed a weak preference for palindromes and were associated with a strong positional pattern of higher predicted nucleosome occupancy. Deviation from randomness in target site choice was also found in 13,983 non-nested plant LTR retrotransposons. Conclusions We reveal that nesting of LTR retrotransposons is not random. Integration is correlated with sequence composition, secondary structure and the chromatin environment. Insertion into retrotransposon positions with a low negative impact on family fitness supports the concept of the genome being viewed as an ecosystem of various elements.
Affiliation(s)
- Pavel Jedlicka
  - Department of Plant Developmental Genetics, Institute of Biophysics of the Czech Academy of Sciences, Kralovopolska 135, 61200 Brno, Czech Republic
- Matej Lexa
  - Faculty of Informatics, Masaryk University, Botanicka 68a, 60200 Brno, Czech Republic
- Ivan Vanat
  - Faculty of Informatics, Masaryk University, Botanicka 68a, 60200 Brno, Czech Republic
- Roman Hobza
  - Department of Plant Developmental Genetics, Institute of Biophysics of the Czech Academy of Sciences, Kralovopolska 135, 61200 Brno, Czech Republic
- Eduard Kejnovsky
  - Department of Plant Developmental Genetics, Institute of Biophysics of the Czech Academy of Sciences, Kralovopolska 135, 61200 Brno, Czech Republic
12
Schneider J, Volkmer I, Engel K, Emmer A, Staege MS. Expression of A New Endogenous Retrovirus-Associated Transcript in Hodgkin Lymphoma Cells. Int J Mol Sci 2019; 20:ijms20215320. [PMID: 31731509 PMCID: PMC6862598 DOI: 10.3390/ijms20215320] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 10/04/2019] [Revised: 10/21/2019] [Accepted: 10/23/2019] [Indexed: 12/20/2022] Open
Abstract
During characterization of a cDNA library from the Hodgkin lymphoma (HL) cell line L-1236, we discovered a new transcript derived from chromosome 1 at the long intergenic non-protein coding RNA 1768 (LINC01768)/colony stimulating factor 1 (CSF1) region. The first exon of this transcript from Hodgkin lymphoma cells (THOLE) starts in the predicted exon 4 of LINC01768 and is part of an endogenous retrovirus (ERV) from the HUERS-P1/LTR8 family. High expression of THOLE was only detectable in HL cell line L-1236. The expression of THOLE in L-1236 cell is another example for ERV/LTR-associated gene expression in HL cells. At the genome level, the HUERS-P1/LTR8 region including THOLE is only present in Hominoidea. The influence of ERV/LTRs on gene expression might explain the characteristic phenotype of human HL.
Affiliation(s)
- Jana Schneider
- Department of Surgical and Conservative Pediatrics and Adolescent Medicine, Martin Luther University Halle-Wittenberg, 06097 Halle, Germany
- Ines Volkmer
- Department of Surgical and Conservative Pediatrics and Adolescent Medicine, Martin Luther University Halle-Wittenberg, 06097 Halle, Germany
- Kristina Engel
- Department of Surgical and Conservative Pediatrics and Adolescent Medicine, Martin Luther University Halle-Wittenberg, 06097 Halle, Germany
- Alexander Emmer
- Department of Neurology, Martin Luther University Halle-Wittenberg, 06097 Halle, Germany
- Martin S. Staege
- Department of Surgical and Conservative Pediatrics and Adolescent Medicine, Martin Luther University Halle-Wittenberg, 06097 Halle, Germany
- Correspondence: Tel.: +49-345-557-7280; Fax: +49-345-557-7275
|
13
|
Casier K, Delmarre V, Gueguen N, Hermant C, Viodé E, Vaury C, Ronsseray S, Brasset E, Teysset L, Boivin A. Environmentally-induced epigenetic conversion of a piRNA cluster. eLife 2019; 8:e39842. [PMID: 30875295 PMCID: PMC6420265 DOI: 10.7554/elife.39842] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 07/04/2018] [Accepted: 03/06/2019] [Indexed: 01/02/2023] Open
Abstract
Transposable element (TE) activity is repressed in animal gonads by PIWI-interacting RNAs (piRNAs) produced by piRNA clusters. Current models in flies propose that germinal piRNA clusters are functionally defined by the maternal inheritance of piRNAs produced during the previous generation. Taking advantage of an inactive, but ready to go, cluster of P-element derived transgene insertions in Drosophila melanogaster, we show here that raising flies at high temperature (29°C) instead of 25°C triggers the stable conversion of this locus from inactive into actively producing functional piRNAs. The increase of antisense transcripts from the cluster at 29°C, combined with the requirement for transcription of euchromatic homologous sequences, suggests a role for double-stranded RNA in the production of de novo piRNAs. This report describes the first case of the establishment of an active piRNA cluster by environmental changes in the absence of maternal inheritance of homologous piRNAs. Editorial note: This article has been through an editorial process in which the authors decide how to respond to the issues raised during peer review. The Reviewing Editor's assessment is that all the issues have been addressed (see decision letter).
Affiliation(s)
- Karine Casier
- Laboratoire Biologie du Développement, UMR7622, Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Paris, France
- Valérie Delmarre
- Laboratoire Biologie du Développement, UMR7622, Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Paris, France
- Nathalie Gueguen
- GReD, Université Clermont Auvergne, CNRS, INSERM, BP 10448, Clermont-Ferrand, France
- Catherine Hermant
- Laboratoire Biologie du Développement, UMR7622, Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Paris, France
- Elise Viodé
- Laboratoire Biologie du Développement, UMR7622, Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Paris, France
- Chantal Vaury
- GReD, Université Clermont Auvergne, CNRS, INSERM, BP 10448, Clermont-Ferrand, France
- Stéphane Ronsseray
- Laboratoire Biologie du Développement, UMR7622, Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Paris, France
- Emilie Brasset
- GReD, Université Clermont Auvergne, CNRS, INSERM, BP 10448, Clermont-Ferrand, France
- Laure Teysset
- Laboratoire Biologie du Développement, UMR7622, Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Paris, France
- Antoine Boivin
- Laboratoire Biologie du Développement, UMR7622, Sorbonne Université, CNRS, Institut de Biologie Paris-Seine, Paris, France
|
14
|
Shao F, Wang J, Xu H, Peng Z. FishTEDB: a collective database of transposable elements identified in the complete genomes of fish. Database: The Journal of Biological Databases and Curation 2018; 2018:4812028. [PMID: 29688350 PMCID: PMC6404401 DOI: 10.1093/database/bax106] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 08/24/2017] [Accepted: 12/21/2017] [Indexed: 11/28/2022]
Abstract
Transposable elements (TEs) are important for host gene regulation and genome evolution. Consensus sequences of TEs can assist investigators in accelerating studies on TE origins, amplification, functions and evolution, as well as comparative analyses and prediction of TEs in different species. In evolution, physiology, ecology and heredity research, fish are important models. However, to date, no comprehensive resource for TE consensus sequences exists for fish. Here, we collected genome-wide data and developed a novel database, FishTEDB, including 27 bony fishes, 1 cartilaginous fish, 1 lamprey and 1 lancelet. De novo, structure-based and homology-based approaches were combined to detect TEs. The database is open-source and user-friendly, and users can browse, search and download all data. FishTEDB also provides GetORF, BLAST and HMMER tools to analyze sequences. Database URL: http://www.fishtedb.org/
Affiliation(s)
- Feng Shao
- Key Laboratory of Freshwater Fish Reproduction and Development (Ministry of Education), Southwest University School of Life Sciences, Chongqing 400715, China
- Jianrong Wang
- Department of Computational Mathematics, Science and Engineering, Michigan State University, MI 48824, USA
- Hongen Xu
- Department of Genome Oriented Bioinformatics, Wissenschaftszentrum Weihenstephan, TU Muenchen, Maximus-von-Imhof-Forum 3, Freising 85354, Germany
- Zuogang Peng
- Key Laboratory of Freshwater Fish Reproduction and Development (Ministry of Education), Southwest University School of Life Sciences, Chongqing 400715, China
|
15
|
Ty3/Gypsy retrotransposons in the Pacific abalone Haliotis discus hannai: characterization and use for species identification in the genus Haliotis. Genes Genomics 2018; 40:177-187. [PMID: 29892921 DOI: 10.1007/s13258-017-0619-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/30/2017] [Accepted: 10/05/2017] [Indexed: 01/08/2023]
Abstract
Transposable elements are highly abundant elements that are present in all eukaryotic species. Here, we present a molecular description of abalone retrotransposon (Abret) elements. The genome of Haliotis discus hannai contains 130 Abret elements, all of which were Ty3/Gypsy retrotransposons. Ty1/Copia elements were absent from the H. discus hannai genome. Most of the elements were not complete due to sequence truncation or coding region decay. However, three elements, Abret-296, Abret-935, and Abret-3259, had most of the canonical features of LTR (long terminal repeat)-retrotransposons. There were several reading frame shifts in the Abret-935 and Abret-3259 elements. Surprisingly, phylogenetic analysis indicated that all of the elements belonged to the Osvaldo lineage. The sequence divergence between LTRs revealed that the Abret elements were mostly active within the last 2 million years. Abret elements were used as molecular markers in SSAP analyses, which allowed clear distinction of different species in the genus Haliotis. The polymorphic markers were converted into SCAR markers for use in species identification by simple PCR in the Haliotis genus.
|
16
|
Russian Doll Genes and Complex Chromosome Rearrangements in Oxytricha trifallax. G3: Genes, Genomes, Genetics 2018; 8:1669-1674. [PMID: 29545465 PMCID: PMC5940158 DOI: 10.1534/g3.118.200176] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/18/2022]
Abstract
Ciliates have two different types of nuclei per cell, with one acting as a somatic, transcriptionally active nucleus (macronucleus; abbr. MAC) and another serving as a germline nucleus (micronucleus; abbr. MIC). Furthermore, Oxytricha trifallax undergoes extensive genome rearrangements during sexual conjugation and post-zygotic development of daughter cells. These rearrangements are necessary because the precursor MIC loci are often both fragmented and scrambled with respect to the corresponding MAC loci. Such genome architectures are remarkably tolerant of encrypted MIC loci, because RNA-guided processes during MAC development reorganize the gene fragments in the correct order to resemble the parental MAC sequence. Here, we describe the germline organization of several nested and highly scrambled genes in Oxytricha trifallax. These include cases with multiple layers of nesting, plus highly interleaved or tangled precursor loci that appear to deviate from previously described patterns. We present mathematical methods to measure the degree of nesting between precursor MIC loci, and revisit a method for a mathematical description of scrambling. After applying these methods to the chromosome rearrangement maps of O. trifallax, we describe cases of nested arrangements with up to five layers of embedded genes, as well as the most scrambled loci in O. trifallax.
|
17
|
Genome-wide analysis of transposable elements in the coffee berry borer Hypothenemus hampei (Coleoptera: Curculionidae): description of novel families. Mol Genet Genomics 2017; 292:565-583. [PMID: 28204924 DOI: 10.1007/s00438-017-1291-7] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 08/14/2016] [Accepted: 01/12/2017] [Indexed: 10/20/2022]
Abstract
The coffee berry borer (CBB) Hypothenemus hampei is the most limiting pest of coffee production worldwide. The CBB genome has been recently sequenced; however, information regarding the presence and characteristics of transposable elements (TEs) was not provided. Using systematic searching strategies based on both de novo and homology-based approaches, we present a library of TEs from the draft genome of CBB sequenced by the Colombian Coffee Growers Federation. The library consists of 880 sequences classified as 66% Class I (LTRs: 46%, non-LTRs: 20%) and 34% Class II (DNA transposons: 8%, Helitrons: 16% and MITEs: 10%) elements, including families of the three main LTR (Gypsy, Bel-Pao and Copia) and non-LTR (CR1, Daphne, I/Nimb, Jockey, Kiri, R1, R2 and R4) clades and DNA superfamilies (Tc1-mariner, hAT, Merlin, P, PIF-Harbinger, PiggyBac and Helitron). We propose the existence of novel families: Hypo, belonging to the LTR Gypsy superfamily; Hamp, belonging to non-LTRs; and rosa, belonging to Class II or DNA transposons. Although the rosa clade has been previously described, it was considered to be a basal subfamily of the mariner family. Based on our phylogenetic analysis, including Tc1, mariner, pogo, rosa and Lsra elements from other insects, we propose that rosa and Lsra elements are subfamilies of an independent family of Class II elements termed rosa. The annotations obtained indicate that a low percentage of the assembled CBB genome (approximately 8.2%) consists of TEs. Although these TEs display high diversity, most sequences are degenerate, with few full-length copies of LTR and DNA transposons and several complete and putatively active copies of non-LTR elements. MITEs constitute approximately 50% of the total TEs content, with a high proportion associated with DNA transposons in the Tc1-mariner superfamily.
|
18
|
Characterization of new transposable element sub-families from white clover (Trifolium repens) using PCR amplification. Genetica 2016; 144:577-589. [PMID: 27671023 DOI: 10.1007/s10709-016-9926-x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/25/2016] [Accepted: 09/17/2016] [Indexed: 12/15/2022]
Abstract
Transposable elements (TEs) dominate the landscapes of most plant and animal genomes. Once considered junk DNA and genetic parasites, these interspersed, repetitive DNA elements are now known to play major roles in both genetic and epigenetic processes that sponsor genome variation and regulate gene expression. Knowledge of TE consensus sequences from elements in species whose genomes have not been sequenced is limited, and the individual TEs that are encountered in clones or short-reads rarely represent potentially canonical, let alone, functional representatives. In this study, we queried the Repbase database with eight BAC clones from white clover (Trifolium repens), identified a large number of candidate TEs, and used polymerase chain reaction and Sanger sequencing to create consensus sequences for three new TE families. The results show that TE family consensus sequences can be obtained experimentally in species for which just a single, full-length member of a TE family has been sequenced.
|
19
|
Sheshukova EV, Shindyapina AV, Komarova TV, Dorokhov YL. “Matreshka” genes with alternative reading frames. Russ J Genet 2016. [DOI: 10.1134/s1022795416020149] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
20
|
Wang Y, Drader T, Tiwari VK, Dong L, Kumar A, Huo N, Ghavami F, Iqbal MJ, Lazo GR, Leonard J, Gill BS, Kianian SF, Luo MC, Gu YQ. Development of a D genome specific marker resource for diploid and hexaploid wheat. BMC Genomics 2015; 16:646. [PMID: 26315263 PMCID: PMC4552153 DOI: 10.1186/s12864-015-1852-2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 02/13/2015] [Accepted: 08/17/2015] [Indexed: 01/20/2023] Open
Abstract
BACKGROUND Mapping and map-based cloning of genes that control agriculturally and economically important traits remain great challenges for plants with complex, highly repetitive genomes such as those within the grass tribe, Triticeae. Mapping limitations in the Triticeae are primarily due to low frequencies of polymorphic gene markers and poor genetic recombination in certain genetic regions. Although the abundance of repetitive sequences may pose common problems in genome analysis and sequence assembly of large and complex genomes, they provide repeat junction markers with random and unbiased distribution throughout chromosomes. Hence, development of a high-throughput mapping technology that combines both gene-based and repeat junction-based markers is needed to generate maps that have better coverage of the entire genome. RESULTS In this study, the available genomics resources of the diploid Aegilops tauschii, the D genome donor of bread wheat, were used to develop genome specific markers that can be applied for mapping in modern hexaploid wheat. A NimbleGen array containing both gene-based and repeat junction probe sequences derived from Ae. tauschii was developed and used to map the Chinese Spring nullisomic-tetrasomic lines and deletion bin lines of the D genome chromosomes. Based on these mapping data, we have now anchored 5,171 repeat junction probes and 10,892 gene probes, corresponding to 5,070 gene markers, to the delineated deletion bins of the D genome. The order of the gene-based markers within the deletion bins of the Chinese Spring can be inferred based on their positions on the Ae. tauschii genetic map. Analysis of the probe sequences against the Chinese Spring chromosome sequence assembly database facilitated mapping of the NimbleGen probes to the sequence contigs and allowed assignment or ordering of these sequence contigs within the deletion bins. The accumulated length of anchored sequence contigs is about 155 Mb, representing ~3.2% of the D genome. A specific database was developed to allow users to search or BLAST against the probe sequence information and to directly download PCR primers for mapping specific genetic loci. CONCLUSIONS In bread wheat, aneuploid stocks have been extensively used to assign markers linked with genes/traits to chromosomes, chromosome arms, and their specific bins. Through this study, we added thousands of markers to the existing wheat chromosome bin map, representing a significant step forward in providing a resource to navigate the wheat genome. The database website (http://probes.pw.usda.gov/ATRJM/) provides easy access and efficient utilization of the data. The resources developed herein can aid map-based cloning of traits of interest and the sequencing of the D genome of hexaploid wheat.
Affiliation(s)
- Yi Wang
- Western Regional Research Center, USDA-ARS, Albany, CA, 94710, USA; Department of Plant Sciences, University of California, Davis, CA, 95616, USA
- Thomas Drader
- Western Regional Research Center, USDA-ARS, Albany, CA, 94710, USA
- Vijay K Tiwari
- Department of Crop and Soil Science, Oregon State University, Corvallis, OR, 97331, USA; Wheat Genetic Resource Center, Department of Plant Pathology, Kansas State University, Manhattan, KS, 66506, USA
- Lingli Dong
- Western Regional Research Center, USDA-ARS, Albany, CA, 94710, USA; Department of Plant Sciences, University of California, Davis, CA, 95616, USA
- Ajay Kumar
- Department of Plant Sciences, North Dakota State University, Fargo, ND, 58108, USA. ajay.kumar.2.@ndsu.edu
- Naxin Huo
- Western Regional Research Center, USDA-ARS, Albany, CA, 94710, USA; Department of Plant Sciences, University of California, Davis, CA, 95616, USA
- Farhad Ghavami
- Department of Plant Sciences, North Dakota State University, Fargo, ND, 58108, USA; Molecular Breeding and Genomics Technology Laboratory, BioDiagnostics Inc., River Falls, WI, 54022, USA
- M Javed Iqbal
- Department of Plant Sciences, North Dakota State University, Fargo, ND, 58108, USA
- Gerard R Lazo
- Western Regional Research Center, USDA-ARS, Albany, CA, 94710, USA
- Jeff Leonard
- Department of Crop and Soil Science, Oregon State University, Corvallis, OR, 97331, USA
- Bikram S Gill
- Wheat Genetic Resource Center, Department of Plant Pathology, Kansas State University, Manhattan, KS, 66506, USA
- Ming-Cheng Luo
- Department of Plant Sciences, University of California, Davis, CA, 95616, USA
- Yong Q Gu
- Western Regional Research Center, USDA-ARS, Albany, CA, 94710, USA
|
21
|
Castanera R, Pérez G, López L, Sancho R, Santoyo F, Alfaro M, Gabaldón T, Pisabarro AG, Oguiza JA, Ramírez L. Highly expressed captured genes and cross-kingdom domains present in Helitrons create novel diversity in Pleurotus ostreatus and other fungi. BMC Genomics 2014; 15:1071. [PMID: 25480150 PMCID: PMC4289320 DOI: 10.1186/1471-2164-15-1071] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 07/25/2014] [Accepted: 11/14/2014] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Helitrons are class-II eukaryotic transposons that transpose via a rolling circle mechanism. Due to their ability to capture and mobilize gene fragments, they play an important role in the evolution of their host genomes. We have used a bioinformatics approach for the identification of helitrons in two Pleurotus ostreatus genomes using de novo detection and homology-based searching. We have analyzed the presence of helitron-captured genes as well as the expansion of helitron-specific helicases in fungi and performed a phylogenetic analysis of their conserved domains with other representative eukaryotic species. RESULTS Our results show the presence of two helitron families in P. ostreatus that disrupt gene colinearity and cause a lack of synteny between their genomes. Both putative autonomous and non-autonomous helitrons were transcriptionally active, and some of them carried highly expressed captured genes of unknown origin and function. In addition, both families contained eukaryotic, bacterial and viral domains within the helitron's boundaries. A phylogenetic reconstruction of RepHel helicases using the Helitron-like and PIF1-like helicase conserved domains revealed a polyphyletic origin for eukaryotic helitrons. CONCLUSION P. ostreatus helitrons display features similar to other eukaryotic helitrons and do not tend to capture host genes or gene fragments. The occurrence of genes probably captured from other hosts inside the helitron boundaries raises the hypothesis that an ancient horizontal transfer mechanism could have taken place. The viral domains found in some of these genes and the polyphyletic origin of RepHel helicases in the eukaryotic kingdom suggest that viruses could have played a role in a putative lateral transfer of helitrons within the eukaryotic kingdom. The high similarity of some helitrons, along with the transcriptional activity of their RepHel helicases, indicates that these elements are still active in the genome of P. ostreatus.
Affiliation(s)
- Lucía Ramírez
- Department of Agrarian Production, Genetics and Microbiology Research Group, Public University of Navarre, 31006 Pamplona, Navarre, Spain.
|
22
|
Yadav CB, Bonthala VS, Muthamilarasan M, Pandey G, Khan Y, Prasad M. Genome-wide development of transposable elements-based markers in foxtail millet and construction of an integrated database. DNA Res 2014; 22:79-90. [PMID: 25428892 PMCID: PMC4379977 DOI: 10.1093/dnares/dsu039] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Indexed: 12/11/2022] Open
Abstract
Transposable elements (TEs) are major components of plant genome and are reported to play significant roles in functional genome diversity and phenotypic variations. Several TEs are highly polymorphic for insert location in the genome and this facilitates development of TE-based markers for various genotyping purposes. Considering this, a genome-wide analysis was performed in the model plant foxtail millet. A total of 30,706 TEs were identified and classified as DNA transposons (24,386), full-length Copia type (1,038), partial or solo Copia type (10,118), full-length Gypsy type (1,570), partial or solo Gypsy type (23,293) and Long- and Short-Interspersed Nuclear Elements (3,659 and 53, respectively). Further, 20,278 TE-based markers were developed, namely Retrotransposon-Based Insertion Polymorphisms (4,801, ∼24%), Inter-Retrotransposon Amplified Polymorphisms (3,239, ∼16%), Repeat Junction Markers (4,451, ∼22%), Repeat Junction-Junction Markers (329, ∼2%), Insertion-Site-Based Polymorphisms (7,401, ∼36%) and Retrotransposon-Microsatellite Amplified Polymorphisms (57, 0.2%). A total of 134 Repeat Junction Markers were screened in 96 accessions of Setaria italica and 3 wild Setaria accessions of which 30 showed polymorphism. Moreover, an open access database for these developed resources was constructed (Foxtail millet Transposable Elements-based Marker Database; http://59.163.192.83/ltrdb/index.html). Taken together, this study would serve as a valuable resource for large-scale genotyping applications in foxtail millet and related grass species.
Affiliation(s)
- Chandra Bhan Yadav
- National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi 110 067, India
- Venkata Suresh Bonthala
- National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi 110 067, India
- Garima Pandey
- National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi 110 067, India
- Yusuf Khan
- National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi 110 067, India
- Manoj Prasad
- National Institute of Plant Genome Research (NIPGR), Aruna Asaf Ali Marg, New Delhi 110 067, India
|
23
|
Halász J, Kodad O, Hegedűs A. Identification of a recently active Prunus-specific non-autonomous Mutator element with considerable genome shaping force. The Plant Journal: for Cell and Molecular Biology 2014; 79:220-231. [PMID: 24813246 DOI: 10.1111/tpj.12551] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 01/12/2014] [Revised: 04/24/2014] [Accepted: 04/30/2014] [Indexed: 06/03/2023]
Abstract
Miniature inverted-repeat transposable elements (MITEs) are known to contribute to the evolution of plants, but only limited information is available for MITEs in the Prunus genome. We identified a MITE that has been named Falling Stones, FaSt. All structural features (349-bp size, 82-bp terminal inverted repeats and 9-bp target site duplications) are consistent with this MITE being a putative member of the Mutator transposase superfamily. FaSt showed a preferential accumulation in the short AT-rich segments of the euchromatin region of the peach genome. DNA sequencing and pollination experiments have been performed to confirm that the nested insertion of FaSt into the S-haplotype-specific F-box gene of apricot resulted in the breakdown of self-incompatibility (SI). A bioinformatics-based survey of the known Rosaceae and other genomes and a newly designed polymerase chain reaction (PCR) assay verified the Prunoideae-specific occurrence of FaSt elements. Phylogenetic analysis suggested a recent activity of FaSt in the Prunus genome. The occurrence of a nested insertion in the apricot genome further supports the recent activity of FaSt in response to abiotic stress conditions. This study reports on a presumably active non-autonomous Mutator element in Prunus that exhibits a major indirect genome shaping force through inducing loss-of-function mutation in the SI locus.
Affiliation(s)
- Júlia Halász
- Department of Genetics and Plant Breeding, Corvinus University of Budapest, P.O. Box 53, Budapest, H-1518, Hungary
|
24
|
Campos-Sánchez R, Kapusta A, Feschotte C, Chiaromonte F, Makova KD. Genomic landscape of human, bat, and ex vivo DNA transposon integrations. Mol Biol Evol 2014; 31:1816-32. [PMID: 24809961 DOI: 10.1093/molbev/msu138] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 02/07/2023] Open
Abstract
The integration and fixation preferences of DNA transposons, one of the major classes of eukaryotic transposable elements, have never been evaluated comprehensively on a genome-wide scale. Here, we present a detailed study of the distribution of DNA transposons in the human and bat genomes. We studied three groups of DNA transposons that integrated at different evolutionary times: 1) ancient (>40 My) and currently inactive human elements, 2) younger (<40 My) bat elements, and 3) ex vivo integrations of piggyBat and Sleeping Beauty elements in HeLa cells. Although the distribution of ex vivo elements reflected integration preferences, the distribution of human and (to a lesser extent) bat elements was also affected by selection. We used regression techniques (linear, negative binomial, and logistic regression models with multiple predictors) applied to 20-kb and 1-Mb windows to investigate how the genomic landscape in the vicinity of DNA transposons contributes to their integration and fixation. Our models indicate that genomic landscape explains 16-79% of variability in DNA transposon genome-wide distribution. Importantly, we not only confirmed previously identified predictors (e.g., DNA conformation and recombination hotspots) but also identified several novel predictors (e.g., signatures of double-strand breaks and telomere hexamer). Ex vivo integrations showed a bias toward actively transcribed regions. Older DNA transposons were located in genomic regions scarce in most conserved elements, likely reflecting purifying selection. Our study highlights how DNA transposons are integral to the evolution of bat and human genomes, and has implications for the development of DNA transposon assays for gene therapy and mutagenesis applications.
Affiliation(s)
- Rebeca Campos-Sánchez
- Genetics Program, The Huck Institutes of the Life Sciences, Penn State University, University Park, PA
- Aurélie Kapusta
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT
- Cédric Feschotte
- Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT
- Francesca Chiaromonte
- Center for Medical Genomics, The Huck Institutes of the Life Sciences, Penn State University, University Park, PA; Department of Statistics, Penn State University, University Park, PA
- Kateryna D Makova
- Center for Medical Genomics, The Huck Institutes of the Life Sciences, Penn State University, University Park, PA; Department of Biology, Penn State University, University Park, PA
|
25
|
Gill N, Buti M, Kane N, Bellec A, Helmstetter N, Berges H, Rieseberg LH. Sequence-Based Analysis of Structural Organization and Composition of the Cultivated Sunflower (Helianthus annuus L.) Genome. Biology 2014; 3:295-319. [PMID: 24833511 PMCID: PMC4085609 DOI: 10.3390/biology3020295] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Received: 10/30/2013] [Revised: 03/16/2014] [Accepted: 03/25/2014] [Indexed: 12/19/2022]
Abstract
Sunflower is an important oilseed crop, as well as a model system for evolutionary studies, but its 3.6 gigabase genome has proven difficult to assemble, in part because of its high repeat content. Here we report on the sequencing, assembly, and analyses of 96 randomly chosen BACs from sunflower to provide additional information on the repeat content of the sunflower genome, assess how repetitive elements in the sunflower genome are organized relative to genes, and compare the genomic distribution of these repeats to that found in other food crops and model species. We also examine the expression of transposable element-related transcripts in EST databases for sunflower to determine the representation of repeats in the transcriptome and to measure their transcriptional activity. Our data confirm previous reports in suggesting that the sunflower genome is >78% repetitive. Sunflower repeats share very little similarity to other plant repeats such as those of Arabidopsis, rice, maize and wheat; overall 28% of repeats are “novel” to sunflower. The repetitive sequences appear to be randomly distributed within the sequenced BACs. Assuming the 96 BACs are representative of the genome as a whole, approximately 5.2% of the sunflower genome comprises non-TE-related genic sequence, with an average gene density of 18 kbp/gene. Expression levels of these transposable elements indicate tissue specificity and differential expression in vegetative and reproductive tissues, suggesting that expressed TEs might contribute to sunflower development. The assembled BACs will also be useful for assessing the quality of several different draft assemblies of the sunflower genome and for annotating the reference sequence.
Affiliation(s)
- Navdeep Gill
  - Department of Botany and The Biodiversity Research Centre, University of British Columbia, Vancouver V6T 1Z4, BC, Canada.
- Matteo Buti
  - Applied Rosaceous Genomics Group, Centre for Research and Innovation, Michele all'Adige (TN) P.IVA 020384102, Italy.
- Nolan Kane
  - Department of Ecology and Evolutionary Biology, University of Colorado, Boulder, CO 80309, USA.
- Arnaud Bellec
  - French Plant Genomic Resource Centre, INRA-CNRGV, Chemin de Borde Rouge, CS 52627, 31326 Castanet Tolosan, France.
- Nicolas Helmstetter
  - French Plant Genomic Resource Centre, INRA-CNRGV, Chemin de Borde Rouge, CS 52627, 31326 Castanet Tolosan, France.
- Hélène Berges
  - French Plant Genomic Resource Centre, INRA-CNRGV, Chemin de Borde Rouge, CS 52627, 31326 Castanet Tolosan, France.
- Loren H Rieseberg
  - Department of Botany and The Biodiversity Research Centre, University of British Columbia, Vancouver V6T 1Z4, BC, Canada.
|
26
|
Sreeskandarajan S, Flowers MM, Karro JE, Liang C. A MATLAB-based tool for accurate detection of perfect overlapping and nested inverted repeats in DNA sequences. Bioinformatics 2013; 30:887-8. [PMID: 24215021 DOI: 10.1093/bioinformatics/btt651] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Indexed: 11/14/2022]
Abstract
SUMMARY Palindromic sequences, or inverted repeats (IRs), in DNA sequences are involved in important biological processes such as DNA-protein binding, DNA replication and DNA transposition. Development of bioinformatics tools that are capable of accurately detecting perfect IRs can enable genome-wide studies of IR patterns in both prokaryotes and eukaryotes. Unlike conventional string-comparison approaches, we propose a novel algorithm that uses a cumulative score system based on a prime number representation of nucleotide bases. We then implemented this algorithm as a MATLAB-based program for perfect IR detection. In comparison with other existing tools, our program demonstrates a high accuracy in detecting nested and overlapping IRs. AVAILABILITY AND IMPLEMENTATION The source code is freely available at http://bioinfolab.miamioh.edu/bioinfolab/palindrome.php. CONTACT liangc@miamioh.edu or karroje@miamioh.edu. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Sutharzan Sreeskandarajan
  - Department of Biology, Department of Computer Science and Software Engineering, Miami University, Oxford, OH 45056, USA and State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, Beijing 100193, China
|
27
|
Internal deletions of transposable elements: the case of Lemi elements. Genetica 2013; 141:369-79. [PMID: 24114377 DOI: 10.1007/s10709-013-9736-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 05/09/2013] [Accepted: 09/05/2013] [Indexed: 12/21/2022]
Abstract
Mobile elements using a "cut and paste" mechanism of transposition (Class II) are frequently prone to internal deletions, and the question of the origin of these copies remains elusive. In this study, we looked for copies belonging to the Lemi family (Tc1-mariner-IS630 superfamily) in plant genomes, and copies with internal deletions were analyzed in detail. Lemi elements are found exclusively in Eudicots, and more than half of the copies have been deleted. All deletions occur between microhomologies (direct repeats from 2 to 13 bp). Copies less than 500 bp long, similar to MITEs, are frequent. These copies seem to result from large deletions occurring between microhomologies present within a region of 300 bp at both extremities of the element. These regions are particularly A/T rich compared to the internal part of the element, which increases the probability of observing short direct repeats. Most of the molecular mechanisms responsible for double-strand break repair are able to induce deletions between microhomologies during the repair process. This could be a quick way to reduce the population of active copies within a genome and, more generally, to reduce the overall activity of the element after it has entered a naive genome.
|