1
Li Y, Jiang N, Sun Y. AnnoSINE: a short interspersed nuclear elements annotation tool for plant genomes. PLANT PHYSIOLOGY 2022; 188:955-970. [PMID: 34792587 PMCID: PMC8825457 DOI: 10.1093/plphys/kiab524] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 07/16/2021] [Accepted: 10/01/2021] [Indexed: 06/13/2023]
Abstract
Short interspersed nuclear elements (SINEs) are a widespread type of small transposable element (TE). With increasing evidence for their impact on gene function and genome evolution in plants, accurate genome-scale SINE annotation becomes a fundamental step for studying the regulatory roles of SINEs and their relationship with other components in the genomes. Despite the overall promising progress made in TE annotation, SINE annotation remains a major challenge. Unlike some other TEs, SINEs are short and heterogeneous, and they usually lack well-conserved sequence or structural features. Thus, current SINE annotation tools have either low sensitivity or high false discovery rates. Given the demand and challenges, we aimed to provide a more accurate and efficient SINE annotation tool for plant genomes. The pipeline starts with maximizing the pool of SINE candidates via profile hidden Markov model-based homology search and de novo SINE search using structural features. Then, it excludes the false positives by integrating all known features of SINEs and the features of other types of TEs that can often be misannotated as SINEs. As a result, the pipeline substantially improves the tradeoff between sensitivity and accuracy, with both values close to or over 90%. We tested our tool in Arabidopsis thaliana and rice (Oryza sativa), and the results show that our tool competes favorably against existing SINE annotation tools. The simplicity and effectiveness of this tool would potentially be useful for generating more accurate SINE annotations for other plant species. The pipeline is freely available at https://github.com/yangli557/AnnoSINE.
Affiliation(s)
- Yang Li
  - Department of Electrical Engineering, City University of Hong Kong, Kowloon, Hong Kong SAR, China
- Ning Jiang
  - Department of Horticulture, Michigan State University, East Lansing, Michigan 48824, USA
- Yanni Sun
  - Department of Electrical Engineering, City University of Hong Kong, Kowloon, Hong Kong SAR, China
2
Han G, Zhang N, Jiang H, Meng X, Qian K, Zheng Y, Xu J, Wang J. Diversity of short interspersed nuclear elements (SINEs) in lepidopteran insects and evidence of horizontal SINE transfer between baculovirus and lepidopteran hosts. BMC Genomics 2021; 22:226. [PMID: 33789582 PMCID: PMC8010984 DOI: 10.1186/s12864-021-07543-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/03/2020] [Accepted: 03/22/2021] [Indexed: 11/16/2022]
Abstract
Background: Short interspersed nuclear elements (SINEs) are non-long terminal repeat (non-LTR) retrotransposons that mobilize with the help of counterpart long interspersed nuclear elements (LINEs). Although 234 SINEs have been identified so far, only 23 are from insect species (SINEbase: http://sines.eimb.ru/). Results: Here, five SINEs were identified from the genome of Plutella xylostella, among which PxSE1, PxSE2 and PxSE3 were tRNA-derived SINEs, and PxSE4 and PxSE5 were 5S RNA-derived SINEs. A total of 18 related SINEs were further identified in 13 lepidopteran insects and a baculovirus. The 3′-tail of PxSE5 shares high identity with that of the LINE retrotransposon PxLINE1. The analysis of relative age distribution profiles revealed that PxSE1 is a relatively young retrotransposon in the genome of P. xylostella and was generated by recent explosive amplification. Integration pattern analysis showed that SINEs in P. xylostella prefer to insert into or accumulate in introns and regions 5 kb downstream of genes. In particular, the PxSE1-like element, SlNPVSE1, in the Spodoptera litura nucleopolyhedrovirus II genome is highly identical to SfSE1 in Spodoptera frugiperda, SlittSE1 in Spodoptera littoralis, and SlituSE1 in Spodoptera litura, suggesting the occurrence of horizontal transfer. Conclusions: Lepidopteran insect genomes harbor a diversity of SINEs. The retrotransposition activity and copy number of these SINEs vary considerably between host lineages and SINE lineages. Host-parasite interactions facilitate the horizontal transfer of SINEs between baculovirus and its lepidopteran hosts. Supplementary Information: The online version contains supplementary material available at 10.1186/s12864-021-07543-z.
Affiliation(s)
- Guangjie Han
  - College of Horticulture and Plant Protection, Yangzhou University, Yangzhou, 225009, China; Jiangsu Lixiahe District Institute of Agricultural Sciences, Yangzhou, 225008, China
- Nan Zhang
  - College of Horticulture and Plant Protection, Yangzhou University, Yangzhou, 225009, China
- Heng Jiang
  - College of Horticulture and Plant Protection, Yangzhou University, Yangzhou, 225009, China
- Xiangkun Meng
  - College of Horticulture and Plant Protection, Yangzhou University, Yangzhou, 225009, China
- Kun Qian
  - College of Horticulture and Plant Protection, Yangzhou University, Yangzhou, 225009, China
- Yang Zheng
  - College of Horticulture and Plant Protection, Yangzhou University, Yangzhou, 225009, China
- Jian Xu
  - Jiangsu Lixiahe District Institute of Agricultural Sciences, Yangzhou, 225008, China
- Jianjun Wang
  - College of Horticulture and Plant Protection, Yangzhou University, Yangzhou, 225009, China; Joint International Research Laboratory of Agriculture and Agri-Product Safety of the Ministry of Education, Yangzhou University, Yangzhou, 225009, China
3
Kögler A, Seibt KM, Heitkam T, Morgenstern K, Reiche B, Brückner M, Wolf H, Krabel D, Schmidt T. Divergence of 3' ends as a driver of short interspersed nuclear element (SINE) evolution in the Salicaceae. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2020; 103:443-458. [PMID: 32056333 DOI: 10.1111/tpj.14721] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/18/2019] [Revised: 01/13/2020] [Accepted: 01/29/2020] [Indexed: 06/10/2023]
Abstract
Short interspersed nuclear elements (SINEs) are small, non-autonomous and heterogeneous retrotransposons that are widespread in plants. To explore the amplification dynamics and evolutionary history of SINE populations in representative deciduous tree species, we analyzed the genomes of the six following Salicaceae species: Populus deltoides, Populus euphratica, Populus tremula, Populus tremuloides, Populus trichocarpa, and Salix purpurea. We identified 11 Salicaceae SINE families (SaliS-I to SaliS-XI), comprising 27 077 full-length copies. Most of these families harbor segmental similarities, providing evidence for SINE emergence by reshuffling or heterodimerization. We observed two SINE groups, differing in phylogenetic distribution pattern, similarity and 3' end structure. These groups probably emerged during the 'salicoid duplication' (~65 million years ago) in the Salix-Populus progenitor and during the separation of the genus Salix (45-65 million years ago), respectively. In contrast to conserved 5' start motifs across species and SINE families, the 3' ends are highly variable in sequence and length. This extraordinary 3'-end variability results from mutations in the poly(A) tail, which were fixed by subsequent amplificational bursts. We show that the dissemination of newly evolved 3' ends is accomplished by a displacement of older motifs, leading to various 3'-end subpopulations within the SaliS families.
Affiliation(s)
- Anja Kögler
  - Faculty of Biology, Institute of Botany, Technische Universität Dresden, 01062, Dresden, Germany
- Kathrin M Seibt
  - Faculty of Biology, Institute of Botany, Technische Universität Dresden, 01062, Dresden, Germany
- Tony Heitkam
  - Faculty of Biology, Institute of Botany, Technische Universität Dresden, 01062, Dresden, Germany
- Kristin Morgenstern
  - Department of Forest Sciences, Institute of Forest Botany and Forest Zoology, Technische Universität Dresden, 01735, Tharandt, Germany
- Birgit Reiche
  - Department of Forest Sciences, Institute of Forest Botany and Forest Zoology, Technische Universität Dresden, 01735, Tharandt, Germany
- Heino Wolf
  - Staatsbetrieb Sachsenforst, 01796, Pirna, Germany
- Doris Krabel
  - Department of Forest Sciences, Institute of Forest Botany and Forest Zoology, Technische Universität Dresden, 01735, Tharandt, Germany
- Thomas Schmidt
  - Faculty of Biology, Institute of Botany, Technische Universität Dresden, 01062, Dresden, Germany
4
Seibt KM, Schmidt T, Heitkam T. The conserved 3' Angio-domain defines a superfamily of short interspersed nuclear elements (SINEs) in higher plants. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2020; 101:681-699. [PMID: 31610059 DOI: 10.1111/tpj.14567] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/16/2019] [Revised: 09/13/2019] [Accepted: 09/17/2019] [Indexed: 06/10/2023]
Abstract
Repetitive sequences are ubiquitous components of eukaryotic genomes affecting genome size and evolution as well as gene regulation. Among them, short interspersed nuclear elements (SINEs) are non-coding retrotransposons usually shorter than 1000 bp. They contain only a few short conserved structural motifs, in particular an internal promoter derived from cellular RNAs and a mostly AT-rich 3' tail, whereas the remaining regions are highly variable. SINEs emerge and vanish during evolution, and often diversify into numerous families and subfamilies that are usually specific for only a limited number of species. In contrast, at the 3' end of multiple plant SINEs we detected the highly conserved 'Angio-domain'. This 37 bp segment defines the Angio-SINE superfamily, which encompasses 24 plant SINE families widely distributed across 13 orders within the plant kingdom. We retrieved 28 433 full-length Angio-SINE copies from genome assemblies of 46 plant species, frequently located in genes. Compensatory mutations in and adjacent to the Angio-domain imply selective restraints maintaining its RNA structure. Angio-SINE families share segmental sequence similarities, indicating a modular evolution with strong Angio-domain preservation. We suggest that the conserved domain contributes to the evolutionary success of Angio-SINEs either through structural interactions between SINE RNA and proteins that increase their transpositional efficiency, or by enhancing their accumulation in genes.
Affiliation(s)
- Kathrin M Seibt
  - Faculty of Biology, Technische Universität Dresden, Zellescher Weg 20b, Dresden, 01217, Germany
- Thomas Schmidt
  - Faculty of Biology, Technische Universität Dresden, Zellescher Weg 20b, Dresden, 01217, Germany
- Tony Heitkam
  - Faculty of Biology, Technische Universität Dresden, Zellescher Weg 20b, Dresden, 01217, Germany
5
Whole genome sequencing of Entamoeba nuttalli reveals mammalian host-related molecular signatures and a novel octapeptide-repeat surface protein. PLoS Negl Trop Dis 2019; 13:e0007923. [PMID: 31805050 PMCID: PMC6917348 DOI: 10.1371/journal.pntd.0007923] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 06/28/2019] [Revised: 12/17/2019] [Accepted: 11/12/2019] [Indexed: 11/19/2022]
Abstract
The enteric protozoan Entamoeba histolytica is the causative agent of amebiasis, which is one of the most common parasitic diseases in developed and developing countries. Entamoeba nuttalli is the genetically closest species to E. histolytica in current phylogenetic analyses of Entamoeba species, and is prevalent in wild macaques. Therefore, E. nuttalli may be a key organism in which to investigate molecules required for infection of human or non-human primates. To explore the molecular signatures of host-parasite interactions, we conducted de novo assembly of the E. nuttalli genome, utilizing self-correction of PacBio long reads and polishing corrected reads using Illumina short reads, followed by comparative genomic analysis with two other mammalian and a reptilian Entamoeba species. The final draft assembly of E. nuttalli included 395 contigs with a total length of approximately 23 Mb, and 9,647 predicted genes, of which 6,940 were conserved with E. histolytica. In addition, we found an E. histolytica-specific repeat known as ERE2 in the E. nuttalli genome. GO-term enrichment analysis of mammalian host-related molecules indicated diversification of transmembrane proteins, including AIG1 family and BspA-like proteins that may be involved in the host-parasite interaction. Furthermore, we identified an E. nuttalli-specific protein that contained 42 repeats of an octapeptide ([G,E]KPTDTPS). This protein was shown to be localized on the cell surface using immunofluorescence. Since many repeat-containing proteins in parasites play important roles in interactions with host cells, this unique octapeptide repeat-containing protein may be involved in colonization of E. nuttalli in the intestine of macaques. Overall, our draft assembly provides a valuable resource for studying Entamoeba evolution and host-parasite selection.