1. Gao D. Introduction of Plant Transposon Annotation for Beginners. Biology 2023; 12:1468. [PMID: 38132293 PMCID: PMC10741241 DOI: 10.3390/biology12121468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2023] [Revised: 11/21/2023] [Accepted: 11/23/2023] [Indexed: 12/23/2023]
Abstract
Transposons are mobile DNA sequences that make up large fractions of many plant genomes. They provide unique resources for tracking gene and genome evolution and for developing molecular tools for basic and applied research. Despite extensive efforts, accurately annotating transposons remains challenging, especially for beginners, because transposon prediction requires expertise in both transposon biology and bioinformatics. Moreover, the complexity of plant genomes and the dynamic evolution of transposons complicate genome-wide transposon discovery. This review summarizes the three major strategies for transposon detection (repeat-based, structure-based, and homology-based annotation) and introduces the transposon superfamilies identified in plants thus far, along with related bioinformatics resources for detecting plant transposons. Furthermore, it describes transposon classification and explains why the terms 'autonomous' and 'non-autonomous' cannot be used to classify the superfamilies of transposons. Lastly, this review discusses how to identify misannotated transposons and improve the quality of transposon databases. It provides helpful information about plant transposons and a beginner's guide to annotating these repetitive sequences.
Affiliation(s)
- Dongying Gao
- Small Grains and Potato Germplasm Research Unit, USDA-ARS, Aberdeen, ID 83210, USA
2. Bernet GP, Muñoz-Pomer A, Domínguez-Escribá L, Covelli L, Bernad L, Ramasamy S, Futami R, Sempere JM, Moya A, Llorens C. GyDB mobilomics: LTR retroelements and integrase-related transposons of the pea aphid Acyrthosiphon pisum genome. Mob Genet Elements 2011; 1:97-102. [PMID: 22016855 DOI: 10.4161/mge.1.2.17635] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 07/06/2011] [Accepted: 08/04/2011] [Indexed: 12/14/2022]
Abstract
The Gypsy Database concerning Mobile Genetic Elements (release 2.0) is a wiki-style project devoted to the phylogenetic classification of LTR retroelements and their viral and host gene relatives characterized from distinct organisms. Furthermore, GyDB 2.0 is concerned with studying mobile elements within genomes. Therefore, an in-progress repository was created for databases with annotations of mobile genetic elements from particular genomes. This repository is called Mobilomics and the first uploaded database contains 549 LTR retroelements and related transposases which have been annotated from the genome of the Pea aphid Acyrthosiphon pisum. Mobilomics is accessible from the GyDB 2.0 project using the URL: http://gydb.org/index.php/Mobilomics.
Affiliation(s)
- Guillermo P Bernet
- Biotechvana; Parc Cientific de la Universitat de València; Valencia, Spain
3. Marín I. GIN transposons: genetic elements linking retrotransposons and genes. Mol Biol Evol 2010; 27:1903-11. [PMID: 20228153 DOI: 10.1093/molbev/msq072] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 01/31/2023]
Abstract
In a previous work, we characterized a gene, called Gypsy Integrase 1 (GIN1), which encodes a protein very similar to the integrase domains present in Gypsy/Ty3 retrotransposons. I describe here a paralog of GIN1, called GIN2, and show that both genes are present in multiple vertebrates and that a likely homolog is found in urochordates. Surprisingly, phylogenetic and structural analyses support the counterintuitive idea that the GIN genes did not derive directly from retrotransposons but from a novel type of animal-specific DNA transposons, the GIN elements. These elements, described for the first time in this study, contain a gene encoding a protein that is also very similar to Gypsy/Ty3 integrases. The sequences of the integrases encoded by GIN1 and GIN2 are more similar to those found in GIN elements than to those detected in retrotransposons. Moreover, several introns are in the same positions in the integrase-encoding genes of some GIN elements, GIN1 and GIN2. The simplest explanation for these results is that GIN elements appeared early in animal evolution by co-option of the integrase of a retrotransposon, later expanded in multiple animal lineages, and eventually gave rise to the GIN genes. In summary, GIN transposons may be the "missing link" that explains how GIN genes evolved from retrotransposons. GIN1 and GIN2 may have contributed to controlling the expansion of GIN elements and Gypsy/Ty3 retrotransposons in chordates.
Affiliation(s)
- Ignacio Marín
- Instituto de Biomedicina de Valencia, Consejo Superior de Investigaciones Científicas, Valencia, Spain.
4. Bao W, Kapitonov VV, Jurka J. Ginger DNA transposons in eukaryotes and their evolutionary relationships with long terminal repeat retrotransposons. Mob DNA 2010; 1:3. [PMID: 20226081 PMCID: PMC2836005 DOI: 10.1186/1759-8753-1-3] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Received: 06/30/2009] [Accepted: 01/25/2010] [Indexed: 12/12/2022]
Abstract
Background: In eukaryotes, long terminal repeat (LTR) retrotransposons such as Copia, BEL and Gypsy integrate their DNA copies into the host genome using a particular type of DDE transposase called integrase (INT). The Gypsy INT-like transposase is also conserved in the Polinton/Maverick self-synthesizing DNA transposons and in the 'cut and paste' DNA transposons known as TDD-4 and TDD-5. Moreover, it is known that INT is similar to bacterial transposases that belong to the IS3, IS481, IS30 and IS630 families. It has been suggested that LTR retrotransposons evolved from a non-LTR retrotransposon fused with a DNA transposon in early eukaryotes. In this paper we analyze a diverse superfamily of eukaryotic cut and paste DNA transposons coding for INT-like transposase and discuss their evolutionary relationship to LTR retrotransposons.
Results: A new diverse eukaryotic superfamily of DNA transposons, named Ginger (for 'Gypsy INteGrasE Related') DNA transposons, is defined and analyzed. Analogously to the IS3 and IS481 bacterial transposons, the Ginger termini resemble those of the Gypsy LTR retrotransposons. Currently, Ginger transposons can be divided into two distinct groups named Ginger1 and Ginger2/Tdd. Elements from the Ginger1 group are characterized by approximately 40 to 270 base pair (bp) terminal inverted repeats (TIRs), and are flanked by CCGG-specific or CCGT-specific target site duplication (TSD) sequences. The Ginger1-encoded transposases contain an approximately 400-amino-acid N-terminal portion sharing high amino acid identity to the entire Gypsy-encoded integrases, including the YPYY motif, zinc finger, DDE domain, and, importantly, the GPY/F motif, a hallmark of Gypsy and endogenous retrovirus (ERV) integrases. Ginger1 transposases also contain an additional C-terminal domain: either an ovarian tumor (OTU)-like protease domain or a Ulp1 protease domain. In vertebrate genomes, at least two host genes, which were previously thought to be derived from the Gypsy integrases, apparently have evolved from the Ginger1 transposase genes. We also introduce a second Ginger group, designated Ginger2/Tdd, which includes the previously reported DNA transposon TDD-4.
Conclusions: The Ginger superfamily represents eukaryotic DNA transposons closely related to LTR retrotransposons. Ginger elements provide new insights into the evolution of transposable elements and certain transposable element (TE)-derived genes.
Affiliation(s)
- Weidong Bao
- Genetic Information Research Institute, Mountain View, CA, USA.
5.
Abstract
Eukaryotes contain numerous transposable or mobile elements capable of parasite-like proliferation in the host genome. All known transposable elements in eukaryotes belong to two types: retrotransposons and DNA transposons. Here we report a previously uncharacterized class of DNA transposons called Polintons that populate genomes of protists, fungi, and animals, including entamoeba, soybean rust, hydra, sea anemone, nematodes, fruit flies, beetle, sea urchin, sea squirt, fish, lizard, frog, and chicken. Polintons from all these species are characterized by a unique set of proteins necessary for their transposition, including a protein-primed DNA polymerase B, retroviral integrase, cysteine protease, and ATPase. In addition, Polintons are characterized by 6-bp target site duplications, terminal-inverted repeats that are several hundred nucleotides long, and 5'-AG and TC-3' termini. Like other known transposable elements, Polintons exist as autonomous and nonautonomous elements. Our data suggest that Polintons have evolved from a linear plasmid that acquired a retroviral integrase at least 1 billion years ago. According to the model of Polinton transposition proposed here, a Polinton DNA molecule excised from the genome serves as a template for extrachromosomal synthesis of its double-stranded DNA copy by the Polinton-encoded DNA polymerase and is inserted back into the genome by its integrase.
Affiliation(s)
- Vladimir V. Kapitonov
- Genetic Information Research Institute, 1925 Landings Drive, Mountain View, CA 94043
- *To whom correspondence may be addressed.
- Jerzy Jurka
- Genetic Information Research Institute, 1925 Landings Drive, Mountain View, CA 94043
- *To whom correspondence may be addressed.
6. Winckler T, Szafranski K, Glöckner G. Transfer RNA gene-targeted integration: an adaptation of retrotransposable elements to survive in the compact Dictyostelium discoideum genome. Cytogenet Genome Res 2005; 110:288-98. [PMID: 16093681 DOI: 10.1159/000084961] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Received: 08/28/2003] [Accepted: 10/10/2003] [Indexed: 11/19/2022]
Abstract
Almost every organism carries a multitude of molecular parasites known as transposable elements (TEs). TEs influence their host genomes in many ways by expanding genome size and complexity, rearranging genomic DNA, mutagenizing host genes, and altering transcription levels of nearby genes. The eukaryotic microorganism Dictyostelium discoideum is attractive for the study of fundamental biological phenomena such as intercellular communication, formation of multicellularity, cell differentiation, and morphogenesis. D. discoideum has a highly compacted, haploid genome with less than 1 kb of genomic DNA separating coding regions. Nevertheless, TEs make up 10% of the D. discoideum genome, having managed to settle and survive in this inhospitable environment. In-depth analysis of D. discoideum genome project data has provided intriguing insights into the evolutionary challenges that mobile elements face when they invade compact genomes. D. discoideum TEs use two different mechanisms to avoid disrupting host genes upon retrotransposition. Several TEs have evolved specific targeting of tRNA gene-flanking regions as a means to avoid integration into coding regions. These elements have been dispersed on all chromosomes, closely following the distribution of tRNA genes. By contrast, TEs that lack bona fide integration specificities show a strong bias toward nested integration, thus forming large TE clusters at certain chromosomal loci that are difficult to resolve by bioinformatics approaches. We summarize our current view of D. discoideum TEs and present new data from the analysis of the complete sequences of D. discoideum chromosomes 1 and 2, which comprise more than one third of the total genome.
Affiliation(s)
- T Winckler
- Institut für Pharmazeutische Biologie, Universität Frankfurt am Main (Biozentrum), Frankfurt, Germany.
7. Pritham EJ, Feschotte C, Wessler SR. Unexpected Diversity and Differential Success of DNA Transposons in Four Species of Entamoeba Protozoans. Mol Biol Evol 2005; 22:1751-63. [PMID: 15901838 DOI: 10.1093/molbev/msi169] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.8] [Indexed: 12/21/2022]
Abstract
We report the first comprehensive analysis of transposable element content in the compact genomes (approximately 20 Mb) of four species of Entamoeba unicellular protozoans for which draft sequences are now available. Entamoeba histolytica and Entamoeba dispar, two human parasites, have many retrotransposons, but few DNA transposons. In contrast, the reptile parasite Entamoeba invadens and the free-living Entamoeba moshkovskii contain few long interspersed elements but harbor diverse and recently amplified populations of DNA transposons. Representatives of three DNA transposase superfamilies (hobo/Activator/Tam3, Mutator, and piggyBac) were identified for the first time in a protozoan species in addition to a variety of members of a fourth superfamily (Tc1/mariner), previously reported only from ciliates and Trichomonas vaginalis among protozoans. The diversity of DNA transposons and their differential amplification among closely related species with similar compact genomes are discussed in the context of the biology of Entamoeba protozoans.
Affiliation(s)
- Ellen J Pritham
- Department of Plant Biology, The University of Georgia, USA.
8. McClure MA, Donaldson E, Corro S. Potential multiple endonuclease functions and a ribonuclease H encoded in retroposon genomes. Virology 2002; 296:147-58. [PMID: 12036326 DOI: 10.1006/viro.2002.1392] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/22/2022]
Abstract
Among the retroposons, the source of the endonuclease activity is known to be variable and can be provided as either a retroviral-like integrase or a protein similar to the cellular apurinic-apyrimidinic endonuclease. It has also been reported that other retroposon and retrointron sequences have limited similarity to various eubacterial endonucleases. We investigated whether any retroposon genomes possibly encode multiple endonuclease functions. Amino acid alignments were generated and analyzed for the presence of the characterized ordered-series-of-motifs (OSM) representative of four different endonuclease functions. The results indicate that SLACS, CZAR, CRE1, CRE2, and some Trypanosoma brucei retroposon sequences encode multiple putative endonuclease functions. Interestingly, one of the endonuclease functions is embedded within the potential ribonuclease H sequence found in SLACS, CZAR, CRE1, CRE2, and R2BM retroposons.
Affiliation(s)
- Marcella A McClure
- Department of Microbiology, Montana State University, Bozeman, Montana 59717, USA.
9. Lloréns C, Marín I. A mammalian gene evolved from the integrase domain of an LTR retrotransposon. Mol Biol Evol 2001; 18:1597-600. [PMID: 11470852 DOI: 10.1093/oxfordjournals.molbev.a003947] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.0] [Indexed: 11/13/2022]
10. Glöckner G, Szafranski K, Winckler T, Dingermann T, Quail MA, Cox E, Eichinger L, Noegel AA, Rosenthal A. The complex repeats of Dictyostelium discoideum. Genome Res 2001; 11:585-94. [PMID: 11282973 PMCID: PMC311061 DOI: 10.1101/gr.162201] [Citation(s) in RCA: 43] [Impact Index Per Article: 1.8] [Indexed: 11/25/2022]
Abstract
In the course of determining the sequence of the Dictyostelium discoideum genome, we have characterized in detail the quantity and nature of interspersed repetitive elements present in this species. Several of the most abundant small complex repeats and transposons (DIRS-1; TRE3-A,B; TRE5-A; skipper; Tdd-4; H3R) have been described previously. In our analysis we have identified additional elements. Thus, we can now present a complete list of complex repetitive elements in D. discoideum. All elements add up to 10% of the genome. Some of the newly described elements belong to established classes (TRE3-C,D; TRE5-B,C; DGLT-A,P; Tdd-5). However, we have also defined two new classes of DNA transposable elements (DDT and thug) that have not been described thus far. Based on the nucleotide amount, we calculated the minimum copy number in each family. These range from <10 to >200 copies. Unique sequences adjacent to the element ends and truncation points in elements gave a measure of the fragmentation of the elements. Furthermore, we describe the diversity of single elements with regard to polymorphisms and conserved structures. All elements show insertion preference into loci in which other elements of the same family reside. The analysis of the complex repeats is a valuable data resource for the ongoing assembly of whole D. discoideum chromosomes.
Affiliation(s)
- G Glöckner
- IMB Jena, Department of Genome Analysis, D-07745 Jena, Germany.