1
Requena JM, Rastrojo A, Garde E, López MC, Thomas MC, Aguado B. Dataset for distribution of SIDER2 elements in the Leishmania major genome and transcriptome. Data Brief 2017; 11:39-43. [PMID: 28127581 PMCID: PMC5247276 DOI: 10.1016/j.dib.2017.01.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2016] [Revised: 12/27/2016] [Accepted: 01/04/2017] [Indexed: 11/19/2022] Open
Abstract
This paper contains data related to the research article entitled “Genomic cartography and proposal of nomenclature for the repeated, interspersed elements of the Leishmania major SIDER2 family and identification of SIDER2-containing transcripts” [1]. SIDER2 elements are repeated sequences, derived from now-extinct retrotransposons, that populate the genomes of protists of the genus Leishmania. This dataset (Supplementary file 1), an inventory of 1100 SIDER2 elements, was generated by surveying the complete L. major genome using bioinformatics tools, followed by manual refinement. In addition to the genomic distribution of these elements (summarized in Fig. 1), this dataset contains information regarding their association with specific transcripts, based on the recently established transcriptome for L. major [2].
Affiliation(s)
- Jose M Requena
  - Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, 28049 Madrid, Spain
- Alberto Rastrojo
  - Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, 28049 Madrid, Spain
- Esther Garde
  - Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, 28049 Madrid, Spain
- Manuel C López
  - Instituto de Parasitología y Biomedicina López-Neyra (IPBLN-CSIC), 18016 Granada, Spain
- M Carmen Thomas
  - Instituto de Parasitología y Biomedicina López-Neyra (IPBLN-CSIC), 18016 Granada, Spain
- Begoña Aguado
  - Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, 28049 Madrid, Spain
2
Requena JM, Rastrojo A, Garde E, López MC, Thomas MC, Aguado B. Genomic cartography and proposal of nomenclature for the repeated, interspersed elements of the Leishmania major SIDER2 family and identification of SIDER2-containing transcripts. Mol Biochem Parasitol 2016; 212:9-15. [PMID: 28034676 DOI: 10.1016/j.molbiopara.2016.12.009] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 06/04/2016] [Revised: 12/14/2016] [Accepted: 12/21/2016] [Indexed: 01/09/2023]
Abstract
The genomes of most eukaryotic organisms contain a large number of transposable elements that are able to move from one genomic site to another, either by direct transfer of mobile DNA elements (transposons) or via reverse transcription of an RNA intermediate (retroposons). An exception to this rule is found in protists of the subgenus Leishmania, in which active retroposons degenerated after a flourishing era, leaving only retroposon remains; these have been classified into two families: SIDER1 and SIDER2. In this work, we have re-examined the elements belonging to the SIDER2 family present in the genome of Leishmania major with the aim of providing a nomenclature that will facilitate future reference to particular elements. According to sequence conservation, the 1100 SIDER2 elements have been grouped into subfamilies, and the inferred taxonomic relationships have also been incorporated into the nomenclature. Additionally, we provide detailed data regarding the genomic distribution of these elements and their association with specific transcripts, based on the recently established transcriptome for L. major. Thus, the presented data can help to study and better understand the roles played by these degenerated retroposons in both regulation of gene expression and genome plasticity.
Affiliation(s)
- Jose M Requena
  - Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, 28049 Madrid, Spain
- Alberto Rastrojo
  - Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, 28049 Madrid, Spain
- Esther Garde
  - Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, 28049 Madrid, Spain
- Manuel C López
  - Instituto de Parasitología y Biomedicina López-Neyra (IPBLN-CSIC), 18016 Granada, Spain
- M Carmen Thomas
  - Instituto de Parasitología y Biomedicina López-Neyra (IPBLN-CSIC), 18016 Granada, Spain
- Begoña Aguado
  - Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, 28049 Madrid, Spain
3
Alonso G, Rastrojo A, López-Pérez S, Requena JM, Aguado B. Resequencing and assembly of seven complex loci to improve the Leishmania major (Friedlin strain) reference genome. Parasit Vectors 2016; 9:74. [PMID: 26857920 PMCID: PMC4746890 DOI: 10.1186/s13071-016-1329-4] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 10/05/2015] [Accepted: 01/20/2016] [Indexed: 01/22/2023] Open
Abstract
Background: Leishmania parasites cause severe human diseases known as leishmaniasis. These eukaryotic microorganisms possess an atypical chromosomal architecture and the regulation of gene expression occurs almost exclusively at post-transcriptional levels. Accordingly, sequencing of the genome of Leishmania major, and subsequently the genome of other related species, was paramount for highlighting these peculiar molecular aspects. Recently, we carried out an analysis of gene expression by massive sequencing of RNA in the L. major promastigote, and data derived from that analysis were suggestive of possible errors in the current genome assembly for this Leishmania species.
Results: During the analysis by RNA-Seq of the transcriptome for L. major Friedlin strain, 163,714 reads could not be aligned with the reference genome. Thus, de novo assembly with these reads was carried out and the resulting contigs were further analyzed. After detailed homology searches using available databases, it was postulated that 15 contigs might correspond to genomic sequences lost during the initial genome assembly of the L. major Friedlin strain. This was experimentally confirmed by PCR amplification, cloning and sequencing of the new genomic regions. As a result, we have identified seven regions of the L. major (Friedlin) genome that were lost during the sequence assembly. This led to the uncovering of six new genes (LmjF.15.1475, LmjF.15.0285, LmjF.24.0765, LmjF.14.0860, LmjF.19.0305, and LmjF.27.2035), and correction of the annotation for two others (LmjF.15.1480 and LmjF.27.2030). Our data suggest that these genomic regions probably collapsed during the genome assembly due to the existence of gene duplications and/or repeated regions surrounding the missed genes.
Conclusion: RNA-Seq data helped to reconstruct some genomic regions misassembled during the L. major Friedlin genome assembly, which is otherwise quite robust. On the other hand, this study shows that data derived from massive sequencing approaches, including RNA-Seq, should be carefully inspected to improve current genome definition and gene annotations.
Electronic supplementary material: The online version of this article (doi:10.1186/s13071-016-1329-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Graciela Alonso
  - Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, c/ Nicolás Cabrera, 1, 28049, Madrid, Spain
- Alberto Rastrojo
  - Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, c/ Nicolás Cabrera, 1, 28049, Madrid, Spain
- Sara López-Pérez
  - Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, c/ Nicolás Cabrera, 1, 28049, Madrid, Spain
- Jose M Requena
  - Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, c/ Nicolás Cabrera, 1, 28049, Madrid, Spain
- Begoña Aguado
  - Centro de Biología Molecular Severo Ochoa (CSIC-UAM), Universidad Autónoma de Madrid, c/ Nicolás Cabrera, 1, 28049, Madrid, Spain
4
Sánchez-Luque F, López MC, Macias F, Alonso C, Thomas MC. Pr77 and L1TcRz: A dual system within the 5'-end of L1Tc retrotransposon, internal promoter and HDV-like ribozyme. Mob Genet Elements 2014; 2:1-7. [PMID: 22754746 PMCID: PMC3383444 DOI: 10.4161/mge.19233] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 01/18/2023] Open
Abstract
The sequence corresponding to the first 77 nucleotides of the L1Tc and NARTc non-LTR retrotransposons from Trypanosoma cruzi is an internal promoter (Pr77) that generates abundant, although poorly translatable, unspliced transcripts. It has recently been described that L1TcRz, an HDV-like ribozyme, resides within the 5'-end of the RNA from the L1Tc and NARTc retrotransposons. Remarkably, the same first 77 nucleotides of L1Tc/NARTc elements comprise both the Pr77 internal promoter and the HDV-like L1TcRz. The L1TcRz cleaves on the 5'-side of the +1 nucleotide of the L1Tc element, ensuring that the promoter and the ribozyme functions travel with the transposon during retrotransposition. The ribozyme activity would prevent the mobilization of upstream sequences and ensure the individuality of the L1Tc/NARTc copies transcribed from associated tandems. The Pr77/L1TcRz sequence is also found in other trypanosomatid non-LTR retrotransposons and degenerated retroposons. The possible conservation of the ribozyme activity in widely degenerated retrotransposons, such as the Leishmania SIDERs, could indicate that the presence of this element and its catalytic activity play a favorable role in genetic regulation. The functional implications of the Pr77/L1TcRz dual system in the regulation of the L1Tc/NARTc retrotransposons and in the gene expression of trypanosomatids are also discussed in this paper.
5
Sánchez-Luque FJ, López MC, Carreira PE, Alonso C, Thomas MC. The wide expansion of hepatitis delta virus-like ribozymes throughout trypanosomatid genomes is linked to the spreading of L1Tc/ingi clade mobile elements. BMC Genomics 2014; 15:340. [PMID: 24884364 PMCID: PMC4035085 DOI: 10.1186/1471-2164-15-340] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 11/28/2013] [Accepted: 04/24/2014] [Indexed: 01/03/2023] Open
Abstract
Background: Hepatitis Delta Virus (HDV)-like ribozymes have recently been found in many mobile elements in which they take part in a mechanism that releases intermediate RNAs from cellular co-transcripts. L1Tc in Trypanosoma cruzi is one of the elements in which such a ribozyme is located. It lies in the so-called Pr77-hallmark, a conserved region shared by retrotransposons belonging to the trypanosomatid L1Tc/ingi clade. The wide distribution of the Pr77-hallmark detected in trypanosomatid retrotransposons renders the potential catalytic activity of these elements worthy of study: their distribution might contribute to host genetic regulation at the mRNA level. Indeed, in Leishmania spp., the pervasive presence of these HDV-like ribozyme-containing mobile elements in certain 3′-untranslated regions of protein-coding genes has been linked to mRNA downregulation.
Results: Intensive screening of publicly available trypanosomatid genomes, combined with manual folding analyses, allowed the isolation of putative Pr77-hallmarks with HDV-like ribozyme activity. This work describes the conservation of an HDV-like ribozyme structure in the Pr77 sequence of retrotransposons in a wide range of trypanosomatids, the catalytic function of which is maintained in the majority. These results are consistent with the previously suggested common phylogenetic origin of the elements that belong to this clade, although in some cases loss of functionality, and perhaps molecular domestication by the host, appears to have occurred.
Conclusions: These HDV-like ribozymes are widely distributed within retrotransposons across trypanosomatid genomes. This type of ribozyme was once thought to be rare in nature, but in fact it would seem to be abundant in trypanosomatid transcripts. It can even form part of the pool of mRNA 3′-untranslated regions, particularly in Leishmania spp. Its putative regulatory role in host genetic expression is discussed.
Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-340) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Manuel Carlos López
  - Instituto de Parasitología y Biomedicina "López-Neyra", CSIC, Parque Tecnológico de Ciencias de la Salud, Av. del Conocimiento s/n, 18016 Granada, Spain
6
Smircich P, Forteza D, El-Sayed NM, Garat B. Genomic analysis of sequence-dependent DNA curvature in Leishmania. PLoS One 2013; 8:e63068. [PMID: 23646176 PMCID: PMC3639952 DOI: 10.1371/journal.pone.0063068] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 11/28/2012] [Accepted: 03/27/2013] [Indexed: 11/26/2022] Open
Abstract
Leishmania major is a flagellated protozoan parasite of medical importance. Like other members of the Trypanosomatidae family, it possesses unique mechanisms of gene expression such as constitutive polycistronic transcription of directional gene clusters, gene amplification, mRNA trans-splicing, and extensive editing of mitochondrial transcripts. The molecular signals underlying most of these processes remain under investigation. In order to investigate the role of DNA secondary structure signals in gene expression, we carried out a genome-wide in silico analysis of the intrinsic DNA curvature. The L. major genome revealed a lower frequency of high intrinsic curvature regions, as well as inter- and intrachromosomal distribution heterogeneity, when compared to prokaryotic and eukaryotic organisms. Using a novel method aimed at detecting region-integrated intrinsic curvature (RIIC), high DNA curvature was found to be associated with regions implicated in transcription initiation. These include divergent strand-switch regions between directional gene clusters and regions linked to markers of active transcription initiation such as acetylated H3 histone, TRF4 and SNAP50. These findings suggest a role for DNA curvature in transcription initiation in Leishmania, supporting the relevance of DNA secondary structure signals.
Affiliation(s)
- Pablo Smircich
  - Laboratorio de Interacciones Moleculares, Facultad de Ciencias, Montevideo, Uruguay
  - Departamento de Genética, Facultad de Medicina, Montevideo, Uruguay
- Diego Forteza
  - Laboratorio de Interacciones Moleculares, Facultad de Ciencias, Montevideo, Uruguay
- Najib M. El-Sayed
  - Department of Cell Biology and Molecular Genetics and Center for Bioinformatics and Computational Biology, University of Maryland College Park, Maryland, United States of America
- Beatriz Garat
  - Laboratorio de Interacciones Moleculares, Facultad de Ciencias, Montevideo, Uruguay
7
Thomas MC, Macias F, Alonso C, López MC. The biology and evolution of transposable elements in parasites. Trends Parasitol 2010; 26:350-62. [PMID: 20444649 DOI: 10.1016/j.pt.2010.04.001] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Received: 07/09/2009] [Revised: 03/30/2010] [Accepted: 04/01/2010] [Indexed: 12/19/2022]
Abstract
Transposable elements (TEs) are dynamic elements that can reshape host genomes by generating rearrangements with the potential to create or disrupt genes, to shuffle existing genes, and to modulate their patterns of expression. In the genomes of parasites that infect mammals several TEs have been identified that probably have been maintained throughout evolution due to their contribution to gene function and regulation of gene expression. This review addresses how TEs are organized, how they colonize the genomes of mammalian parasites, the functional role these elements play in parasite biology, and the interactions between these elements and the parasite genome.
Affiliation(s)
- M Carmen Thomas
- Departamento de Biología Molecular, Instituto de Parasitología y Biomedicina López Neyra - CSIC, Parque Tecnológico de Ciencias de la Salud, 18100 Granada, Spain
8
Gene expression in trypanosomatid parasites. J Biomed Biotechnol 2010; 2010:525241. [PMID: 20169133 PMCID: PMC2821653 DOI: 10.1155/2010/525241] [Citation(s) in RCA: 100] [Impact Index Per Article: 7.1] [Received: 08/19/2009] [Accepted: 11/04/2009] [Indexed: 12/21/2022] Open
Abstract
The parasites Leishmania spp., Trypanosoma brucei, and Trypanosoma cruzi are the trypanosomatid protozoa that cause the deadly human diseases leishmaniasis, African sleeping sickness, and Chagas disease, respectively. These organisms possess unique mechanisms for gene expression such as constitutive polycistronic transcription of protein-coding genes and trans-splicing. Little is known about either the DNA sequences or the proteins that are involved in the initiation and termination of transcription in trypanosomatids. In silico analyses of the genome databases of these parasites led to the identification of a small number of proteins involved in gene expression. However, functional studies have revealed that trypanosomatids have more general transcription factors than originally estimated. Many posttranslational histone modifications, histone variants, and chromatin modifying enzymes have been identified in trypanosomatids, and recent genome-wide studies showed that epigenetic regulation might play a very important role in gene expression in this group of parasites. Here, we review and comment on the most recent findings related to transcription initiation and termination in trypanosomatid protozoa.
9
Smith M, Bringaud F, Papadopoulou B. Organization and evolution of two SIDER retroposon subfamilies and their impact on the Leishmania genome. BMC Genomics 2009; 10:240. [PMID: 19463167 PMCID: PMC2689281 DOI: 10.1186/1471-2164-10-240] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Received: 03/27/2009] [Accepted: 05/22/2009] [Indexed: 12/17/2022] Open
Abstract
Background: We have recently identified two large families of extinct transposable elements termed Short Interspersed DEgenerated Retroposons (SIDERs) in the parasitic protozoan Leishmania major. The characterization of SIDER elements was limited to the SIDER2 subfamily, although members of both subfamilies have been shown to play a role in the regulation of gene expression at the post-transcriptional level. Apparent functional domestication of SIDERs prompted further investigation of their characterization, dissemination and evolution throughout the Leishmania genus, with particular attention to the disregarded SIDER1 subfamily.
Results: Using optimized statistical profiles of both SIDER1 and SIDER2 subgroups, we report the first automated and highly sensitive annotation of SIDERs in the genomes of L. infantum, L. braziliensis and L. major. SIDER annotations were combined with in silico mRNA extremity predictions to generate a detailed distribution map of the repeat family, uncovering an enrichment of antisense-oriented SIDER repeats between the polyadenylation and trans-splicing sites of intergenic regions, in contrast to the exclusive sense orientation of SIDER elements within 3'UTRs. Our data indicate that SIDER elements are quite uniformly dispersed throughout all three genomes and that their distribution is generally syntenic. However, only 47.4% of orthologous genes harbor a SIDER element in all three species. There is evidence for species-specific enrichment of SIDERs and for their preferential association, especially for SIDER2s, with different metabolic functions. Investigation of the sequence attributes and evolutionary relationship of SIDERs to other trypanosomatid retroposons reveals that SIDER1 is a truncated version of extinct autonomous ingi-like retroposons (DIREs), which were functional in the ancestral Leishmania genome.
Conclusion: A detailed characterization of the sequence traits for both SIDER subfamilies unveils major differences. The SIDER1 subfamily is more heterogeneous and shows an evolutionary link with vestigial DIRE retroposons, as previously observed for the ingi/RIME and L1Tc/NARTc couples identified in the T. brucei and T. cruzi genomes, whereas no identified DIREs are related to SIDER2 sequences. Although SIDER1s and SIDER2s display equivalent genomic distribution globally, the varying degrees of sequence conservation, preferential genomic disposition, and differential association with orthologous genes allude to an intricate web of SIDER assimilation in these parasitic organisms.
Affiliation(s)
- Martin Smith
- Research Centre in Infectious Diseases, CHUL Research Centre, RC-709, 2705 Laurier Blvd, Quebec (QC), G1V4G2 Canada.