1
Liu C, Zhang Y, Li X, Jia Y, Li F, Li J, Zhang Z. Evidence of constraint in the 3D genome for trans-splicing in human cells. Sci China Life Sci 2020; 63:1380-1393. [PMID: 32221814 DOI: 10.1007/s11427-019-1609-6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/25/2019] [Accepted: 12/04/2019] [Indexed: 10/24/2022]
Abstract
Fusion transcripts are commonly found in eukaryotes, and many aberrant fusions are associated with severe diseases, including cancer. One class of fusion transcripts is generated by joining separate transcripts through trans-splicing. However, the mechanism of trans-splicing in mammals remains largely elusive. Here we showed evidence to support an intuitive hypothesis that attributes trans-splicing to the spatial proximity between premature transcripts. A novel trans-splicing detection tool (TSD) was developed to reliably identify intra-chromosomal trans-splicing events (iTSEs) from RNA-seq data. TSD can maintain a remarkable balance between sensitivity and accuracy, thus distinguishing it from most state-of-the-art tools. The accuracy of TSD was experimentally demonstrated by excluding potential false discovery from mosaic genome or template switching during PCR. We showed that iTSEs identified by TSD were frequently found between genomic regulatory elements, which are known to be more prone to interact with each other. Moreover, iTSE sites may be more physically adjacent to each other than random control in the tested human lymphoblastoid cell line according to Hi-C data. Our results suggest that trans-splicing and 3D genome architecture may be coupled in mammals and that our pipeline, TSD, may facilitate investigations of trans-splicing on a systematic and accurate level previously thought impossible.
Affiliation(s)
- Cong Liu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, 100101, China; School of Life Science, University of Chinese Academy of Sciences, Beijing, 100049, China
- Yiqun Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, 100101, China; School of Life Science, University of Chinese Academy of Sciences, Beijing, 100049, China
- Xiaoli Li
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, 100101, China; School of Life Science, University of Chinese Academy of Sciences, Beijing, 100049, China
- Yan Jia
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, 100101, China
- Feifei Li
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, 100101, China
- Jing Li
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, 100101, China
- Zhihua Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, and China National Center for Bioinformation, Chinese Academy of Sciences, Beijing, 100101, China; School of Life Science, University of Chinese Academy of Sciences, Beijing, 100049, China
2
Boroni M, Sammeth M, Gava SG, Jorge NAN, Macedo AM, Machado CR, Mourão MM, Franco GR. Landscape of the spliced leader trans-splicing mechanism in Schistosoma mansoni. Sci Rep 2018; 8:3877. [PMID: 29497070 PMCID: PMC5832876 DOI: 10.1038/s41598-018-22093-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 07/12/2017] [Accepted: 02/12/2018] [Indexed: 11/09/2022] Open
Abstract
Spliced leader dependent trans-splicing (SLTS) has been described as an important RNA regulatory process that occurs in different organisms, including the trematode Schistosoma mansoni. We identified more than seven thousand putative SLTS sites in the parasite, comprising genes with a wide spectrum of functional classes, which underlines the SLTS as a ubiquitous mechanism in the parasite. Also, SLTS gene expression levels span several orders of magnitude, showing that SLTS frequency is not determined by the expression level of the target gene, but by the presence of particular gene features facilitating or hindering the trans-splicing mechanism. Our in-depth investigation of SLTS events demonstrates widespread alternative trans-splicing (ATS) acceptor sites occurring in different regions along the entire gene body, highlighting another important role of SLTS generating alternative RNA isoforms in the parasite, besides the polycistron resolution. Particularly for introns where SLTS directly competes for the same acceptor substrate with cis-splicing, we identified for the first time additional and important features that might determine the type of splicing. Our study substantially extends the current knowledge of RNA processing by SLTS in S. mansoni, and provides a basis for future studies on the trans-splicing mechanism in other eukaryotes.
Affiliation(s)
- Mariana Boroni
- Laboratório de Genética Bioquímica, Departamento de Bioquímica e Imunologia, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil
- Laboratório de Bioinformática e Biologia Computacional, Coordenação de Pesquisa, Instituto Nacional de Câncer José Alencar Gomes da Silva, Rio de Janeiro, 20231-050, Brazil
- Michael Sammeth
- Bioinformatics in Transcriptomics and Functional Genomics (BITFUN), Instituto de Biofísica Carlos Chagas Filho, Universidade Federal do Rio de Janeiro, Rio de Janeiro, 21941-901, Brazil
- Laboratório Nacional de Computação Científica, Petrópolis, 25651-075, Brazil
- Sandra Grossi Gava
- Grupo de Helmintologia e Malacologia Médica, Instituto René Rachou, Fundação Oswaldo Cruz, Belo Horizonte, 30190-009, Brazil
- Natasha Andressa Nogueira Jorge
- Laboratório de Bioinformática e Biologia Computacional, Coordenação de Pesquisa, Instituto Nacional de Câncer José Alencar Gomes da Silva, Rio de Janeiro, 20231-050, Brazil
- Andréa Mara Macedo
- Laboratório de Genética Bioquímica, Departamento de Bioquímica e Imunologia, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil
- Carlos Renato Machado
- Laboratório de Genética Bioquímica, Departamento de Bioquímica e Imunologia, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil
- Marina Moraes Mourão
- Grupo de Helmintologia e Malacologia Médica, Instituto René Rachou, Fundação Oswaldo Cruz, Belo Horizonte, 30190-009, Brazil.
- Glória Regina Franco
- Laboratório de Genética Bioquímica, Departamento de Bioquímica e Imunologia, Universidade Federal de Minas Gerais, Belo Horizonte, 31270-901, Brazil.
3
Tourasse NJ, Millet JRM, Dupuy D. Quantitative RNA-seq meta-analysis of alternative exon usage in C. elegans. Genome Res 2017; 27:2120-2128. [PMID: 29089372 PMCID: PMC5741048 DOI: 10.1101/gr.224626.117] [Citation(s) in RCA: 30] [Impact Index Per Article: 4.3] [Received: 05/10/2017] [Accepted: 10/26/2017] [Indexed: 12/16/2022]
Abstract
Almost 20 years after the completion of the C. elegans genome sequence, gene structure annotation is still an ongoing process with new evidence for gene variants still being regularly uncovered by additional in-depth transcriptome studies. While alternative splice forms can allow a single gene to encode several functional isoforms, the question of how much spurious splicing is tolerated is still heavily debated. Here we gathered a compendium of 1682 publicly available C. elegans RNA-seq data sets to increase the dynamic range of detection of RNA isoforms, and obtained robust measurements of the relative abundance of each splicing event. While most of the splicing reads come from reproducibly detected splicing events, a large fraction of purported junctions is only supported by a very low number of reads. We devised an automated curation method that takes into account the expression level of each gene to discriminate robust splicing events from potential biological noise. We found that rarely used splice sites disproportionately come from highly expressed genes and are significantly less conserved in other nematode genomes than splice sites with a higher usage frequency. Our increased detection power confirmed trans-splicing for at least 84% of C. elegans protein coding genes. The genes for which trans-splicing was not observed are overwhelmingly low expression genes, suggesting that the mechanism is pervasive but not fully captured by organism-wide RNA-seq. We generated annotated gene models including quantitative exon usage information for the entire C. elegans genome. This allows users to visualize at a glance the relative expression of each isoform for their gene of interest.
Affiliation(s)
- Nicolas J Tourasse
- Université de Bordeaux, Inserm U1212, CNRS UMR5320, Institut Européen de Chimie et Biologie (IECB), 33607 Pessac, France
- Jonathan R M Millet
- Université de Bordeaux, Inserm U1212, CNRS UMR5320, Institut Européen de Chimie et Biologie (IECB), 33607 Pessac, France
- Denis Dupuy
- Université de Bordeaux, Inserm U1212, CNRS UMR5320, Institut Européen de Chimie et Biologie (IECB), 33607 Pessac, France
4
Saito TL, Hashimoto SI, Gu SG, Morton JJ, Stadler M, Blumenthal T, Fire A, Morishita S. The transcription start site landscape of C. elegans. Genome Res 2013; 23:1348-61. [PMID: 23636945 PMCID: PMC3730108 DOI: 10.1101/gr.151571.112] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.9] [Received: 11/02/2012] [Accepted: 04/18/2013] [Indexed: 11/24/2022]
Abstract
More than half of Caenorhabditis elegans pre-mRNAs lose their original 5' ends in a process termed "trans-splicing" in which the RNA extending from the transcription start site (TSS) to the site of trans-splicing of the primary transcript, termed the "outron," is replaced with a 22-nt spliced leader. This complicates the mapping of TSSs, leading to a lack of available TSS mapping data for these genes. We used growth at low temperature and nuclear isolation to enrich for transcripts still containing outrons, applying a modified SAGE capture procedure and high-throughput sequencing to characterize 5' termini in this transcript population. We report from these data both a landscape of 5'-end utilization for C. elegans and a representative collection of TSSs for 7351 trans-spliced genes. TSS distributions for individual genes were often dispersed, with a greater average number of TSSs for trans-spliced genes, suggesting that trans-splicing may remove selective pressure for a single TSS. Upstream of newly defined TSSs, we observed well-known motifs (including TATAA-box and SP1) as well as novel motifs. Several of these motifs showed association with tissue-specific expression and/or conservation among six worm species. Comparing TSS features between trans-spliced and non-trans-spliced genes, we found stronger signals among outron TSSs for preferential positioning of flanking nucleosomes and for downstream Pol II enrichment. Our data provide an enabling resource for both experimental and theoretical analysis of gene structure and function in C. elegans.
Affiliation(s)
- Taro Leo Saito
- Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa 277-0882, Japan
- Shin-ichi Hashimoto
- Department of Laboratory Medicine, Faculty of Medicine, Kanazawa University, Kanazawa, 920-8641 Japan
- Sam Guoping Gu
- Department of Pathology, School of Medicine, Stanford University, Stanford, California 94305-5324, USA
- J. Jason Morton
- Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, Colorado 80309-0347, USA
- Michael Stadler
- Department of Pathology, School of Medicine, Stanford University, Stanford, California 94305-5324, USA
- Thomas Blumenthal
- Molecular, Cellular, and Developmental Biology, University of Colorado, Boulder, Colorado 80309-0347, USA
- Andrew Fire
- Departments of Pathology and Genetics, School of Medicine, Stanford University, Stanford, California 94305-5324, USA
- Shinichi Morishita
- Department of Computational Biology, Graduate School of Frontier Sciences, The University of Tokyo, Kashiwa 277-0882, Japan
5
Sleumer MC, Wei G, Wang Y, Chang H, Xu T, Chen R, Zhang MQ. Regulatory elements of Caenorhabditis elegans ribosomal protein genes. BMC Genomics 2012; 13:433. [PMID: 22928635 PMCID: PMC3575287 DOI: 10.1186/1471-2164-13-433] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 03/22/2012] [Accepted: 08/17/2012] [Indexed: 01/16/2023] Open
Abstract
Background: Ribosomal protein genes (RPGs) are essential, tightly regulated, and highly expressed during embryonic development and cell growth. Even though their protein sequences are strongly conserved, their mechanism of regulation is not conserved across yeast, Drosophila, and vertebrates. A recent investigation of genomic sequences conserved across both nematode species and associated with different gene groups indicated the existence of several elements in the upstream regions of C. elegans RPGs, providing a new insight regarding the regulation of these genes in C. elegans.
Results: In this study, we performed an in-depth examination of C. elegans RPG regulation and found nine highly conserved motifs in the upstream regions of C. elegans RPGs using the motif discovery algorithm DME. Four motifs were partially similar to transcription factor binding sites from C. elegans, Drosophila, yeast, and human. One pair of these motifs was found to co-occur in the upstream regions of 250 transcripts including 22 RPGs. The distance between the two motifs displayed a complex frequency pattern that was related to their relative orientation. We tested the impact of three of these motifs on the expression of rpl-2 using a series of reporter gene constructs and showed that all three motifs are necessary to maintain the high natural expression level of this gene. One of the motifs was similar to the binding site of an orthologue of POP-1, and we showed that RNAi knockdown of pop-1 impacts the expression of rpl-2. We further determined the transcription start site of rpl-2 by 5’ RACE and found that the motifs lie 40–90 bases upstream of the start site. We also found evidence that a noncoding RNA, contained within the outron of rpl-2, is co-transcribed with rpl-2 and cleaved during trans-splicing.
Conclusions: Our results indicate that C. elegans RPGs are regulated by a complex novel series of regulatory elements that is evolutionarily distinct from those of all other species examined up until now.
Affiliation(s)
- Monica C Sleumer
- Bioinformatics Division, Center for Synthetic and Systems Biology, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing, China
6
DeMaso CR, Kovacevic I, Uzun A, Cram EJ. Structural and functional evaluation of C. elegans filamins FLN-1 and FLN-2. PLoS One 2011; 6:e22428. [PMID: 21799850 PMCID: PMC3143143 DOI: 10.1371/journal.pone.0022428] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 04/26/2011] [Accepted: 06/23/2011] [Indexed: 11/23/2022] Open
Abstract
Filamins are long, flexible, multi-domain proteins composed of an N-terminal actin-binding domain (ABD) followed by multiple immunoglobulin-like repeats (IgFLN). They function to organize and maintain the actin cytoskeleton, to provide scaffolds for signaling components, and to act as mechanical force sensors. In this study, we used transcript sequencing and homology modeling to characterize the gene and protein structures of the C. elegans filamin orthologs fln-1 and fln-2. Our results reveal that C. elegans FLN-1 is well conserved at the sequence level to vertebrate filamins, particularly in the ABD and several key IgFLN repeats. Both FLN-1 and the more divergent FLN-2 colocalize with actin in vivo. FLN-2 is poorly conserved, with at least 23 IgFLN repeats interrupted by large regions that appear to be nematode-specific. Our results indicate that many of the key features of vertebrate filamins are preserved in C. elegans FLN-1 and FLN-2, and suggest the nematode may be a very useful model system for further study of filamin function.
Affiliation(s)
- Christina R. DeMaso
- Department of Biology, Center for Interdisciplinary Research on Complex Systems, Northeastern University, Boston, Massachusetts, United States of America
- Ismar Kovacevic
- Department of Biology, Center for Interdisciplinary Research on Complex Systems, Northeastern University, Boston, Massachusetts, United States of America
- Alper Uzun
- Department of Pediatrics, Women and Infants Hospital of Rhode Island, Brown Alpert Medical School, Center for Computational Molecular Biology, Brown University, Providence, Rhode Island, United States of America
- Erin J. Cram
- Department of Biology, Center for Interdisciplinary Research on Complex Systems, Northeastern University, Boston, Massachusetts, United States of America
7
Grishkevich V, Hashimshony T, Yanai I. Core promoter T-blocks correlate with gene expression levels in C. elegans. Genome Res 2011; 21:707-17. [PMID: 21367940 PMCID: PMC3083087 DOI: 10.1101/gr.113381.110] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.4] [Received: 07/28/2010] [Accepted: 02/17/2011] [Indexed: 02/01/2023]
Abstract
Core promoters mediate transcription initiation by the integration of diverse regulatory signals encoded in the proximal promoter and enhancers. It has been suggested that genes under simple regulation may have low-complexity permissive promoters. For these genes, the core promoter may serve as the principal regulatory element; however, the mechanism by which this occurs is unclear. We report here a periodic poly-thymine motif, which we term T-blocks, enriched in occurrences within core promoter forward strands in Caenorhabditis elegans. An increasing number of T-blocks on either strand is associated with increasing nucleosome eviction. Strikingly, only forward strand T-blocks are correlated with expression levels, whereby genes with ≥6 T-blocks have fivefold higher expression levels than genes with ≤3 T-blocks. We further demonstrate that differences in T-block numbers between strains predictably affect expression levels of orthologs. Highly expressed genes and genes in operons tend to have a large number of T-blocks, as well as the previously characterized SL1 motif involved in trans-splicing. The presence of T-blocks thus correlates with low nucleosome occupancy and the precision of a trans-splicing motif, suggesting its role at both the DNA and RNA levels. Collectively, our results suggest that core promoters may tune gene expression levels through the occurrences of T-blocks, independently of the spatio-temporal regulation mediated by the proximal promoter.
Affiliation(s)
- Tamar Hashimshony
- Department of Biology, Technion–Israel Institute of Technology, Haifa 32000, Israel
- Itai Yanai
- Department of Biology, Technion–Israel Institute of Technology, Haifa 32000, Israel
8
Khare P, Mortimer SI, Cleto CL, Okamura K, Suzuki Y, Kusakabe T, Nakai K, Meedel TH, Hastings KEM. Cross-validated methods for promoter/transcription start site mapping in SL trans-spliced genes, established using the Ciona intestinalis troponin I gene. Nucleic Acids Res 2011; 39:2638-48. [PMID: 21109525 PMCID: PMC3074122 DOI: 10.1093/nar/gkq1151] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 08/22/2010] [Revised: 10/22/2010] [Accepted: 10/25/2010] [Indexed: 11/12/2022] Open
Abstract
In conventionally-expressed eukaryotic genes, transcription start sites (TSSs) can be identified by mapping the mature mRNA 5'-terminal sequence onto the genome. However, this approach is not applicable to genes that undergo pre-mRNA 5'-leader trans-splicing (SL trans-splicing) because the original 5'-segment of the primary transcript is replaced by the spliced leader sequence during the trans-splicing reaction and is discarded. Thus TSS mapping for trans-spliced genes requires different approaches. We describe two such approaches and show that they generate precisely agreeing results for an SL trans-spliced gene encoding the muscle protein troponin I in the ascidian tunicate chordate Ciona intestinalis. One method is based on experimental deletion of trans-splice acceptor sites and the other is based on high-throughput mRNA 5'-RACE sequence analysis of natural RNA populations in order to detect minor transcripts containing the pre-mRNA's original 5'-end. Both methods identified a single major troponin I TSS located ∼460 nt upstream of the trans-splice acceptor site. Further experimental analysis identified a functionally important TATA element 31 nt upstream of the start site. The two methods employed have complementary strengths and are broadly applicable to mapping promoters/TSSs for trans-spliced genes in tunicates and in trans-splicing organisms from other phyla.
Affiliation(s)
- Parul Khare
- Montreal Neurological Institute and Department of Biology, McGill University, 3801 University St., Montreal, Quebec, Canada H3A 2B4, Biology Department, Rhode Island College, Providence, RI 02908, USA, Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokane-dai, Minato-ku, Tokyo, 108-8639 and Department of Biology, Faculty of Science and Engineering, Konan University, 8-9-1 Okamoto, Higashinada-ku, Kobe 658-8501, Japan
- Sandra I. Mortimer
- Montreal Neurological Institute and Department of Biology, McGill University, 3801 University St., Montreal, Quebec, Canada H3A 2B4, Biology Department, Rhode Island College, Providence, RI 02908, USA, Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokane-dai, Minato-ku, Tokyo, 108-8639 and Department of Biology, Faculty of Science and Engineering, Konan University, 8-9-1 Okamoto, Higashinada-ku, Kobe 658-8501, Japan
- Cynthia L. Cleto
- Montreal Neurological Institute and Department of Biology, McGill University, 3801 University St., Montreal, Quebec, Canada H3A 2B4, Biology Department, Rhode Island College, Providence, RI 02908, USA, Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokane-dai, Minato-ku, Tokyo, 108-8639 and Department of Biology, Faculty of Science and Engineering, Konan University, 8-9-1 Okamoto, Higashinada-ku, Kobe 658-8501, Japan
- Kohji Okamura
- Montreal Neurological Institute and Department of Biology, McGill University, 3801 University St., Montreal, Quebec, Canada H3A 2B4, Biology Department, Rhode Island College, Providence, RI 02908, USA, Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokane-dai, Minato-ku, Tokyo, 108-8639 and Department of Biology, Faculty of Science and Engineering, Konan University, 8-9-1 Okamoto, Higashinada-ku, Kobe 658-8501, Japan
- Yutaka Suzuki
- Montreal Neurological Institute and Department of Biology, McGill University, 3801 University St., Montreal, Quebec, Canada H3A 2B4, Biology Department, Rhode Island College, Providence, RI 02908, USA, Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokane-dai, Minato-ku, Tokyo, 108-8639 and Department of Biology, Faculty of Science and Engineering, Konan University, 8-9-1 Okamoto, Higashinada-ku, Kobe 658-8501, Japan
- Takehiro Kusakabe
- Montreal Neurological Institute and Department of Biology, McGill University, 3801 University St., Montreal, Quebec, Canada H3A 2B4, Biology Department, Rhode Island College, Providence, RI 02908, USA, Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokane-dai, Minato-ku, Tokyo, 108-8639 and Department of Biology, Faculty of Science and Engineering, Konan University, 8-9-1 Okamoto, Higashinada-ku, Kobe 658-8501, Japan
- Kenta Nakai
- Montreal Neurological Institute and Department of Biology, McGill University, 3801 University St., Montreal, Quebec, Canada H3A 2B4, Biology Department, Rhode Island College, Providence, RI 02908, USA, Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokane-dai, Minato-ku, Tokyo, 108-8639 and Department of Biology, Faculty of Science and Engineering, Konan University, 8-9-1 Okamoto, Higashinada-ku, Kobe 658-8501, Japan
- Thomas H. Meedel
- Montreal Neurological Institute and Department of Biology, McGill University, 3801 University St., Montreal, Quebec, Canada H3A 2B4, Biology Department, Rhode Island College, Providence, RI 02908, USA, Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokane-dai, Minato-ku, Tokyo, 108-8639 and Department of Biology, Faculty of Science and Engineering, Konan University, 8-9-1 Okamoto, Higashinada-ku, Kobe 658-8501, Japan
- Kenneth E. M. Hastings
- Montreal Neurological Institute and Department of Biology, McGill University, 3801 University St., Montreal, Quebec, Canada H3A 2B4, Biology Department, Rhode Island College, Providence, RI 02908, USA, Human Genome Center, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokane-dai, Minato-ku, Tokyo, 108-8639 and Department of Biology, Faculty of Science and Engineering, Konan University, 8-9-1 Okamoto, Higashinada-ku, Kobe 658-8501, Japan
9
Wang Y, Chen J, Wei G, He H, Zhu X, Xiao T, Yuan J, Dong B, He S, Skogerbø G, Chen R. The Caenorhabditis elegans intermediate-size transcriptome shows high degree of stage-specific expression. Nucleic Acids Res 2011; 39:5203-14. [PMID: 21378118 PMCID: PMC3130273 DOI: 10.1093/nar/gkr102] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 01/14/2023] Open
Abstract
Earlier studies have revealed a substantial amount of transcriptional activity occurring outside annotated protein-coding genes of the Caenorhabditis elegans genome. One important fraction of this transcriptional activity relates to intermediate-size (70–500 nt) transcripts (is-ncRNAs) of mostly unknown function. Profiling the expression of this segment of the transcriptome on a tiling array through the C. elegans life cycle identified 5866 hitherto unannotated transcripts. The novel loci were distributed across intronic and intergenic space, with some enrichment toward protein-coding gene termini. The majority of the putative is-ncRNAs showed either stage-specific expression, or distinct developmental variation in their expression levels. More than 200 loci showed male-specific expression, and conserved loci were significantly enriched on the X chromosome, both observations strongly suggesting involvement of is-ncRNAs in sex-specific functions. Half of the novel loci were conserved in other nematodes, and numerous loci showed significant conservational correlations to nearby coding genes. Assuming functional roles for most of the novel loci, the data imply a nematode is-ncRNA tool kit of considerable size and variety.
Affiliation(s)
- Yunfei Wang
- Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China
10
Allen MA, Hillier LW, Waterston RH, Blumenthal T. A global analysis of C. elegans trans-splicing. Genome Res 2010; 21:255-64. [PMID: 21177958 DOI: 10.1101/gr.113811.110] [Citation(s) in RCA: 129] [Impact Index Per Article: 9.2] [Indexed: 01/22/2023]
Abstract
Trans-splicing of one of two short leader RNAs, SL1 or SL2, occurs at the 5' ends of pre-mRNAs of many C. elegans genes. We have exploited RNA-sequencing data from the modENCODE project to analyze the transcriptome of C. elegans for patterns of trans-splicing. Transcripts of ∼70% of genes are trans-spliced, similar to earlier estimates based on analysis of far fewer genes. The mRNAs of most trans-spliced genes are spliced to either SL1 or SL2, but most genes are not trans-spliced to both, indicating that SL1 and SL2 trans-splicing use different underlying mechanisms. SL2 trans-splicing occurs in order to separate the products of genes in operons genome wide. Shorter intercistronic distance is associated with greater use of SL2. Finally, increased use of SL1 trans-splicing to downstream operon genes can indicate the presence of an extra promoter in the intercistronic region, creating what has been termed a "hybrid" operon. Within hybrid operons the presence of the two promoters results in the use of the two SL classes: Transcription that originates at the promoter upstream of another gene creates a polycistronic pre-mRNA that receives SL2, whereas transcription that originates at the internal promoter creates transcripts that receive SL1. Overall, our data demonstrate that >17% of all C. elegans genes are in operons.
Affiliation(s)
- Mary Ann Allen
- Department of Molecular, Cellular, and Developmental Biology, University of Colorado at Boulder, Colorado 80309, USA
11
Lasda EL, Allen MA, Blumenthal T. Polycistronic pre-mRNA processing in vitro: snRNP and pre-mRNA role reversal in trans-splicing. Genes Dev 2010; 24:1645-58. [PMID: 20624853 DOI: 10.1101/gad.1940010] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 12/18/2022]
Abstract
Spliced leader (SL) trans-splicing in Caenorhabditis elegans attaches a 22-nucleotide (nt) exon onto the 5' end of many mRNAs. A particular class of SL, SL2, splices mRNAs of downstream operon genes. Here we use an embryonic extract-based in vitro splicing system to show that SL2 specificity information is encoded within the polycistronic pre-mRNA, and that trans-splicing specificity is recapitulated in vitro. We define an RNA sequence required for SL2 trans-splicing, the U-rich (Ur) element, through mutational analysis and bioinformatics as a short stem-loop followed by a sequence motif, UAYYUU, located approximately 50 nt upstream of the trans-splice site. Furthermore, this element is predicted in intercistronic regions of numerous operons of C. elegans and other species that use SL2 trans-splicing. We propose that the UAYYUU motif hybridizes with the 5' splice site on the SL2 RNA to recruit the SL to the pre-mRNA. In this way, the UAYYUU motif in the pre-mRNA would serve an analogous function to the similar sequence in the U1 snRNA, which binds to the 5' splice site of introns, effectively reversing the roles of snRNP and pre-mRNA in trans-splicing.
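As an illustration of the motif described in this abstract, the following is a minimal Python sketch that scans the region upstream of a trans-splice site for UAYYUU, reading Y as the IUPAC pyrimidine code (C or U). The function name `find_ur_motifs`, the 100-nt default window, and the example sequence are assumptions made for illustration only. The sketch does not check for the preceding stem-loop and should not be read as the authors' method.

```python
import re

# IUPAC "Y" means a pyrimidine, i.e. C or U in RNA.
# A lookahead is used so that overlapping candidates are all reported.
UR_MOTIF = re.compile(r"(?=UA[CU][CU]UU)")


def find_ur_motifs(pre_mrna, trans_splice_site, window=100):
    """Return (motif_start, distance_upstream) for each UAYYUU match.

    pre_mrna          -- RNA or DNA string (T is converted to U)
    trans_splice_site -- 0-based index of the first nucleotide after the
                         trans-splice site (hypothetical convention)
    window            -- how many nt upstream to scan (assumed default)
    """
    rna = pre_mrna.upper().replace("T", "U")
    lo = max(0, trans_splice_site - window)
    region = rna[lo:trans_splice_site]
    hits = []
    for match in UR_MOTIF.finditer(region):
        start = lo + match.start()
        hits.append((start, trans_splice_site - start))
    return hits


# Made-up sequence: one motif placed 56 nt upstream of the splice site.
example = "G" * 20 + "UACUUU" + "A" * 44 + "UUUCAG"
print(find_ur_motifs(example, len(example)))  # [(20, 56)]
```

Filtering the returned distances to values near 50 would approximate the positional constraint stated in the abstract; the exact tolerance is not given there and would have to be chosen by the user.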
Affiliation(s)
- Erika L Lasda
- Department of Biochemistry and Molecular Genetics, University of Colorado Denver, Anschutz Medical Campus, Aurora, Colorado 80045, USA
12
Matsumoto J, Dewar K, Wasserscheid J, Wiley GB, Macmil SL, Roe BA, Zeller RW, Satou Y, Hastings KEM. High-throughput sequence analysis of Ciona intestinalis SL trans-spliced mRNAs: alternative expression modes and gene function correlates. Genome Res 2010; 20:636-45. [PMID: 20212022 DOI: 10.1101/gr.100271.109] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.7] [Indexed: 11/25/2022]
Abstract
Pre-mRNA 5' spliced-leader (SL) trans-splicing occurs in some metazoan groups but not in others. Genome-wide characterization of the trans-spliced mRNA subpopulation has not yet been reported for any metazoan. We carried out a high-throughput analysis of the SL trans-spliced mRNA population of the ascidian tunicate Ciona intestinalis by 454 Life Sciences (Roche) pyrosequencing of SL-PCR-amplified random-primed reverse transcripts of tailbud embryo RNA. We obtained approximately 250,000 high-quality reads corresponding to 8790 genes, approximately 58% of the Ciona total gene number. The great depth of this data revealed new aspects of trans-splicing, including the existence of a significant class of "infrequently trans-spliced" genes, accounting for approximately 28% of represented genes, that generate largely non-trans-spliced mRNAs, but also produce trans-spliced mRNAs, in part through alternative promoter use. Thus, the conventional qualitative dichotomy of trans-spliced versus non-trans-spliced genes should be supplanted by a more accurate quantitative view recognizing frequently and infrequently trans-spliced gene categories. Our data include reads representing approximately 80% of Ciona frequently trans-spliced genes. Our analysis also revealed significant use of closely spaced alternative trans-splice acceptor sites which further underscores the mechanistic similarity of cis- and trans-splicing and indicates that the prevalence of +/-3-nt alternative splicing events at tandem acceptor sites, NAGNAG, is driven by spliceosomal mechanisms, and not nonsense-mediated decay, or selection at the protein level. The breadth of gene representation data enabled us to find new correlations between trans-splicing status and gene function, namely the overrepresentation in the frequently trans-spliced gene class of genes associated with plasma/endomembrane system, Ca(2+) homeostasis, and actin cytoskeleton.
Affiliation(s)
- Jun Matsumoto
- Department of Neurology & Neurosurgery, McGill University, Montreal Neurological Institute, Montréal, Québec H3A 2B4, Canada
13
Liu C, Chauhan C, Katholi CR, Unnasch TR. The splice leader addition domain represents an essential conserved motif for heterologous gene expression in B. malayi. Mol Biochem Parasitol 2009; 166:15-21. [PMID: 19428668 PMCID: PMC2680783 DOI: 10.1016/j.molbiopara.2009.02.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Received: 09/19/2008] [Revised: 02/10/2009] [Accepted: 02/11/2009] [Indexed: 11/28/2022]
Abstract
Two promoters from the human filarial parasite Brugia malayi have been mapped in detail. The essential domains of both promoters lacked canonical eukaryotic core promoter motifs. However, the largest contiguous essential domain in both promoters flanked and included the splice leader addition site. These findings suggested that the region flanking the trans-splicing addition site might represent a conserved core domain in B. malayi promoters. To test this hypothesis, the putative promoters of 12 trans-spliced genes encoding ribosomal protein homologues from B. malayi were isolated and tested for activity in a B. malayi transient transfection system. Of the 12 domains examined, 11 produced detectable reporter gene activity. Mutant constructs of the six most active promoters were prepared in which the spliced leader acceptor site and the 10 nt upstream and downstream of the site were deleted. All deletion constructs exhibited >90% reduction in reporter gene activity relative to their respective wild type sequences. A conserved pyrimidine-rich tract was located directly upstream from the spliced leader splice acceptor site which contained a conserved T residue located at position -3. Mutation of the entire polypyrimidine tract or the conserved T individually resulted in the loss of over 90% of reporter gene activity. In contrast, mutation of the splice acceptor site did not significantly reduce promoter activity. These data suggest that the region surrounding the splice acceptor site in the ribosomal promoters represents a conserved essential domain which functions independently of splice leader addition.
Affiliation(s)
- Canhui Liu
- Global Health Infectious Disease Research, Department of Global Health, College of Public Health, University of South Florida, Tampa, FL
- Chitra Chauhan
- Global Health Infectious Disease Research, Department of Global Health, College of Public Health, University of South Florida, Tampa, FL
- Charles R. Katholi
- Department of Biostatistics, School of Public Health, University of Alabama at Birmingham, Birmingham, AL
- Thomas R. Unnasch
- Global Health Infectious Disease Research, Department of Global Health, College of Public Health, University of South Florida, Tampa, FL
14
Mitreva M, Elling AA, Dante M, Kloek AP, Kalyanaraman A, Aluru S, Clifton SW, Bird DM, Baum TJ, McCarter JP. A survey of SL1-spliced transcripts from the root-lesion nematode Pratylenchus penetrans. Mol Genet Genomics 2004; 272:138-48. [PMID: 15338281 DOI: 10.1007/s00438-004-1054-0] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.5] [Received: 03/30/2004] [Accepted: 08/06/2004] [Indexed: 10/26/2022]
Abstract
Plant-parasitic nematodes are important and cosmopolitan pathogens of crops. Here, we describe the generation and analysis of 1928 expressed sequence tags (ESTs) of a splice-leader 1 (SL1) library from mixed life stages of the root-lesion nematode Pratylenchus penetrans. The ESTs were grouped into 420 clusters and classified by function using the Gene Ontology (GO) hierarchy and the Kyoto KEGG database. Approximately 80% of all translated clusters show homology to Caenorhabditis elegans proteins, and 37% of the C. elegans gene homologs had confirmed phenotypes as assessed by RNA interference tests. Use of an SL1-PCR approach, while ensuring the cloning of the 5' ends of mRNAs, has demonstrated bias toward short transcripts. Putative nematode-specific and Pratylenchus-specific genes were identified, and their implications for nematode control strategies are discussed.
Affiliation(s)
- M Mitreva
- Genome Sequencing Center, Department of Genetics, Washington University School of Medicine, St. Louis, MO 63108, USA
15
Hwang BJ, Müller HM, Sternberg PW. Genome annotation by high-throughput 5' RNA end determination. Proc Natl Acad Sci U S A 2004; 101:1650-5. [PMID: 14757812 PMCID: PMC341809 DOI: 10.1073/pnas.0308384100] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Received: 11/20/2003] [Indexed: 11/18/2022] Open
Abstract
Complete gene identification and annotation, including alternative transcripts, remains a challenge in understanding genome organization. Such annotation can be achieved by a combination of computational analysis and experimental confirmation. Here, we describe a high-throughput technique, trans-spliced exon coupled RNA end determination (TEC-RED), that identifies 5' ends of expressed genes in nematodes. TEC-RED can distinguish coding regions from regulatory regions and identify genes as well as their alternative transcripts that have different 5' ends. Application of TEC-RED to approximately 10% of the Caenorhabditis elegans genome yielded tags 75% of which experimentally verified predicted 5'-RNA ends and 25% of which provided previously unknown information about 5'-RNA ends, including the identification of 99 previously unknown genes and 32 previously unknown operons. This technique will be applicable in any organisms that have a trans-splicing reaction from spliced leader RNA. We also describe an efficient sequential method for concatenating short sequence tags for any serial analysis of gene expression-like techniques.
Affiliation(s)
- Byung Joon Hwang
- Howard Hughes Medical Institute and Division of Biology, 156-29, California Institute of Technology, 1200 East California Boulevard, Pasadena, CA 91125, USA
16
Evans D, Perez I, MacMorris M, Leake D, Wilusz CJ, Blumenthal T. A complex containing CstF-64 and the SL2 snRNP connects mRNA 3' end formation and trans-splicing in C. elegans operons. Genes Dev 2001; 15:2562-71. [PMID: 11581161 PMCID: PMC312790 DOI: 10.1101/gad.920501] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.5] [Indexed: 11/24/2022]
Abstract
Polycistronic pre-mRNAs from Caenorhabditis elegans are processed by 3' end formation of the upstream mRNA and SL2-specific trans-splicing of the downstream mRNA. These processes usually occur within an approximately 100-nucleotide region and are mechanistically coupled. In this paper, we report a complex in C. elegans extracts containing the 3' end formation protein CstF-64 and the SL2 snRNP. This complex, immunoprecipitated with αCstF-64 antibody, contains SL2 RNA, but not SL1 RNA or other U snRNAs. Using mutational analysis we have been able to uncouple SL2 snRNP function and identity. SL2 RNA with a mutation in stem/loop III is functional in vivo as a trans-splice donor, but fails to splice to SL2-accepting trans-splice sites, suggesting that it has lost its identity as an SL2 snRNP. Importantly, stem/loop III mutations prevent association of SL2 RNA with CstF-64. In contrast, a mutation in stem II that inactivates the SL2 snRNP still permits complex formation with CstF-64. Therefore, SL2 RNA stem/loop III is required for both SL2 identity and formation of a complex containing CstF-64, but not for trans-splicing. These results provide a molecular framework for the coupling of 3' end formation and trans-splicing in the processing of polycistronic pre-mRNAs from C. elegans operons.
Affiliation(s)
- D Evans
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Denver, CO 80262, USA
17
Abstract
We report the discovery of mRNA 5'-leader trans-splicing (SL trans-splicing) in the chordates. In the ascidian protochordate Ciona intestinalis, the mRNAs of at least seven genes undergo trans-splicing of a 16-nucleotide 5'-leader apparently derived from a 46-nucleotide RNA that shares features with previously characterized splice donor SL RNAs. SL trans-splicing was known previously to occur in several protist and metazoan phyla, however, this is the first report of SL trans-splicing within the deuterostome division of the metazoa. SL trans-splicing is not known to occur in the vertebrates. However, because ascidians are primitive chordates related to vertebrate ancestors, our findings raise the possibility of ancestral SL trans-splicing in the vertebrate lineage.
Affiliation(s)
- A E Vandenberghe
- Montreal Neurological Institute and Biology Department, McGill University, Montreal, Quebec, Canada H3A 2B4
18
Huang T, Kuersten S, Deshpande AM, Spieth J, MacMorris M, Blumenthal T. Intercistronic region required for polycistronic pre-mRNA processing in Caenorhabditis elegans. Mol Cell Biol 2001; 21:1111-20. [PMID: 11158298 PMCID: PMC99565 DOI: 10.1128/mcb.21.4.1111-1120.2001] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.1] [Indexed: 11/20/2022] Open
Abstract
In Caenorhabditis elegans, polycistronic pre-mRNAs are processed by cleavage and polyadenylation at the 3' ends of the upstream genes and trans splicing, generally to the specialized spliced leader SL2, at the 5' ends of the downstream genes. Previous studies have indicated a relationship between these two events in the processing of a heat shock-induced gpd-2-gpd-3 polycistronic pre-mRNA. Here, we report mutational analysis of the intercistronic region of this operon by linker scan analysis. Surprisingly, no sequences downstream of the 3' end were important for 3'-end formation. In contrast, a U-rich (Ur) element located 29 bp downstream of the site of 3'-end formation was shown to be important for downstream mRNA biosynthesis. This approximately 20-bp element is sufficient for SL2 trans splicing and mRNA accumulation when transplanted to a heterologous context. Furthermore, when the downstream gene was replaced by a gene from another organism, no loss of trans-splicing specificity was observed, suggesting that the Ur element may be the primary signal required for downstream mRNA processing.
Affiliation(s)
- T Huang
- Department of Biochemistry and Molecular Genetics, University of Colorado Health Sciences Center, Denver, Colorado 80262, USA
19
Evans D, Blumenthal T. trans splicing of polycistronic Caenorhabditis elegans pre-mRNAs: analysis of the SL2 RNA. Mol Cell Biol 2000; 20:6659-67. [PMID: 10958663 PMCID: PMC86170 DOI: 10.1128/mcb.20.18.6659-6667.2000] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.0] [Indexed: 02/07/2023] Open
Abstract
Genes in Caenorhabditis elegans operons are transcribed as polycistronic pre-mRNAs in which downstream gene products are trans spliced to a specialized spliced leader, SL2. SL2 is donated by a 110-nucleotide RNA, SL2 RNA, present in the cell as an Sm-bound snRNP. SL2 RNA can be conceptually folded into a phylogenetically conserved three-stem-loop secondary structure. Here we report an in vivo mutational analysis of the SL2 RNA. Some sequences can be changed without consequence, while other changes result in a substantial loss of trans splicing. Interestingly, the spliced leader itself can be dramatically altered, such that the first stem-loop cannot form, with only a relatively small loss in trans-splicing efficiency. However, the primary sequence of stem II is crucial for SL2 trans splicing. Similarly, the conserved primary sequence of the third stem-loop plays a key role in trans splicing. While mutations in stem-loop III allow snRNP formation, a single nucleotide substitution in the loop prevents trans splicing. In contrast, the analogous region of SL1 RNA is not highly conserved, and its mutation does not abrogate function. Thus, stem-loop III appears to confer a specific function to SL2 RNA. Finally, an upstream sequence, previously predicted to be a proximal sequence element, is shown to be required for SL2 RNA expression.
Affiliation(s)
- D Evans
- Department of Biochemistry and Molecular Genetics, University of Colorado Health Sciences Center, Denver 80262, USA
20
MacMorris MA, Zorio DA, Blumenthal T. An exon that prevents transport of a mature mRNA. Proc Natl Acad Sci U S A 1999; 96:3813-8. [PMID: 10097120 PMCID: PMC22377 DOI: 10.1073/pnas.96.7.3813] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.0] [Indexed: 11/18/2022] Open
Abstract
In Caenorhabditis elegans, pre-mRNA for the essential splicing factor U2AF65 sometimes is spliced to produce an RNA that includes an extra 216-bp internal exon, exon 3. Inclusion of exon 3 inserts an in-frame stop codon, yet this RNA is not subject to SMG-mediated RNA surveillance. To test whether exon 3 causes RNA to remain nuclear and thereby escape decay, we inserted it into the 3' untranslated region of a gfp reporter gene. Although exon 3 did not affect accumulation or processing of the mRNA, it dramatically suppressed expression of green fluorescent protein (GFP). We showed by in situ hybridization that exon 3-containing gfp RNA is retained in the nucleus. Intriguingly, exon 3 contains 10 matches to the 8-bp 3' splice-site consensus. We hypothesized that U2AF might recognize this octamer and thereby prevent export. This idea is supported by RNA interference experiments in which reduced levels of U2AF resulted in a small burst of gfp expression.
Affiliation(s)
- M A MacMorris
- Department of Biochemistry and Molecular Genetics, University of Colorado Health Sciences Center, Denver, CO 80262, USA
21
Ferguson KC, Rothman JH. Alterations in the conserved SL1 trans-spliced leader of Caenorhabditis elegans demonstrate flexibility in length and sequence requirements in vivo. Mol Cell Biol 1999; 19:1892-900. [PMID: 10022876 PMCID: PMC83982 DOI: 10.1128/mcb.19.3.1892] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.7] [Indexed: 11/20/2022] Open
Abstract
Approximately 70% of mRNAs in Caenorhabditis elegans are trans spliced to conserved 21- to 23-nucleotide leader RNAs. While the function of SL1, the major C. elegans trans-spliced leader, is unknown, SL1 RNA, which contains this leader, is essential for embryogenesis. Efforts to characterize in vivo requirements of the SL1 leader sequence have been severely constrained by the essential role of the corresponding DNA sequences in SL1 RNA transcription. We devised a heterologous expression system that circumvents this problem, making it possible to probe the length and sequence requirements of the SL1 leader without interfering with its transcription. We report that expression of SL1 from a U2 snRNA promoter rescues mutants lacking the SL1-encoding genes and that the essential embryonic function of SL1 is retained when approximately one-third of the leader sequence and/or the length of the leader is significantly altered. In contrast, although all mutant SL1 RNAs were well expressed, more severe alterations eliminate this essential embryonic function. The one non-rescuing mutant leader tested was never detected on messages, demonstrating that part of the leader sequence is essential for trans splicing in vivo. Thus, in spite of the high degree of SL1 sequence conservation, its length, primary sequence, and composition are not critical parameters of its essential embryonic function. However, particular nucleotides in the leader are essential for the in vivo function of the SL1 RNA, perhaps for its assembly into a functional snRNP or for the trans-splicing reaction.
Affiliation(s)
- K C Ferguson
- Department of Molecular, Cellular, and Developmental Biology and Neuroscience Research Institute, University of California, Santa Barbara, California 93106, USA
22
Williams C, Xu L, Blumenthal T. SL1 trans splicing and 3'-end formation in a novel class of Caenorhabditis elegans operon. Mol Cell Biol 1999; 19:376-83. [PMID: 9858561 PMCID: PMC83895 DOI: 10.1128/mcb.19.1.376] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.4] [Received: 07/06/1998] [Accepted: 09/16/1998] [Indexed: 11/20/2022] Open
Abstract
Many Caenorhabditis elegans genes exist in operons in which polycistronic precursors are processed by cleavage at the 3' ends of upstream genes and trans splicing 100 to 400 nucleotides away, at the 5' ends of downstream genes, to generate monocistronic messages. Of the two spliced leaders, SL1 is trans spliced to the 5' ends of upstream genes, whereas SL2 is reserved for downstream genes in operons. However, there are isolated examples of what appears to be a different sort of operon, in which trans splicing is exclusively to SL1 and there is no intercistronic region; the polyadenylation signal is only a few base pairs upstream of the trans-splice site. We have analyzed the processing of an operon of this type by inserting the central part of mes-6/cks-1 into an SL2-type operon. In this novel context, cks-1 is trans spliced only to SL1, and mes-6 3'-end formation occurs normally, demonstrating that this unique mode of processing is indeed intrinsic to this kind of operon, which we herein designate "SL1-type." An exceptionally long polypyrimidine tract found in the 3' untranslated regions of the three known SL1-type operons is shown to be required for the accumulation of both upstream and downstream mRNAs. Mutations of the trans-splice and poly(A) signals indicate that the two processes are independent and in competition, presumably due to their close proximity, raising the possibility that production of upstream and downstream mRNAs is mutually exclusive.
Affiliation(s)
- C Williams
- Department of Biochemistry and Molecular Genetics, University of Colorado Health Sciences Center, Denver, Colorado 80262, USA
23
Caudevilla C, Serra D, Miliar A, Codony C, Asins G, Bach M, Hegardt FG. Natural trans-splicing in carnitine octanoyltransferase pre-mRNAs in rat liver. Proc Natl Acad Sci U S A 1998; 95:12185-90. [PMID: 9770461 PMCID: PMC22806 DOI: 10.1073/pnas.95.21.12185] [Citation(s) in RCA: 115] [Impact Index Per Article: 4.4] [Indexed: 11/18/2022] Open
Abstract
Carnitine octanoyltransferase (COT) transports medium-chain fatty acids through the peroxisome. During isolation of a COT clone from a rat liver library, a cDNA in which exon 2 was repeated, was characterized. Reverse transcription-PCR amplifications of total RNAs from rat liver showed a three-band pattern. Sequencing of the fragments revealed that, in addition to the canonical exon organization, previously reported [Choi, S. J. et al. (1995) Biochim. Biophys. Acta 1264, 215-222], there were two other forms in which exon 2 or exons 2 and 3 were repeated. The possibility of this exonic repetition in the COT gene was ruled out by genomic Southern blot. To study the gene expression, we analyzed RNA transcripts by Northern blot after RNase H digestion of total RNA. Three different transcripts were observed. Splicing experiments also were carried out in vitro with different constructs that contain exon 2 plus the 5' or the 3' adjacent intron sequences. Our results indicate that accurate joining of two exons 2 occurs by a trans-splicing mechanism, confirming the potential of these structures for this process in nature. The trans-splicing can be explained by the presence of three exon-enhancer sequences in exon 2. Analysis by Western blot of the COT proteins by using specific antibodies showed that two proteins corresponding to the expected Mr are present in rat peroxisomes. This is the first time that a natural trans-splicing reaction has been demonstrated in mammalian cells.
Affiliation(s)
- C Caudevilla
- Department of Biochemistry, School of Pharmacy, University of Barcelona, 08028 Barcelona, Spain
24
Evans D, Zorio D, MacMorris M, Winter CE, Lea K, Blumenthal T. Operons and SL2 trans-splicing exist in nematodes outside the genus Caenorhabditis. Proc Natl Acad Sci U S A 1997; 94:9751-6. [PMID: 9275196 PMCID: PMC23262 DOI: 10.1073/pnas.94.18.9751] [Citation(s) in RCA: 71] [Impact Index Per Article: 2.6] [Received: 02/26/1997] [Accepted: 06/03/1997] [Indexed: 02/05/2023] Open
Abstract
The genomes of most eukaryotes are composed of genes arranged on the chromosomes without regard to function, with each gene transcribed from a promoter at its 5' end. However, the genome of the free-living nematode Caenorhabditis elegans contains numerous polycistronic clusters similar to bacterial operons in which the genes are transcribed sequentially from a single promoter at the 5' end of the cluster. The resulting polycistronic pre-mRNAs are processed into monocistronic mRNAs by conventional 3' end formation, cleavage, and polyadenylation, accompanied by trans-splicing with a specialized spliced leader (SL), SL2. To determine whether this mode of gene organization and expression, apparently unique among the animals, occurs in other species, we have investigated genes in a distantly related free-living rhabditid nematode in the genus Dolichorhabditis (strain CEW1). We have identified both SL1 and SL2 RNAs in this species. In addition, we have sequenced a Dolichorhabditis genomic region containing a gene cluster with all of the characteristics of the C. elegans operons. We show that the downstream gene is trans-spliced to SL2. We also present evidence that suggests that these two genes are also clustered in the C. elegans and Caenorhabditis briggsae genomes. Thus, it appears that the arrangement of genes in operons pre-dates the divergence of the genus Caenorhabditis from the other genera in the family Rhabditidae, and may be more widespread than is currently appreciated.
Affiliation(s)
- D Evans
- Department of Biology, Indiana University, Bloomington, IN 47405, USA
25
Zorio DA, Lea K, Blumenthal T. Cloning of Caenorhabditis U2AF65: an alternatively spliced RNA containing a novel exon. Mol Cell Biol 1997; 17:946-53. [PMID: 9001248 PMCID: PMC231820 DOI: 10.1128/mcb.17.2.946] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.3] [Indexed: 02/03/2023] Open
Abstract
The U2 small nuclear ribonucleoprotein particle (snRNP) auxiliary factor, U2AF, is an essential splicing factor required for recognition of the polypyrimidine tract and subsequent U2 snRNP assembly at the branch point. Because Caenorhabditis elegans introns lack both polypyrimidine tract and branch point consensus sequences but have a very highly conserved UUUUCAG/R consensus at their 3' splice sites, we hypothesized that U2AF might serve to recognize this sequence and thus promote intron recognition in C. elegans. Here we report the cloning of the gene for the large subunit of U2AF, uaf-1. Three classes of cDNA were identified. In the most abundant class the open reading frame is similar to that for the U2AF65 from mammals and flies. The remaining two classes result from an alternative splicing event in which an exon containing an in-frame stop codon is inserted near the beginning of the second RNA recognition motif. However, this alternative mRNA is apparently not translated. Interestingly, the inserted exon contains 10 matches to the 3' splice site consensus. To determine whether this feature is conserved, we sequenced uaf-1 from the related nematode Caenorhabditis briggsae. It is composed of six exons, including an alternatively spliced third exon interrupting the gene at the same location as in C. elegans. uaf-1 is contained in an operon with the rab-18 gene in both species. Although the alternative exons from the two species are not highly conserved and would not encode related polypeptides, the C. briggsae alternative exon has 18 matches to the 3' splice site consensus. We hypothesize that the array of 3' splice site-like sequences in the pre-mRNA and alternatively spliced exon may have a regulatory role. The alternatively spliced RNA accumulates at high levels following starvation, suggesting that this RNA may represent an adaption for reducing U2AF65 levels when pre-mRNA levels are low.
Affiliation(s)
- D A Zorio
- Department of Biology, Indiana University, Bloomington 47405, USA
26
Niemann G, von Besser H, Walter RD. Panagrellus redivivus ornithine decarboxylase: structure of the gene, expression in Escherichia coli and characterization of the recombinant protein. Biochem J 1996; 317 (Pt 1):135-40. [PMID: 8694755 PMCID: PMC1217454 DOI: 10.1042/bj3170135] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 02/01/2023]
Abstract
A Southern blot analysis of the Panagrellus redivivus ornithine decarboxylase (ODC) gene suggests that it is a single-copy gene that resides on a genomic 3.2 kb EcoRI fragment. Phage clones possessing ODC gene sequences were isolated from a genomic EMBL-4 library and purified. The phage DNA inserts were analysed and a 3.2 kb EcoRI fragment containing the entire ODC gene was isolated. The nucleotide sequence analysis of this fragment reveals that the gene is interrupted by two introns of 47 and 49 bp. In the 5' non-translated region of the gene, putative AP1, VPE2 and c-Myc binding sites were identified. The ODC cDNA was expressed in a bacterial system as a His-fusion protein and the enzyme was purified by Ni(2+)-chelating affinity chromatography. The subunit molecular mass, as deduced from the cDNA and shown by SDS/PAGE, is 47.1 kDa. On the basis of gel filtration analyses it is shown that the active enzyme is a dimer. The specific enzyme activity was determined to be 4.2 µmol CO2/min/mg protein. The enzyme is dependent on pyridoxal 5'-phosphate as a cofactor, and the presence of dithioerythritol or other thiol-reducing agents is essential for maximal activity. The Km value for L-ornithine was determined as 44 µM. The Ki values for putrescine, α-difluoromethylornithine, α-hydrazino-ornithine and α-methylornithine were calculated as 51, 34, 0.34 and 42 µM respectively.
Affiliation(s)
- G Niemann
- Bernhard Nocht Institute for Tropical Medicine, Department of Biochemistry, Hamburg, Federal Republic of Germany
27
Ferguson KC, Heid PJ, Rothman JH. The SL1 trans-spliced leader RNA performs an essential embryonic function in Caenorhabditis elegans that can also be supplied by SL2 RNA. Genes Dev 1996; 10:1543-56. [PMID: 8666237 DOI: 10.1101/gad.10.12.1543] [Citation(s) in RCA: 44] [Impact Index Per Article: 1.6] [Indexed: 02/01/2023]
Abstract
Covalent joining of leader RNA exons to pre-mRNAs by trans-splicing has been observed in protists and invertebrates, and can occur in cultured mammalian cells. In the nematode Caenorhabditis elegans, approximately 60% of mRNA species are trans-spliced to the 22-nucleotide SL1 leader, and another approximately 10% of mRNAs receive the 22-nucleotide SL2 leader. We have isolated deletions that remove the rrs-1 cluster, a gene complex that contains approximately 110 tandem copies of a repeat encoding both SL1 RNA and 5S rRNA. An SL1-encoding gene alone rescues the embryonic lethality caused by these deletions. Mutations within the Sm-binding site of SL1 RNA, which is required for trans-splicing, eliminate rescue, suggesting that the ability of the SL1 leader to be trans-spliced is required for its essential activity. We observe pleiotropic defects in embryos lacking SL1 RNA, suggesting that multiple mRNAs may be affected by the absence of an SL1 leader. We found, however, that SL1-receiving messages are expressed without an SL1 leader. Surprisingly, when overexpressed, SL2 RNA, which performs a distinct function from that of SL1 RNA in wild-type animals, can rescue the lethality of embryos lacking SL1 RNA. Moreover, in these mutant embryos, we detect SL2 instead of SL1 leaders on normally SL1-trans-spliced messages; this result suggests that the mechanism that discriminates between SL1 and SL2-trans-splicing may involve competition between SL1 and SL2-specific trans-splicing. Our findings demonstrate that SL1 RNA is essential for embryogenesis in C. elegans and that SL2 RNA can substitute for SL1 RNA in vivo.
Affiliation(s)
- K C Ferguson
- Department of Biochemistry, University of Wisconsin, Madison 53706, USA
28
Eul J, Graessmann M, Graessmann A. Trans-splicing and alternative-tandem-cis-splicing: two ways by which mammalian cells generate a truncated SV40 T-antigen. Nucleic Acids Res 1996; 24:1653-61. [PMID: 8649982 PMCID: PMC145833 DOI: 10.1093/nar/24.9.1653] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.6] [Indexed: 02/01/2023] Open
Abstract
The early SV40 BstXI-BamHI (Bst/Bam) DNA fragment encodes exclusively the second exon of the large T-antigen and contains the intact small t-antigen intron. Rat cells transformed by p14T, a construct that carries the Bst/Bam DNA fragment as a tail-to-head tandem duplication, synthesize a truncated T-antigen (T1-antigen) without having a direct equivalent at the DNA level. Formation of the T1-mRNA occurs by means of two distinct mechanisms: alternative-tandem-cis-splicing and trans-splicing. To generate the T1-mRNA, the cells utilize a cryptic 5' splice site, located within the second exon of the large T-antigen, and the regular small t-antigen 3' splice site. Since these splice sites are in an inverted order, two Bst/Bam transcripts are required to generate one T1-mRNA molecule. For alternative-tandem-cis-splicing, the cells utilize a 4.4 kb pre-mRNA that contains the sequence of the entire Bst/Bam tandem repeat. The proximal Bst/Bam segment provides the 5' donor splice site and the distal segment the 3' acceptor site. This requires that the pre-mRNA not be cleaved after RNA polymerase II has passed the polyadenylation signal of the proximal Bst/Bam DNA segment. Synthesis of the 4.4 kb pre-mRNA was demonstrable by RT-PCR but not by Northern blot analysis. For trans-splicing, the cells utilize two separate pre-mRNA molecules. One transcript provides the cryptic 5' splice donor site and the other the 3' splice acceptor site. To demonstrate this, a three-base-pair deletion was introduced into the proximal Bst/Bam segment of the p14T DNA (p14Tdelta-3) as a marker, destroying the recognition site for the PflMI restriction enzyme. This deletion allowed differentiation between the proximal and distal Bst/Bam segments. RT-PCR analysis and DNA sequencing confirmed that the p14Tdelta-3 transformed cells generate the T1-mRNA by intra- and inter-molecular RNA splicing.
Affiliation(s)
- J Eul
- Institut für Molekularbiologie und Biochemie, Freie Universität, Berlin, Germany
29
Yochem J, Greenwald I. A gene for a low density lipoprotein receptor-related protein in the nematode Caenorhabditis elegans. Proc Natl Acad Sci U S A 1993; 90:4572-6. [PMID: 8506301 PMCID: PMC46554 DOI: 10.1073/pnas.90.10.4572] [Citation(s) in RCA: 96] [Impact Index Per Article: 3.1] [Indexed: 01/31/2023] Open
Abstract
A >23-kb gene that encodes a large integral membrane protein with a predicted structure similar to that of the low density lipoprotein (LDL) receptor-related protein (LRP) of mammals has been isolated and sequenced from the free-living nematode Caenorhabditis elegans. The 4753-amino acid predicted C. elegans product shares a nearly identical number and arrangement of amino acid sequence motifs with human LRP, and several exons of the C. elegans LRP gene correspond to exons of related parts of the human LDL receptor gene. The existence of an apparent homolog of LRP in C. elegans offers the possibility of genetic analysis of the in vivo roles of LRP and of the relationship between protein structure and function in a simple model organism.
Affiliation(s)
- J Yochem
- Department of Molecular Biology, Princeton University, NJ 08544
30
Conrad R, Liou RF, Blumenthal T. Functional analysis of a C. elegans trans-splice acceptor. Nucleic Acids Res 1993; 21:913-9. [PMID: 8451190 PMCID: PMC309224 DOI: 10.1093/nar/21.4.913] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.0] [Indexed: 01/30/2023] Open
Abstract
The rol-6 gene is trans-spliced to the 22 nt leader, SL1, 173 nt downstream of the transcription start. We have analyzed splicing in transformants carrying extrachromosomal arrays of rol-6 with mutations in the trans-splice acceptor site. This site is a close match to the consensus, UUUCAG, that is highly conserved in both trans-splice and intron acceptor sites in C. elegans. When the trans-splice site was inactivated by mutating the perfectly-conserved AG, trans-splicing still occurred, but at a cryptic site 20 nt upstream. We tested the frequency with which splicing switched from the normal site to the cryptic site when the pyrimidines at this site were changed to A's. Since most C. elegans 3' splice sites lack an obvious polypyrimidine tract, we hypothesized that these four pyrimidines might play this role, and indeed mutation of these bases caused splicing to switch to the cryptic site. We also demonstrated that a major reason the downstream site is normally favored is because it occurs at a boundary between A+U rich and non-A+U rich RNA. When the RNA between the two splice sites was made less A+U rich, splicing occurred preferentially at the upstream site.
Affiliation(s)
- R Conrad
- Department of Biology, Indiana University, Bloomington 47405
31
Li W, Herman RK, Shaw JE. Analysis of the Caenorhabditis elegans axonal guidance and outgrowth gene unc-33. Genetics 1992; 132:675-89. [PMID: 1468626 PMCID: PMC1205206 DOI: 10.1093/genetics/132.3.675] [Citation(s) in RCA: 112] [Impact Index Per Article: 3.5] [Indexed: 12/27/2022] Open
Abstract
Mutations in the unc-33 gene of the nematode Caenorhabditis elegans lead to severely uncoordinated movement, abnormalities in the guidance and outgrowth of the axons of many neurons, and a superabundance of microtubules in neuronal processes. We have cloned unc-33 by tagging the gene with the transposable element Tc4. Three unc-33 messages, which are transcribed from a genomic region of at least 10 kb, were identified and characterized. The three messages have common 3' ends and identical reading frames. The largest (3.8-kb) message consists of the 22-nucleotide trans-spliced leader SL1 and 10 exons (I-X); the intermediate-size (3.3-kb) message begins with SL1 spliced to the 5' end of exon V and includes exons V-X; and the smallest (2.8-kb) message begins within exon VII and also includes exons VIII-X. A gamma-ray-induced deletion mutation situated within exon VIII reduces the sizes of all three messages by 0.5 kb. The three putative polypeptides encoded by the three messages overlap in C-terminal sequence but differ by the positions at which their N termini begin; none has significant similarity to any other known protein. A Tc4 insertion in exon VII leads to alterations in splicing that result in three approximately wild-type-size messages: the Tc4 sequence and 28 additional nucleotides are spliced out of the two larger messages; the Tc4 sequence is trans-spliced off the smallest message such that SL1 is added 13 nucleotides upstream of the normal 5' end of the smallest message.
Affiliation(s)
- W Li
- Department of Genetics and Cell Biology, University of Minnesota, St. Paul 55108
32
elt-1, an embryonically expressed Caenorhabditis elegans gene homologous to the GATA transcription factor family. Mol Cell Biol 1991. [PMID: 1875944 DOI: 10.1128/mcb.11.9.4651] [Citation(s) in RCA: 48] [Impact Index Per Article: 1.5] [Indexed: 11/20/2022] Open
Abstract
The short, asymmetrical DNA sequence to which the vertebrate GATA family of transcription factors binds is present in some Caenorhabditis elegans gene regulatory regions: it is required for activation of the vitellogenin genes and is also found just 5' of the TATA boxes of tra-2 and the msp genes. In vertebrates GATA-1 is specific to erythroid lineages, whereas GATA-2 and GATA-3 are present in multiple tissues. In an effort to identify the trans-acting factors that may recognize this sequence element in C. elegans, we used a degenerate oligonucleotide to clone a C. elegans homolog to this gene. We call this gene elt-1 (erythrocyte-like transcription factor). It is single copy and specifies a 1.75-kb mRNA that is present predominantly, if not exclusively, in embryos. The region of elt-1 encoding two zinc fingers is remarkably similar to the DNA-binding domain of the vertebrate GATA-binding proteins. However, outside of the DNA-binding domains the amino acid sequences are quite divergent. Nevertheless, introns are located at identical or nearly identical positions in elt-1 and the mouse GATA-1 gene. In addition, elt-1 mRNA is trans-spliced to the 22-base untranslated leader, SL1. The DNA upstream of the elt-1 TATA box contains eight copies of the GATA recognition sequence within the first 300 bp, suggesting that elt-1 may be autogenously regulated. Our results suggest that the specialized role of GATA-1 in erythroid gene expression was derived after separation of the nematodes and the line that led to the vertebrates, since C. elegans lacks an erythroid lineage.
33
Spieth J, Shim YH, Lea K, Conrad R, Blumenthal T. elt-1, an embryonically expressed Caenorhabditis elegans gene homologous to the GATA transcription factor family. Mol Cell Biol 1991; 11:4651-9. [PMID: 1875944 PMCID: PMC361353 DOI: 10.1128/mcb.11.9.4651-4659.1991] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.2] [Indexed: 12/29/2022] Open
Affiliation(s)
- J Spieth
- Program in Molecular, Cellular and Developmental Biology, Indiana University, Bloomington 47405