1. Fine gene expression regulation by minor sequence variations downstream of the polyadenylation signal. Mol Biol Rep 2021; 48:1539-1547. [PMID: 33517473 DOI: 10.1007/s11033-021-06160-z] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 09/01/2020] [Accepted: 01/12/2021] [Indexed: 12/22/2022]
Abstract
The termination of transcription is a complex process that substantially contributes to gene regulation in eukaryotes. Previously, it was noted that a single cytosine deletion at position +32 bp relative to the single polyadenylation signal AAUAAA (hereafter the dC mutation) causes a 2-fold increase in the transcription level of the upstream eGFP reporter in mouse embryonic stem cells. Here, we analyzed the conservation of this phenomenon in immortalized mouse, human and Drosophila cell lines and the influence of the dC mutation on the choice of pre-mRNA cleavage sites. We constructed dual-reporter plasmids to accurately measure the effect of the dC mutation and other nearby mutations on eGFP mRNA level by RT-qPCR. In this way, we found that the dC mutation leads to a 2-fold increase in the expression level of the upstream eGFP reporter gene in cultured mouse and human cells, but not in Drosophila cells. In addition, 3' RACE analysis demonstrated that eGFP pre-mRNAs are cut at multiple positions between +14 and +31, and that the most proximal cleavage site becomes almost exclusively utilized in the presence of the dC mutation. We also identified new short sequence variations located within positions +25..+40 and +33..+48 that increase eGFP expression by ~2- to 4-fold. Altogether, the positive effect of the dC mutation seems to be conserved in mouse embryonic stem cells, mouse embryonic 3T3 fibroblasts and human HEK293T cells. In the latter cells, the dC mutation appears to be involved in regulating pre-mRNA cleavage site selection. Finally, a multiplexed approach is proposed to identify motifs located downstream of cleavage site(s) that are essential for transcription termination.
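The roughly 2-fold change reported in this abstract is the kind of figure a relative-quantification calculation produces from RT-qPCR data. A minimal sketch follows, assuming the widely used 2^-ΔΔCt approach with the second reporter of a dual-reporter plasmid as the reference. The abstract does not state which calculation the authors used, and the function name and Ct values below are hypothetical, chosen only for illustration.

```python
def fold_change(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl,
                efficiency=2.0):
    """Relative expression of a target gene (test vs. control) by 2^-ddCt.

    Assumption: both amplicons amplify with the same efficiency
    (2.0 means perfect doubling per cycle).
    """
    # Normalize the target to the reference reporter within each sample.
    d_ct_test = ct_target_test - ct_ref_test
    d_ct_ctrl = ct_target_ctrl - ct_ref_ctrl
    # Compare the normalized values between test and control.
    dd_ct = d_ct_test - d_ct_ctrl
    return efficiency ** (-dd_ct)


# Hypothetical Ct values: the target crosses threshold one cycle earlier
# in the mutant construct, while the reference reporter is unchanged.
fc = fold_change(ct_target_test=21.0, ct_ref_test=20.0,
                 ct_target_ctrl=22.0, ct_ref_ctrl=20.0)
print(fc)  # 2.0
```

A one-cycle shift in the normalized Ct corresponds to a 2-fold change under the perfect-doubling assumption, which is why small Ct differences matter when effects of this size are measured.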

2. de la Fuente L, Arzalluz-Luque Á, Tardáguila M, Del Risco H, Martí C, Tarazona S, Salguero P, Scott R, Lerma A, Alastrue-Agudo A, Bonilla P, Newman JRB, Kosugi S, McIntyre LM, Moreno-Manzano V, Conesa A. tappAS: a comprehensive computational framework for the analysis of the functional impact of differential splicing. Genome Biol 2020; 21:119. [PMID: 32423416 PMCID: PMC7236505 DOI: 10.1186/s13059-020-02028-w] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 11/20/2019] [Accepted: 04/23/2020] [Indexed: 12/26/2022]
Abstract
Recent advances in long-read sequencing solve inaccuracies in alternative transcript identification of full-length transcripts in short-read RNA-Seq data, which encourages the development of methods for isoform-centered functional analysis. Here, we present tappAS, the first framework to enable a comprehensive Functional Iso-Transcriptomics (FIT) analysis, which is effective at revealing the functional impact of context-specific post-transcriptional regulation. tappAS uses isoform-resolved annotation of coding and non-coding functional domains, motifs, and sites, in combination with novel analysis methods to interrogate different aspects of the functional readout of transcript variants and isoform regulation. tappAS software and documentation are available at https://app.tappas.org.
Affiliation(s)
- Lorena de la Fuente
  - Genomics of Gene Expression Laboratory, Prince Felipe Research Center, Valencia, Spain
  - Present Address: Bioinformatics Unit, IIS Fundación Jiménez Díaz, Madrid, Spain
- Ángeles Arzalluz-Luque
  - Department of Statistics and Operational Research, Polytechnical University of Valencia, Valencia, Spain
- Manuel Tardáguila
  - Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, FL, USA
  - Present Address: Human Genetics Department, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
- Héctor Del Risco
  - Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, FL, USA
- Cristina Martí
  - Genomics of Gene Expression Laboratory, Prince Felipe Research Center, Valencia, Spain
- Sonia Tarazona
  - Department of Statistics and Operational Research, Polytechnical University of Valencia, Valencia, Spain
- Pedro Salguero
  - Genomics of Gene Expression Laboratory, Prince Felipe Research Center, Valencia, Spain
- Raymond Scott
  - Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, FL, USA
- Alberto Lerma
  - Genomics of Gene Expression Laboratory, Prince Felipe Research Center, Valencia, Spain
- Ana Alastrue-Agudo
  - Present Address: Human Genetics Department, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
- Pablo Bonilla
  - Present Address: Human Genetics Department, Wellcome Trust Sanger Institute, Hinxton, Cambridge, UK
- Jeremy R B Newman
  - Genetics Institute, University of Florida, Gainesville, FL, USA
  - Department of Pathology, University of Florida, Gainesville, FL, USA
- Shunichi Kosugi
  - Genetics Institute, University of Florida, Gainesville, FL, USA
  - Laboratory for Statistical and Translational Genetics, Center for Integrative Medical Sciences, RIKEN, Wako, Japan
- Lauren M McIntyre
  - Genetics Institute, University of Florida, Gainesville, FL, USA
  - Department of Molecular Genetics and Microbiology, University of Florida, Gainesville, FL, USA
- Ana Conesa
  - Department of Microbiology and Cell Science, Institute for Food and Agricultural Sciences, University of Florida, Gainesville, FL, USA
  - Genetics Institute, University of Florida, Gainesville, FL, USA

3. Stolyarenko AD. Nuclear Argonaute Piwi Gene Mutation Affects rRNA by Inducing rRNA Fragment Accumulation, Antisense Expression, and Defective Processing in Drosophila Ovaries. Int J Mol Sci 2020; 21:ijms21031119. [PMID: 32046213 PMCID: PMC7037970 DOI: 10.3390/ijms21031119] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/27/2019] [Revised: 01/27/2020] [Accepted: 02/04/2020] [Indexed: 12/26/2022]
Abstract
Piwi, a key nuclear protein of the Drosophila piRNA silencing pathway and a member of the Argonaute family, has been classically studied as a factor controlling transposable elements and fertility. Piwi has been shown to concentrate in the nucleolus for reasons largely unknown. Ribosomal RNA is the main component of the nucleolus. In this work the effect of a piwi mutation on rRNA is described. This work led to three important conclusions: a mutation in piwi induces antisense 5S rRNA expression, a processing defect of 2S rRNA orthologous to the 3′-end of eukaryotic 5.8S rRNA, and accumulation of fragments of all five rRNAs in Drosophila melanogaster ovaries. Hypotheses to explain these phenomena are proposed, possibly involving the interaction of the components of the piRNA pathway with the RNA surveillance machinery.
Affiliation(s)
- Anastasia D Stolyarenko
  - Institute of Molecular Genetics, Russian Academy of Sciences, 2 Kurchatov Sq., Moscow 123182, Russia

4. Genome-Wide Profiling of Polyadenylation Events in Maize Using High-Throughput Transcriptomic Sequences. G3-GENES GENOMES GENETICS 2019; 9:2749-2760. [PMID: 31239292 PMCID: PMC6686930 DOI: 10.1534/g3.119.400196] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 12/22/2022]
Abstract
Polyadenylation is an essential post-transcriptional modification of eukaryotic transcripts that plays a critical role in transcript stability, localization, transport, and translational efficiency. About 70% of genes in plants contain alternative polyadenylation (APA) sites. Despite the availability of a vast amount of sequencing data, a comprehensive map of the polyadenylation events in maize is not yet available. Here, 9.48 billion RNA-Seq reads were analyzed to characterize 95,345 Poly(A) Clusters (PAC) in 23,705 (51%) maize genes. Of these, 76% were APA genes. However, most APA genes (55%) expressed a dominant PAC rather than favoring multiple PACs equally. The lincRNA genes with PACs were significantly longer than the genes without any PAC, and about 48% of the genes had APA sites. Heterogeneity was observed in 52% of the PACs, supporting the imprecise nature of the polyadenylation process. Genomic distribution revealed that the majority of the PACs (78%) were located in the genic regions. Unlike in previous studies, a large number of PACs were observed in the intergenic (n = 21,264), 5′-UTR (735), CDS (2,542), and intronic regions (12,841). The CDS and introns with PACs were longer than those without PACs, whereas intergenic PACs were more often associated with transcripts that lacked annotated 3′-UTRs. Nucleotide composition around PACs demonstrated AT-richness, and the common upstream motif was AAUAAA, which is consistent with other plants. According to this study, only 2,830 genes still maintained use of the AAUAAA motif. These large-scale data provide useful insights into gene expression regulation and could be utilized as evidence to validate the annotation of transcript ends.

5. The Transcriptional Landscape of Marek's Disease Virus in Primary Chicken B Cells Reveals Novel Splice Variants and Genes. Viruses 2019; 11:v11030264. [PMID: 30884829 PMCID: PMC6466439 DOI: 10.3390/v11030264] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 02/18/2019] [Revised: 03/12/2019] [Accepted: 03/13/2019] [Indexed: 12/14/2022]
Abstract
Marek's disease virus (MDV) is an oncogenic alphaherpesvirus that infects chickens and poses a serious threat to poultry health. In infected animals, MDV efficiently replicates in B cells in various lymphoid organs. Despite many years of research, the viral transcriptome in primary target cells of MDV remained unknown. In this study, we uncovered the transcriptional landscape of the very virulent RB1B strain and the attenuated CVI988/Rispens vaccine strain in primary chicken B cells using high-throughput RNA-sequencing. Our data confirmed the expression of known genes, but also identified a novel spliced MDV gene in the unique short region of the genome. Furthermore, de novo transcriptome assembly revealed extensive splicing of viral genes resulting in coding and non-coding RNA transcripts. A novel splicing isoform of MDV UL15 could also be confirmed by mass spectrometry and RT-PCR. In addition, we could demonstrate that the associated transcriptional motifs are highly conserved and closely resembled those of the host transcriptional machinery. Taken together, our data allow a comprehensive re-annotation of the MDV genome with novel genes and splice variants that could be targeted in further research on MDV replication and tumorigenesis.

6. Majerciak V, Yang W, Zheng J, Zhu J, Zheng ZM. A Genome-Wide Epstein-Barr Virus Polyadenylation Map and Its Antisense RNA to EBNA. J Virol 2019; 93:e01593-18. [PMID: 30355690 PMCID: PMC6321932 DOI: 10.1128/jvi.01593-18] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 09/11/2018] [Accepted: 10/17/2018] [Indexed: 12/14/2022]
Abstract
Epstein-Barr virus (EBV) is a ubiquitous human pathogen associated with Burkitt's lymphoma and nasopharyngeal carcinoma. Although the EBV genome harbors more than a hundred genes, a full transcription map with EBV polyadenylation profiles remains unknown. To elucidate the 3' ends of all EBV transcripts genome-wide, we performed the first comprehensive analysis of viral polyadenylation sites (pA sites) using our previously reported polyadenylation sequencing (PA-seq) technology. We identified that EBV utilizes a total of 62 pA sites in JSC-1, 60 in Raji, and 53 in Akata cells for the expression of EBV genes from both plus and minus DNA strands; 42 of these pA sites are commonly used in all three cell lines. The majority of identified pA sites were mapped to the intergenic regions downstream of previously annotated EBV open reading frames (ORFs) and viral promoters. pA sites lacking an association with any known EBV genes were also identified, mostly for the minus DNA strand within the EBNA locus, a major locus responsible for maintenance of viral latency and cell transformation. The expression of these novel antisense transcripts to EBNA was verified by 3' rapid amplification of cDNA ends (RACE) and Northern blot analyses in several EBV-positive (EBV+) cell lines. In contrast to EBNA RNA expressed during latency, expression of EBNA-antisense transcripts, which is restricted in latent cells, can be significantly induced by viral lytic infection, suggesting potential regulation of viral gene expression by EBNA-antisense transcription during lytic EBV infection. Our data provide the first evidence that EBV has an unrecognized mechanism that regulates EBV reactivation from latency.
IMPORTANCE Epstein-Barr virus represents an important human pathogen with an etiological role in the development of several cancers. By elucidation of a genome-wide polyadenylation landscape of EBV in JSC-1, Raji, and Akata cells, we have redefined the EBV transcriptome and mapped individual polymerase II (Pol II) transcripts of viral genes to each one of the mapped pA sites at single-nucleotide resolution as well as the depth of expression. By unveiling a new class of viral lytic RNA transcripts antisense to latent EBNAs, we provide a novel mechanism of how EBV might control the expression of viral latent genes and lytic infection. Thus, this report takes another step closer to understanding EBV gene structure and expression and paves a new path for antiviral approaches.
Affiliation(s)
- Vladimir Majerciak
  - Tumor Virus RNA Biology Section, RNA Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Frederick, Maryland, USA
- Wenjing Yang
  - Systems Biology Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, USA
- Jing Zheng
  - Systems Biology Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, USA
- Jun Zhu
  - Systems Biology Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, USA
- Zhi-Ming Zheng
  - Tumor Virus RNA Biology Section, RNA Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Frederick, Maryland, USA

7. Freitas N, Lukash T, Gunewardena S, Chappell B, Slagle BL, Gudima SO. Relative Abundance of Integrant-Derived Viral RNAs in Infected Tissues Harvested from Chronic Hepatitis B Virus Carriers. J Virol 2018; 92:e02221-17. [PMID: 29491161 PMCID: PMC5923063 DOI: 10.1128/jvi.02221-17] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Received: 12/20/2017] [Accepted: 02/17/2018] [Indexed: 02/07/2023]
Abstract
Five matching sets of nonmalignant liver tissues and hepatocellular carcinoma (HCC) samples from individuals chronically infected with hepatitis B virus (HBV) were examined. The HBV genomic sequences were determined by using overlapping PCR amplicons covering the entire viral genome. Four pairs of tissues were infected with HBV genotype C, while one pair was infected with HBV genotype B. HBV replication markers were found in all tissues. In the majority of HCC samples, the levels of pregenomic/precore RNA (pgRNA) and covalently closed circular DNA (cccDNA) were lower than those in liver tissue counterparts. Regardless of the presence of HBV replication markers, (i) integrant-derived HBV RNAs (id-RNAs) were found in all tissues by reverse transcription-PCR (RT-PCR) analysis and were considerably abundant or predominant in 6/10 tissue samples (2 liver and 4 HCC samples), (ii) RNAs that were polyadenylated using the cryptic HBV polyadenylation signal and therefore could be produced by HBV replication or derived from integrated HBV DNA were found in 5/10 samples (3 liver and 2 HCC samples) and were considerably abundant species in 3/10 tissues (2 livers and 1 HCC), and (iii) cccDNA-transcribed RNAs polyadenylated near position 1931 were not abundant in 7/10 tissues (2 liver and 5 HCC samples) and were predominant in only two liver samples. Subsequent RNA sequencing analysis of selected liver/HCC samples also showed relative abundance of id-RNAs in most of the examined tissues. Our findings suggesting that id-RNAs could represent a significant source of HBV envelope proteins, which is independent of viral replication, are discussed in the context of the possible contribution of id-RNAs to the HBV life cycle.
IMPORTANCE The relative abundance of integrant-derived HBV RNAs (id-RNAs) in chronically infected tissues suggests that id-RNAs coding for the envelope proteins may facilitate the production of a considerable fraction of surface antigens (HBsAg) in infected cells bearing HBV integrants. If the same cells support HBV replication, then a significant fraction of assembled HBV virions could bear id-RNA-derived HBsAg as a major component of their envelopes. Therefore, the infectivity of these HBV virions and their ability to facilitate virus cell-to-cell spread could be determined mainly by the properties of id-RNA-derived envelope proteins and not by the properties of replication-derived HBsAg. These interpretations suggest that id-RNAs may play a role in the maintenance of chronic HBV infection and therefore contribute to the HBV life cycle. Furthermore, the production of HBsAg from id-RNAs independently of viral replication may explain at least in part why treatment with interferon or nucleos(t)ides in most cases fails to achieve a loss of serum HBsAg.
Affiliation(s)
- Natalia Freitas
  - Department of Microbiology, Molecular Genetics and Immunology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Tetyana Lukash
  - Department of Microbiology, Molecular Genetics and Immunology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Sumedha Gunewardena
  - Department of Molecular and Integrative Physiology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Benjamin Chappell
  - Department of Microbiology, Molecular Genetics and Immunology, University of Kansas Medical Center, Kansas City, Kansas, USA
- Betty L Slagle
  - Department of Molecular Virology and Microbiology, Baylor College of Medicine, Houston, Texas, USA
- Severin O Gudima
  - Department of Microbiology, Molecular Genetics and Immunology, University of Kansas Medical Center, Kansas City, Kansas, USA

8. Targeting the Polyadenylation Signal of Pre-mRNA: A New Gene Silencing Approach for Facioscapulohumeral Dystrophy. Int J Mol Sci 2018; 19:ijms19051347. [PMID: 29751519 PMCID: PMC5983732 DOI: 10.3390/ijms19051347] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 03/14/2018] [Revised: 04/27/2018] [Accepted: 04/30/2018] [Indexed: 02/07/2023]
Abstract
Facioscapulohumeral dystrophy (FSHD) is characterized by the contraction of the D4Z4 array located in the sub-telomeric region of chromosome 4, leading to the aberrant expression of the DUX4 transcription factor and the mis-regulation of hundreds of genes. Several therapeutic strategies have been proposed, among them targeting the polyadenylation signal to silence the causative gene of the disease. Indeed, defects in mRNA polyadenylation lead to an alteration of transcription termination and a disruption of mRNA transport from the nucleus to the cytoplasm, decreasing mRNA stability and translation efficiency. This review discusses the polyadenylation mechanisms, why alternative polyadenylation impacts gene expression, and how targeting the polyadenylation signal may be a potential therapeutic approach for FSHD.

9. Rot G, Wang Z, Huppertz I, Modic M, Lenče T, Hallegger M, Haberman N, Curk T, von Mering C, Ule J. High-Resolution RNA Maps Suggest Common Principles of Splicing and Polyadenylation Regulation by TDP-43. Cell Rep 2017; 19:1056-1067. [PMID: 28467899 PMCID: PMC5437728 DOI: 10.1016/j.celrep.2017.04.028] [Citation(s) in RCA: 67] [Impact Index Per Article: 11.2] [Received: 05/16/2016] [Revised: 03/06/2017] [Accepted: 04/06/2017] [Indexed: 11/05/2022]
Abstract
Many RNA-binding proteins (RBPs) regulate both alternative exons and poly(A) site selection. To understand their regulatory principles, we developed expressRNA, a web platform encompassing computational tools for integration of iCLIP and RNA motif analyses with RNA-seq and 3′ mRNA sequencing. This reveals at nucleotide resolution the “RNA maps” describing how the RNA binding positions of RBPs relate to their regulatory functions. We use this approach to examine how TDP-43, an RBP involved in several neurodegenerative diseases, binds around its regulated poly(A) sites. Binding close to the poly(A) site generally represses, whereas binding further downstream enhances use of the site, which is similar to TDP-43 binding around regulated exons. Our RNAmotifs2 software also identifies sequence motifs that cluster together with the binding motifs of TDP-43. We conclude that TDP-43 directly regulates diverse types of pre-mRNA processing according to common position-dependent principles.
Highlights
- TDP-43 regulates competing poly(A) sites in a highly position-dependent manner
- expressRNA is a new platform for analysis of alternative polyadenylation and splicing
- RNAmotifs2 is a cluster motif analysis platform integrated with expressRNA
- Regulation of pre-mRNA processing might follow common position-dependent principles
Affiliation(s)
- Gregor Rot
  - Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, Winterthurerstrasse 190, 8057 Zurich, Switzerland; MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK
- Zhen Wang
  - MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK; Institut de Biologie de l'ENS (IBENS), 46 rue d'Ulm, Paris 75005, France
- Ina Huppertz
  - MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK; European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany
- Miha Modic
  - MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK; Institute of Stem Cell Research, Helmholtz Center Munich, Ingolstaedter Landstrasse 1, 85764 Neuherberg, Germany
- Tina Lenče
  - MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK; Institute of Molecular Biology, Ackermannweg 4, 55128 Mainz, Germany
- Martina Hallegger
  - UCL Institute of Neurology, Queen Square, London WC1N 3BG, UK; The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- Nejc Haberman
  - UCL Institute of Neurology, Queen Square, London WC1N 3BG, UK; The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK
- Tomaž Curk
  - Faculty of Computer and Information Science, University of Ljubljana, Večna pot 113, 1001 Ljubljana, Slovenia
- Christian von Mering
  - Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, Winterthurerstrasse 190, 8057 Zurich, Switzerland
- Jernej Ule
  - MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge CB2 0QH, UK; UCL Institute of Neurology, Queen Square, London WC1N 3BG, UK; The Francis Crick Institute, 1 Midland Road, London NW1 1AT, UK

10. Feng L, Yuen YL, Xu J, Liu X, Chan MYC, Wang K, Fong WP, Cheung WT, Lee SST. Identification and characterization of a novel PPARα-regulated and 7α-hydroxyl bile acid-preferring cytosolic sulfotransferase mL-STL (Sult2a8). J Lipid Res 2017; 58:1114-1131. [PMID: 28442498 DOI: 10.1194/jlr.m074302] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 12/18/2016] [Revised: 04/19/2017] [Indexed: 12/25/2022]
Abstract
PPARα is known to play a pivotal role in orchestrating lipid, glucose, and amino acid metabolism via transcriptional regulation of its target gene expression during energy deprivation. Recent evidence has also suggested that PPARα is involved in bile acid metabolism, but how PPARα modulates the homeostasis of bile acids during fasting is still not clear. In a mechanistic study aiming to dissect the spectrum of PPARα target genes involved in the metabolic response to fasting, we identified a novel mouse gene (herein named mL-STL for mouse liver-sulfotransferase-like) that shared extensive homology with the Sult2a subfamily of a superfamily of cytosolic sulfotransferases, implying its potential function in sulfonation. The mL-STL gene was expressed predominantly in the liver in the fed state, but PPARα was required to sustain its expression during fasting, suggesting a critical role of PPARα in regulating the mL-STL-mediated sulfonation during fasting. Functional studies using recombinant His-tagged mL-STL protein revealed its narrow sulfonating activities toward 7α-hydroxyl primary bile acids, including cholic acid, chenodeoxycholic acid, and α-muricholic acid, suggesting that mL-STL may be the major hepatic bile acid sulfonating enzyme in mice. Together, these studies identified a novel PPARα-dependent gene and uncovered a new role of PPARα as an essential regulator in bile acid biotransformation via sulfonation during fasting.
Affiliation(s)
- Lu Feng
  - School of Life Sciences, Faculty of Science, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR
- Yee-Lok Yuen
  - School of Life Sciences, Faculty of Science, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR
- Jian Xu
  - School of Life Sciences, Faculty of Science, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR
- Xing Liu
  - School of Life Sciences, Faculty of Science, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR
- Martin Yan-Chun Chan
  - School of Life Sciences, Faculty of Science, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR
- Kai Wang
  - School of Life Sciences, Faculty of Science, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR
- Wing-Ping Fong
  - School of Life Sciences, Faculty of Science, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR
- Wing-Tai Cheung
  - School of Biomedical Sciences, Faculty of Medicine, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR
- Susanna Sau-Tuen Lee
  - School of Life Sciences, Faculty of Science, The Chinese University of Hong Kong, Shatin, New Territories, Hong Kong SAR

11. Prediction of Poly(A) Sites by Poly(A) Read Mapping. PLoS One 2017; 12:e0170914. [PMID: 28135292 PMCID: PMC5279776 DOI: 10.1371/journal.pone.0170914] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 08/24/2016] [Accepted: 01/12/2017] [Indexed: 11/19/2022]
Abstract
RNA-seq reads containing part of the poly(A) tail of transcripts (denoted as poly(A) reads) provide the most direct evidence for the position of poly(A) sites in the genome. However, due to reduced coverage of poly(A) tails by reads, poly(A) reads are not routinely identified during RNA-seq mapping. Nevertheless, recent studies for several herpesviruses successfully employed mapping of poly(A) reads to identify herpesvirus poly(A) sites using different strategies and customized programs. To more easily allow such analyses without requiring additional programs, we integrated poly(A) read mapping and prediction of poly(A) sites into our RNA-seq mapping program ContextMap 2. The implemented approach essentially generalizes previously used poly(A) read mapping approaches and combines them with the context-based approach of ContextMap 2 to take into account information provided by other reads aligned to the same location. Poly(A) read mapping using ContextMap 2 was evaluated on real-life data from the ENCODE project and compared against a competing approach based on transcriptome assembly (KLEAT). This showed high positive predictive value for our approach, evidenced also by the presence of poly(A) signals, and considerably lower runtime than KLEAT. Although sensitivity is low for both methods, we show that this is in part due to a high extent of spurious results in the gold standard set derived from RNA-PET data. Sensitivity improves for poly(A) sites of known transcripts or determined with a more specific poly(A) sequencing protocol and increases with read coverage on transcript ends. Finally, we illustrate the usefulness of the approach in a high read coverage scenario by a re-analysis of published data for herpes simplex virus 1. Thus, with current trends towards increasing sequencing depth and read length, poly(A) read mapping will prove to be increasingly useful and can now be performed automatically during RNA-seq mapping with ContextMap 2.
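The notion of a poly(A) read described in this abstract can be illustrated with a small filter. This is a simplified sketch and not the ContextMap 2 algorithm: it only flags reads that end in a run of A's while still carrying enough non-tail sequence to be mapped. The function names and both thresholds are hypothetical, and reads sequenced from the opposite strand (which would start with a run of T's) are ignored for brevity.

```python
import re


def polya_tail_length(read, min_len=10):
    """Length of the trailing A-run if it has at least min_len bases, else 0."""
    match = re.search(r"A+$", read.upper())
    run = len(match.group(0)) if match else 0
    return run if run >= min_len else 0


def is_polya_read(read, min_len=10, min_body=15):
    """True if the read has a poly(A) tail plus at least min_body other bases.

    The non-tail part is what an aligner would place on the genome; the
    position where it ends is the candidate poly(A) site.
    """
    tail = polya_tail_length(read, min_len)
    return tail > 0 and len(read) - tail >= min_body


reads = [
    "ACGTACGTACGTACGTACGT" + "A" * 12,  # 20 nt body + 12 nt tail
    "ACGT" * 8,                         # no tail
    "A" * 30,                           # tail only, nothing left to map
]
print([is_polya_read(r) for r in reads])  # [True, False, False]
```

Requiring a minimum body length reflects the point made in the abstract that only part of the tail should be covered: a read consisting entirely of A's carries no positional information.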

12. Wang X, Zheng ZM. Construction of a Transcription Map for Papillomaviruses using RACE, RNase Protection, and Primer Extension Assays. Curr Protoc Microbiol 2016; 40:14B.6.1-14B.6.29. [PMID: 26855281 DOI: 10.1002/9780471729259.mc14b06s40] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/19/2022]
Abstract
Papillomaviruses are a family of small, non-enveloped DNA tumor viruses. Knowing a complete transcription map of each papillomavirus genome can provide guidance for various papillomavirus studies. This unit provides detailed protocols to construct a transcription map of human papillomavirus type 18. The same approach can be easily adapted to transcription map studies of any other papillomavirus genotype due to the high degree of conservation in genome structure, organization, and gene expression among papillomaviruses. The methods in focus are 5'- and 3'-rapid amplification of cDNA ends (RACE), techniques commonly used in molecular biology to obtain a full-length RNA transcript or to map a transcription start site (TSS) or an RNA polyadenylation (pA) cleavage site. Primer walking RT-PCR is a method for studying the splicing junction of RACE products. In addition, RNase protection assay and primer extension are also introduced as alternative methods in the mapping analysis.
Affiliation(s)
- Xiaohong Wang
  - Tumor Virus RNA Biology Section, Gene Regulation and Chromosome Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Frederick, Maryland
- Zhi-Ming Zheng
  - Tumor Virus RNA Biology Section, Gene Regulation and Chromosome Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Frederick, Maryland
13
The UVS9 gene of Chlamydomonas encodes an XPG homolog with a new conserved domain. DNA Repair (Amst) 2015; 37:33-42. [PMID: 26658142 DOI: 10.1016/j.dnarep.2015.11.003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 07/28/2015] [Revised: 11/06/2015] [Accepted: 11/16/2015] [Indexed: 11/20/2022]
Abstract
Nucleotide excision repair (NER) is a key pathway for removing DNA damage that destabilizes the DNA double helix. During NER a protein complex coordinates to cleave the damaged DNA strand on both sides of the damage. The resulting lesion-containing oligonucleotide is displaced from the DNA and a replacement strand is synthesized using the undamaged strand as template. Ultraviolet (UV) light is known to induce two primary forms of DNA damage, the cyclobutane pyrimidine dimer and the 6-4 photoproduct, both of which destabilize the DNA double helix. The uvs9 strain of Chlamydomonas reinhardtii was isolated based on its sensitivity to UV light and was subsequently shown to have a defect in NER. In this work, the UVS9 gene was cloned through molecular mapping and shown to encode a homolog of XPG, the structure-specific nuclease responsible for cleaving damaged DNA strands 3' to sites of damage during NER. 3' RACE revealed that the UVS9 transcript is alternatively polyadenylated. The predicted UVS9 protein is nearly twice as long as other XPG homologs, primarily due to an unusually long spacer region. Despite this difference, amino acid sequence alignment of UVS9p with XPG homologs revealed a new conserved domain involved in TFIIH interaction.
14
Abstract
The NCBI maintains the Sequence Read Archive (SRA) database to store RNA-Seq data generated by different NGS technologies. With an ever-increasing number of finished and ongoing genome and transcriptome sequencing projects, the data in SRA are expanding rapidly and offer a rich resource to mine for insight into biological questions such as mRNA 3'-end formation and alternative polyadenylation. We developed a bioinformatics pipeline that processes raw SRA sequence data and yields high-quality poly(A) sites and poly(A) cluster sites with detailed expression information. The pipeline is designed to be generic and can be used for polyadenylation studies in any eukaryotic species.
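One common step in pipelines of this kind is grouping nearby poly(A) sites into clusters, since cleavage positions are heterogeneous. The sketch below shows one simple way to do that; it is not taken from the pipeline described above. The function name and the 24-nt maximum gap are assumptions chosen for illustration.

```python
from typing import Dict, List


def cluster_polya_sites(site_counts: Dict[int, int],
                        max_gap: int = 24) -> List[dict]:
    """Group poly(A) site positions on one strand into clusters.

    site_counts maps a genomic position to its supporting read count.
    Consecutive sites more than max_gap nt apart start a new cluster
    (max_gap is a hypothetical default, not a value from the source).
    """
    groups: List[List[int]] = []
    current: List[int] = []
    for pos in sorted(site_counts):
        if current and pos - current[-1] > max_gap:
            groups.append(current)
            current = []
        current.append(pos)
    if current:
        groups.append(current)

    # Summarize each cluster by its span, its best-supported site, and total reads.
    return [
        {
            "start": g[0],
            "end": g[-1],
            "peak": max(g, key=lambda p: site_counts[p]),
            "reads": sum(site_counts[p] for p in g),
        }
        for g in groups
    ]
```

Using the best-supported position as the cluster representative is one design choice among several; a weighted mean position would be another.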
15
You L, Wu J, Feng Y, Fu Y, Guo Y, Long L, Zhang H, Luan Y, Tian P, Chen L, Huang G, Huang S, Li Y, Li J, Chen C, Zhang Y, Chen S, Xu A. APASdb: a database describing alternative poly(A) sites and selection of heterogeneous cleavage sites downstream of poly(A) signals. Nucleic Acids Res 2014; 43:D59-67. [PMID: 25378337 PMCID: PMC4383914 DOI: 10.1093/nar/gku1076] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.9] [Indexed: 11/12/2022]
Abstract
An increasing number of genes have been shown to use alternative polyadenylation (APA) 3′-processing sites depending on the cell and tissue type and/or physiological and pathological conditions at the time of processing, and a genome-wide APA database is urgently needed to better understand poly(A) site selection and APA-directed regulation of gene expression in a given biological context. Here we present a web-accessible database, named APASdb (http://mosas.sysu.edu.cn/utr), which visualizes the precise map and usage quantification of different APA isoforms for all genes. The datasets are deeply profiled by the sequencing alternative polyadenylation sites (SAPAS) method, which performs high-throughput sequencing of the 3′-ends of polyadenylated transcripts. Thus, APASdb details all the heterogeneous cleavage sites downstream of poly(A) signals and maintains near-complete coverage of APA sites, much better than previous databases built with conventional methods. Furthermore, APASdb quantifies a given APA variant among transcripts with different APA sites by computing their corresponding normalized reads. In addition, APASdb supports URL-based retrieval, browsing and display of exon-intron structure, poly(A) signals, poly(A) site locations and usage reads, and 3′-untranslated regions (3′-UTRs). Currently, APASdb covers APA in various biological processes and diseases in human, mouse and zebrafish.
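The abstract does not spell out how normalized reads are computed, so the following is only a sketch of one plausible scheme: reads per million for each site, plus the fraction of a gene's 3′-end reads that fall on each site. The function name, the reads-per-million scaling, and the usage fraction are assumptions for illustration, not APASdb's documented method.

```python
from typing import Dict


def apa_usage(site_reads: Dict[str, int], library_size: int) -> Dict[str, dict]:
    """Quantify APA site usage for one gene.

    site_reads maps a site label to its raw read count; library_size is the
    total number of mapped reads in the sample. Both the reads-per-million
    normalization and the per-gene usage fraction are illustrative choices.
    """
    gene_total = sum(site_reads.values())
    return {
        site: {
            "rpm": reads * 1e6 / library_size,
            "usage": reads / gene_total if gene_total else 0.0,
        }
        for site, reads in site_reads.items()
    }
```

For a gene with 30 reads at a proximal site and 10 at a distal site, the usage fractions would be 0.75 and 0.25, which makes isoform shifts comparable across samples of different depth.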
Affiliation(s)
- Leiming You
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
  - School of Basic Medical Sciences, Beijing University of Chinese Medicine, Beijing 100029, People's Republic of China
- Jiexin Wu
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
- Yuchao Feng
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
- Yonggui Fu
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
- Yanan Guo
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
- Liyuan Long
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
- Hui Zhang
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
- Yijie Luan
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
- Peng Tian
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
- Liangfu Chen
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
- Guangrui Huang
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
- Shengfeng Huang
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
- Yuxin Li
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
- Jie Li
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
- Chengyong Chen
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
- Yaqing Zhang
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
- Shangwu Chen
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
- Anlong Xu
  - State Key Laboratory of Biocontrol, Guangdong Province Key Laboratory of Pharmaceutical Functional Genes, School of Life Sciences, Sun Yat-Sen University, Higher Education Mega Center, Guangzhou 510006, People's Republic of China
  - School of Basic Medical Sciences, Beijing University of Chinese Medicine, Beijing 100029, People's Republic of China
16
Janbon G, Ormerod KL, Paulet D, Byrnes EJ, Yadav V, Chatterjee G, Mullapudi N, Hon CC, Billmyre RB, Brunel F, Bahn YS, Chen W, Chen Y, Chow EWL, Coppée JY, Floyd-Averette A, Gaillardin C, Gerik KJ, Goldberg J, Gonzalez-Hilarion S, Gujja S, Hamlin JL, Hsueh YP, Ianiri G, Jones S, Kodira CD, Kozubowski L, Lam W, Marra M, Mesner LD, Mieczkowski PA, Moyrand F, Nielsen K, Proux C, Rossignol T, Schein JE, Sun S, Wollschlaeger C, Wood IA, Zeng Q, Neuvéglise C, Newlon CS, Perfect JR, Lodge JK, Idnurm A, Stajich JE, Kronstad JW, Sanyal K, Heitman J, Fraser JA, Cuomo CA, Dietrich FS. Analysis of the genome and transcriptome of Cryptococcus neoformans var. grubii reveals complex RNA expression and microevolution leading to virulence attenuation. PLoS Genet 2014; 10:e1004261. [PMID: 24743168 PMCID: PMC3990503 DOI: 10.1371/journal.pgen.1004261] [Citation(s) in RCA: 276] [Impact Index Per Article: 27.6] [Received: 09/19/2013] [Accepted: 02/07/2014] [Indexed: 02/07/2023]
Abstract
Cryptococcus neoformans is a pathogenic basidiomycetous yeast responsible for more than 600,000 deaths each year. It occurs as two serotypes (A and D) representing two varieties (i.e. grubii and neoformans, respectively). Here, we sequenced the genome and performed an RNA-Seq-based analysis of the C. neoformans var. grubii transcriptome structure. We determined the chromosomal locations, analyzed the sequence/structural features of the centromeres, and identified origins of replication. The genome was annotated based on automated and manual curation. More than 40,000 introns populating more than 99% of the expressed genes were identified. Although most of these introns are located in the coding DNA sequences (CDS), over 2,000 introns in the untranslated regions (UTRs) were also identified. Poly(A)-containing reads were employed to locate the polyadenylation sites of more than 80% of the genes. Examination of the sequences around these sites revealed a new poly(A)-site-associated motif (AUGHAH). In addition, 1,197 miscRNAs were identified. These miscRNAs can be spliced and/or polyadenylated, but do not appear to have obvious coding capacities. Finally, this genome sequence enabled a comparative analysis of strain H99 variants obtained after laboratory passage. The spectrum of mutations identified provides insights into the genetics underlying the micro-evolution of a laboratory strain, and identifies mutations involved in stress responses, mating efficiency, and virulence.
Cryptococcus neoformans var. grubii is a major human pathogen responsible for deadly meningoencephalitis in immunocompromised patients. Here, we report the sequencing and annotation of its genome. Evidence for extensive intron splicing, antisense transcription, non-coding RNAs, and alternative polyadenylation indicates the potential for highly intricate regulation of gene expression in this opportunistic pathogen. In addition, detailed molecular, genetic, and genomic studies were performed to characterize structural features of the genome, including centromeres and origins of replication. Finally, the phenotypic and genome re-sequencing analysis of a collection of isolates of the reference H99 strain resulting from laboratory passage revealed that microevolutionary processes during in vitro culturing of pathogenic fungi can impact virulence.
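The motif AUGHAH reported in this abstract uses the IUPAC ambiguity code H, which stands for A, C, or U (any base except G). A minimal sketch of how such a degenerate motif could be located in a sequence follows; the function name and the choice to accept DNA input by converting T to U are assumptions for illustration and do not reproduce the authors' analysis.

```python
import re
from typing import List

# Subset of IUPAC nucleotide codes in the RNA alphabet; H means "not G".
IUPAC_RNA = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "H": "[ACU]", "N": "[ACGU]",
}


def motif_positions(seq: str, motif: str = "AUGHAH") -> List[int]:
    """Return 0-based start positions of every (possibly overlapping) match."""
    rna = seq.upper().replace("T", "U")
    # A lookahead makes finditer report overlapping occurrences as well.
    pattern = re.compile("(?=(" + "".join(IUPAC_RNA[b] for b in motif) + "))")
    return [m.start() for m in pattern.finditer(rna)]
```

For example, `AUGUAA` matches the motif, while `AUGGAA` does not because G is excluded at the fourth position.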
Affiliation(s)
- Guilhem Janbon
  - Institut Pasteur, Unité Biologie et Pathogénicité Fongiques, Département Génomes et Génétique, Paris, France
  - INRA, USC2019, Paris, France
  - * E-mail: (GJ); (JH); (CAC); (FSD)
- Kate L. Ormerod
  - University of Queensland, School of Chemistry and Molecular Biosciences, Brisbane, Queensland, Australia
- Damien Paulet
  - Institut Pasteur, Plate-forme Transcriptome et Epigénome, Département Génomes et Génétique, Paris, France
- Edmond J. Byrnes
  - Duke University Medical Center, Department of Molecular Genetics and Microbiology, Durham, North Carolina, United States of America
- Vikas Yadav
  - Jawaharlal Nehru Centre for Advanced Scientific Research, Molecular Biology and Genetics Unit, Bangalore, India
- Gautam Chatterjee
  - Jawaharlal Nehru Centre for Advanced Scientific Research, Molecular Biology and Genetics Unit, Bangalore, India
- Chung-Chau Hon
  - Institut Pasteur, Unité Biologie Cellulaire du Parasitisme, Département Biologie Cellulaire et Infection, Paris, France
- R. Blake Billmyre
  - Duke University Medical Center, Department of Molecular Genetics and Microbiology, Durham, North Carolina, United States of America
- Yong-Sun Bahn
  - Yonsei University, Center for Fungal Pathogenesis, Department of Biotechnology, Seoul, Republic of Korea
- Weidong Chen
  - Rutgers New Jersey Medical School, Department of Microbiology and Molecular Genetics, Newark, New Jersey, United States of America
- Yuan Chen
  - Duke University Medical Center, Department of Molecular Genetics and Microbiology, Durham, North Carolina, United States of America
- Eve W. L. Chow
  - University of Queensland, School of Chemistry and Molecular Biosciences, Brisbane, Queensland, Australia
- Jean-Yves Coppée
  - Institut Pasteur, Plate-forme Transcriptome et Epigénome, Département Génomes et Génétique, Paris, France
- Anna Floyd-Averette
  - Duke University Medical Center, Department of Molecular Genetics and Microbiology, Durham, North Carolina, United States of America
- Kimberly J. Gerik
  - Washington University School of Medicine, Department of Molecular Microbiology, St. Louis, Missouri, United States of America
- Jonathan Goldberg
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Sara Gonzalez-Hilarion
  - Institut Pasteur, Unité Biologie et Pathogénicité Fongiques, Département Génomes et Génétique, Paris, France
  - INRA, USC2019, Paris, France
- Sharvari Gujja
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Joyce L. Hamlin
  - University of Virginia, Department of Biochemistry and Molecular Genetics, Charlottesville, Virginia, United States of America
- Yen-Ping Hsueh
  - Duke University Medical Center, Department of Molecular Genetics and Microbiology, Durham, North Carolina, United States of America
  - California Institute of Technology, Division of Biology, Pasadena, California, United States of America
- Giuseppe Ianiri
  - University of Missouri-Kansas City, School of Biological Sciences, Division of Cell Biology and Biophysics, Kansas City, Missouri, United States of America
- Steven Jones
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
- Chinnappa D. Kodira
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Lukasz Kozubowski
  - Clemson University, Department of Genetics and Biochemistry, Clemson, South Carolina, United States of America
- Woei Lam
  - Washington University School of Medicine, Department of Molecular Microbiology, St. Louis, Missouri, United States of America
- Marco Marra
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
- Larry D. Mesner
  - University of Virginia, Department of Biochemistry and Molecular Genetics, Charlottesville, Virginia, United States of America
- Piotr A. Mieczkowski
  - University of North Carolina, Department of Genetics, Chapel Hill, North Carolina, United States of America
- Frédérique Moyrand
  - Institut Pasteur, Unité Biologie et Pathogénicité Fongiques, Département Génomes et Génétique, Paris, France
  - INRA, USC2019, Paris, France
- Kirsten Nielsen
  - Duke University Medical Center, Department of Molecular Genetics and Microbiology, Durham, North Carolina, United States of America
  - University of Minnesota, Microbiology Department, Minneapolis, Minnesota, United States of America
- Caroline Proux
  - Institut Pasteur, Plate-forme Transcriptome et Epigénome, Département Génomes et Génétique, Paris, France
- Jacqueline E. Schein
  - Canada's Michael Smith Genome Sciences Centre, BC Cancer Agency, Vancouver, British Columbia, Canada
- Sheng Sun
  - Duke University Medical Center, Department of Molecular Genetics and Microbiology, Durham, North Carolina, United States of America
- Carolin Wollschlaeger
  - Institut Pasteur, Unité Biologie et Pathogénicité Fongiques, Département Génomes et Génétique, Paris, France
  - INRA, USC2019, Paris, France
- Ian A. Wood
  - University of Queensland, School of Mathematics and Physics, Brisbane, Queensland, Australia
- Qiandong Zeng
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
- Carol S. Newlon
  - Rutgers New Jersey Medical School, Department of Microbiology and Molecular Genetics, Newark, New Jersey, United States of America
- John R. Perfect
  - Duke University Medical Center, Duke Department of Medicine and Molecular Genetics and Microbiology, Durham, North Carolina, United States of America
- Jennifer K. Lodge
  - Washington University School of Medicine, Department of Molecular Microbiology, St. Louis, Missouri, United States of America
- Alexander Idnurm
  - University of Missouri-Kansas City, School of Biological Sciences, Division of Cell Biology and Biophysics, Kansas City, Missouri, United States of America
- Jason E. Stajich
  - Duke University Medical Center, Department of Molecular Genetics and Microbiology, Durham, North Carolina, United States of America
  - University of California, Department of Plant Pathology & Microbiology, Riverside, California, United States of America
- James W. Kronstad
  - Michael Smith Laboratories, Department of Microbiology and Immunology, Vancouver, British Columbia, Canada
- Kaustuv Sanyal
  - Jawaharlal Nehru Centre for Advanced Scientific Research, Molecular Biology and Genetics Unit, Bangalore, India
- Joseph Heitman
  - Duke University Medical Center, Department of Molecular Genetics and Microbiology, Durham, North Carolina, United States of America
  - * E-mail: (GJ); (JH); (CAC); (FSD)
- James A. Fraser
  - University of Queensland, School of Chemistry and Molecular Biosciences, Brisbane, Queensland, Australia
- Christina A. Cuomo
  - Broad Institute of MIT and Harvard, Cambridge, Massachusetts, United States of America
  - * E-mail: (GJ); (JH); (CAC); (FSD)
- Fred S. Dietrich
  - Duke University Medical Center, Department of Molecular Genetics and Microbiology, Durham, North Carolina, United States of America
  - * E-mail: (GJ); (JH); (CAC); (FSD)
17
Delineating the structural blueprint of the pre-mRNA 3'-end processing machinery. Mol Cell Biol 2014; 34:1894-910. [PMID: 24591651 DOI: 10.1128/mcb.00084-14] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Indexed: 11/20/2022]
Abstract
Processing of mRNA precursors (pre-mRNAs) by polyadenylation is an essential step in gene expression. Polyadenylation consists of two steps, cleavage and poly(A) synthesis, and requires multiple cis elements in the pre-mRNA and a megadalton protein complex bearing the two essential enzymatic activities. While genetic and biochemical studies remain the major approaches in characterizing these factors, structural biology has emerged during the past decade to help understand the molecular assembly and mechanistic details of the process. With structural information about more proteins and higher-order complexes becoming available, we are coming closer to obtaining a structural blueprint of the polyadenylation machinery that explains both how this complex functions and how it is regulated and connected to other cellular processes.
18
Siegel TN, Hon CC, Zhang Q, Lopez-Rubio JJ, Scheidig-Benatar C, Martins RM, Sismeiro O, Coppée JY, Scherf A. Strand-specific RNA-Seq reveals widespread and developmentally regulated transcription of natural antisense transcripts in Plasmodium falciparum. BMC Genomics 2014; 15:150. [PMID: 24559473 PMCID: PMC4007998 DOI: 10.1186/1471-2164-15-150] [Citation(s) in RCA: 84] [Impact Index Per Article: 8.4] [Received: 05/13/2013] [Accepted: 02/06/2014] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Advances in high-throughput sequencing have led to the discovery of widespread transcription of natural antisense transcripts (NATs) in a large number of organisms, where these transcripts have been shown to play important roles in the regulation of gene expression. Likewise, the existence of NATs has been observed in Plasmodium but our understanding towards their genome-wide distribution remains incomplete due to the limited depth and uncertainties in the level of strand specificity of previous datasets.
RESULTS: To gain insights into the genome-wide distribution of NATs in P. falciparum, we performed RNA-ligation based strand-specific RNA sequencing at unprecedented depth. Our data indicate that 78.3% of the genome is transcribed during blood-stage development. Moreover, our analysis reveals significant levels of antisense transcription from at least 24% of protein-coding genes and that while expression levels of NATs change during the intraerythrocytic developmental cycle (IDC), they do not correlate with the corresponding mRNA levels. Interestingly, antisense transcription is not evenly distributed across coding regions (CDSs) but strongly clustered towards the 3'-end of CDSs. Furthermore, for a significant subset of NATs, transcript levels correlate with mRNA levels of neighboring genes. Finally, we were able to identify the polyadenylation sites (PASs) for a subset of NATs, demonstrating that at least some NATs are polyadenylated. We also mapped the PASs of 3443 coding genes, yielding an average 3' untranslated region length of 523 bp.
CONCLUSIONS: Our strand-specific analysis of the P. falciparum transcriptome expands and strengthens the existing body of evidence that antisense transcription is a substantial phenomenon in P. falciparum. For a subset of neighboring genes we find that sense and antisense transcript levels are intricately linked while other NATs appear to be regulated independently of mRNA transcription. Our deep strand-specific dataset will provide a valuable resource for the precise determination of expression levels as it separates sense from antisense transcript levels, which we find to often significantly differ. In addition, the extensive novel data on 3' UTR length will allow others to perform searches for regulatory motifs in the UTRs and help understand post-translational regulation in P. falciparum.
Affiliation(s)
- T Nicolai Siegel
  - Biology of Host-Parasite Interactions Unit, Institut Pasteur, Paris, France.
19
Zheng D, Tian B. RNA-binding proteins in regulation of alternative cleavage and polyadenylation. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2014; 825:97-127. [PMID: 25201104 DOI: 10.1007/978-1-4939-1221-6_3] [Citation(s) in RCA: 36] [Impact Index Per Article: 3.6] [Indexed: 12/12/2022]
Abstract
Almost all eukaryotic pre-mRNAs are processed at the 3' end by the cleavage and polyadenylation (C/P) reaction, which preludes termination of transcription and gives rise to the poly(A) tail of mature mRNA. Genomic studies in recent years have indicated that most eukaryotic mRNA genes have multiple cleavage and polyadenylation sites (pAs), leading to alternative cleavage and polyadenylation (APA) products. APA isoforms generally differ in their 3' untranslated regions (3' UTRs), but can also have different coding sequences (CDSs). APA expands the repertoire of transcripts expressed from the genome, and is highly regulated under various physiological and pathological conditions. Growing lines of evidence have shown that RNA-binding proteins (RBPs) play important roles in regulation of APA. Some RBPs are part of the machinery for C/P; others influence pA choice through binding to adjacent regions. In this chapter, we review cis elements and trans factors involved in C/P, the significance of APA, and increasingly elucidated roles of RBPs in APA regulation. We also discuss analysis of APA using transcriptome-wide techniques as well as molecular biology approaches.
Affiliation(s)
- Dinghai Zheng
  - Department of Biochemistry and Molecular Biology, University of Medicine and Dentistry of New Jersey (UMDNJ)-New Jersey Medical School, 185 South Orange Ave., Newark, NJ, 07103, USA
20
Majerciak V, Ni T, Yang W, Meng B, Zhu J, Zheng ZM. A viral genome landscape of RNA polyadenylation from KSHV latent to lytic infection. PLoS Pathog 2013; 9:e1003749. [PMID: 24244170 PMCID: PMC3828183 DOI: 10.1371/journal.ppat.1003749] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.0] [Received: 04/26/2013] [Accepted: 09/20/2013] [Indexed: 11/30/2022]
Abstract
RNA polyadenylation (pA) is one of the major steps in regulation of gene expression at the posttranscriptional level. In this report, a genome landscape of pA sites of viral transcripts in B lymphocytes with Kaposi sarcoma-associated herpesvirus (KSHV) infection was constructed using a modified PA-seq strategy. We identified 67 unique pA sites, of which 55 could be assigned to the expression of the ∼90 annotated KSHV genes. Among the assigned pA sites, twenty are for expression of individual single genes and the rest for multiple genes (average 2.7 genes per pA site) in cluster-gene loci of the genome. A few novel viral pA sites that could not be assigned to any known KSHV genes are often positioned in the antisense strand to ORF8, ORF21, ORF34, K8 and ORF50, and their associated antisense mRNAs to ORF21, ORF34 and K8 could be verified by 3′RACE. The usage of each mapped pA site correlates with its peak size: the larger (broad and wide) the peak, the greater the usage and thus the higher the expression of the pA site-associated gene(s). Similar to mammalian transcripts, KSHV RNA polyadenylation employs two major poly(A) signals, AAUAAA and AUUAAA, and is regulated by conservation of cis-elements flanking the mapped pA sites. Moreover, we found two or more alternative pA sites downstream of ORF54, K2 (vIL6), K9 (vIRF1), K10.5 (vIRF3), K11 (vIRF2), K12 (Kaposin A), T1.5, and PAN genes and experimentally validated the alternative polyadenylation for the expression of KSHV ORF54, K11, and T1.5 transcripts. Together, our data provide not only a comprehensive pA site landscape for understanding KSHV genome structure and gene expression, but also the first evidence of alternative polyadenylation as another layer of posttranscriptional regulation in viral gene expression.
A genome-wide polyadenylation landscape in the expression of human herpesviruses has not been reported. In this study, we provide the first genome landscape of viral RNA polyadenylation sites in B cells from KSHV latent to lytic infection by using a modified PA-seq protocol and selectively validated by 3′ RACE. We found that the KSHV genome contains 67 active pA sites for the expression of its ∼90 genes and a few antisense transcripts. Among the mapped pA sites, a large fraction are for the expression of cluster genes and the production of bicistronic or polycistronic transcripts from the KSHV genome, and only one-third are used for the expression of single genes. We found that the size of individual PA peaks is positively correlated with the usage of the corresponding pA site, which is determined by the number of reads within the PA peak from latent to lytic KSHV infection, and that the strength of cis-elements surrounding a KSHV pA site determines the expression level of viral genes. Lastly, we identified and experimentally validated alternative polyadenylation of KSHV ORF54, T1.5, and K11 during viral lytic infection. To our knowledge, this is the first report on alternative polyadenylation events in KSHV infection.
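The two poly(A) signals named in this abstract, AAUAAA and AUUAAA, are usually sought in a short window upstream of a mapped cleavage site. The sketch below shows one way to scan for them; the function name and the 40-nt window are hypothetical and are not taken from the study's methods.

```python
from typing import List, Tuple

POLYA_SIGNALS = ("AAUAAA", "AUUAAA")


def upstream_signals(rna: str, site: int, window: int = 40) -> List[Tuple[str, int]]:
    """Find poly(A) signals in the window upstream of a cleavage site.

    site is the 0-based index of the first nucleotide after cleavage.
    Returns (signal, distance) pairs, where distance counts nucleotides
    from the start of the signal to the site, sorted nearest first.
    The window size is an illustrative assumption.
    """
    start = max(0, site - window)
    region = rna[start:site].upper().replace("T", "U")
    hits = []
    for signal in POLYA_SIGNALS:
        idx = region.find(signal)
        while idx != -1:
            hits.append((signal, site - (start + idx)))
            idx = region.find(signal, idx + 1)
    return sorted(hits, key=lambda hit: hit[1])
```

A site with no hit in the window would be a candidate for a non-canonical signal or a spurious call, which is why signal presence is often used as supporting evidence for mapped pA sites.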
Affiliation(s)
- Vladimir Majerciak
  - Tumor Virus RNA Biology Section, Gene Regulation and Chromosome Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Ting Ni
  - DNA Sequencing and Genomics Core, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Wenjing Yang
  - DNA Sequencing and Genomics Core, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Bowen Meng
  - Tumor Virus RNA Biology Section, Gene Regulation and Chromosome Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Jun Zhu
  - DNA Sequencing and Genomics Core, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
  - * E-mail: (JZ); (ZMZ)
- Zhi-Ming Zheng
  - Tumor Virus RNA Biology Section, Gene Regulation and Chromosome Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America
  - * E-mail: (JZ); (ZMZ)
21
Genomewide mapping and screening of Kaposi's sarcoma-associated herpesvirus (KSHV) 3' untranslated regions identify bicistronic and polycistronic viral transcripts as frequent targets of KSHV microRNAs. J Virol 2013; 88:377-92. [PMID: 24155407 DOI: 10.1128/jvi.02689-13] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.6] [Indexed: 12/15/2022]
Abstract
Kaposi's sarcoma-associated herpesvirus (KSHV) encodes over 90 genes and 25 microRNAs (miRNAs). The KSHV life cycle is tightly regulated to ensure persistent infection in the host. In particular, miRNAs, which primarily exert their effects by binding to the 3' untranslated regions (3'UTRs) of target transcripts, have recently emerged as key regulators of the KSHV life cycle. Although studies using RNA cross-linking immunoprecipitation approaches have identified numerous targets of KSHV miRNAs, few of these targets are of viral origin because most KSHV 3'UTRs have not been characterized. Thus, the extent to which viral genes are targeted by KSHV miRNAs remains elusive. Here, we report the mapping of the 3'UTRs of 74 KSHV genes and the effects of KSHV miRNAs on the control of 3'UTR-mediated gene expression. This analysis reveals new bicistronic and polycistronic transcripts of KSHV genes. Due to the 5'-distal open reading frames (ORFs), KSHV bicistronic or polycistronic transcripts have significantly longer 3'UTRs than do KSHV monocistronic transcripts. Furthermore, screening of the 3'UTR reporters has identified 28 potential new targets of KSHV miRNAs, of which 11 (39%) are bicistronic or polycistronic transcripts. Reporter mutagenesis demonstrates that miR-K3 specifically targets ORF31-33 transcripts at the lytic locus via two binding sites in the ORF33 coding region, whereas miR-K10a-3p and miR-K10b-3p and their variants target ORF71-73 transcripts at the latent locus through distinct binding sites in both 5'-distal ORFs and intergenic regions. Our results indicate that KSHV miRNAs frequently target the 5'-distal coding regions of bicistronic or polycistronic transcripts and highlight the unique features of KSHV miRNAs in regulating gene expression and life cycle.
|
22
|
Sheppard S, Lawson ND, Zhu LJ. Accurate identification of polyadenylation sites from 3' end deep sequencing using a naive Bayes classifier. Bioinformatics 2013; 29:2564-71. [PMID: 23962617 DOI: 10.1093/bioinformatics/btt446] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Indexed: 01/08/2023]
Abstract
MOTIVATION: 3' end processing is important for transcription termination, mRNA stability and regulation of gene expression. To identify 3' ends, most techniques use an oligo-dT primer to construct deep sequencing libraries. However, this approach can lead to identification of artifactual polyadenylation sites due to internal priming in homopolymeric stretches of adenines. Although heuristic filters have been applied in these cases, they typically result in a high proportion of both false-positive and false-negative classifications. Therefore, there is a need to develop improved algorithms to better identify mis-priming events in oligo-dT primed sequences. RESULTS: By analyzing sequence features flanking 3' ends derived from oligo-dT-based sequencing, we developed a naïve Bayes classifier to classify them as true or false/internally primed. The resulting algorithm is highly accurate, outperforms previous heuristic filters and facilitates identification of novel polyadenylation sites.
Affiliation(s)
- Sarah Sheppard
- Program in Gene Function and Expression and Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 364 Plantation St, Worcester, MA 01605, USA
|
23
|
Pereira-Castro I, Costa AMS, Oliveira MJ, Barbosa I, Rocha AS, Azevedo L, da Costa LT. Characterization of human NLZ1/ZNF703 identifies conserved domains essential for proper subcellular localization and transcriptional repression. J Cell Biochem 2013; 114:120-33. [PMID: 22886885 DOI: 10.1002/jcb.24309] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 10/20/2011] [Accepted: 07/26/2012] [Indexed: 11/06/2022]
Abstract
NET family members have recently emerged as important players in the development of multiple structures, from the trachea of fly larvae to the vertebrate eye and human breast cancers. However, their mechanisms of action are still poorly understood, and we lack a detailed characterization of their functional domains, as well as gene expression patterns, particularly in adult mammals. Here, we present a characterization of human NLZ1/ZNF703 (NocA-like zinc finger 1/Zinc finger 703), one of the two human NET family member genes. We show that the gene is ubiquitously expressed in adult human and mouse tissues, that three mRNA species with the same coding sequence are generated by alternative polyadenylation, and that the encoded protein contains six evolutionarily conserved domains, three of which are specific to NET proteins. Finally, we present functional evidence that these domains are necessary for proper subcellular distribution of and transcription repression by the NLZ1 protein, but not for its interaction with Groucho family co-repressors.
Affiliation(s)
- Isabel Pereira-Castro
- IPATIMUP-Institute of Molecular Pathology and Immunology of the University of Porto, Porto, Portugal
|
24
|
Franzén O, Jerlström-Hultqvist J, Einarsson E, Ankarklev J, Ferella M, Andersson B, Svärd SG. Transcriptome profiling of Giardia intestinalis using strand-specific RNA-seq. PLoS Comput Biol 2013; 9:e1003000. [PMID: 23555231 PMCID: PMC3610916 DOI: 10.1371/journal.pcbi.1003000] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Received: 10/02/2012] [Accepted: 02/02/2013] [Indexed: 01/08/2023]
Abstract
Giardia intestinalis is a common cause of diarrheal disease and it consists of eight genetically distinct genotypes or assemblages (A-H). Only assemblages A and B infect humans and are suggested to represent two different Giardia species. Correlations exist between assemblage type and host-specificity and to some extent symptoms. Phenotypical differences have been documented between assemblages and genome sequences are available for A, B and E. We have characterized and compared the polyadenylated transcriptomes of assemblages A, B and E. Four genetically different isolates were studied (WB (AI), AS175 (AII), P15 (E) and GS (B)) using paired-end, strand-specific RNA-seq. Most of the genome was transcribed in trophozoites grown in vitro, but at vastly different levels. RNA-seq confirmed many of the present annotations and refined the current genome annotation. Gene expression divergence was found to recapitulate the known phylogeny, and uncovered lineage-specific differences in expression. Polyadenylation sites were mapped for over 70% of the genes and revealed many examples of conserved and unexpectedly long 3′ UTRs. 28 open reading frames were found in a non-transcribed gene cluster on chromosome 5 of the WB isolate. Analysis of allele-specific expression revealed a correlation between allele-dosage and allele expression in the GS isolate. Previously reported cis-splicing events were confirmed and global mapping of cis-splicing identified only one novel intron. These observations can possibly explain differences in host-preference and symptoms, and will be the basis for further studies of Giardia pathogenesis and biology.
Giardia is a single cell intestinal parasite and a common cause of diarrhea in humans and animals. Giardia is an unusual eukaryote in possessing two nuclei, a highly reduced genome and simple transcriptional apparatus. We have characterized the transcriptome of Giardia at single nucleotide resolution, which allowed the calculation of digital gene expression values for the complete set of genes. We performed a comparison of gene expression divergence across three genotypes. Most of the genes were transcribed, and the data were used to refine and correct gene models. Several gene expression differences were identified between the genotypes. A non-transcribed cluster of genes was detected on chromosome 5, likely representing a silenced region. The data also allowed mapping of transcript termini, which provided the first global view of 3′ untranslated regions in this parasite. This study also gives the first genome-wide evidence of transcription of allelic variants in Giardia. In this study, we provide novel insights into the transcriptome of an important human pathogen and model eukaryote. The findings reported here likely relate to the lifestyle of this parasite and its adaptation to parasitism. The data provide starting points for functional investigation of Giardia's biology and diplomonads generally.
Affiliation(s)
- Oscar Franzén
- Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden
- Elin Einarsson
- Department of Cell and Molecular Biology, BMC, Uppsala University, Uppsala, Sweden
- Johan Ankarklev
- Department of Cell and Molecular Biology, BMC, Uppsala University, Uppsala, Sweden
- Marcela Ferella
- Department of Cell and Molecular Biology, BMC, Uppsala University, Uppsala, Sweden
- Björn Andersson
- Department of Cell and Molecular Biology, Karolinska Institutet, Stockholm, Sweden
- Staffan G. Svärd
- Department of Cell and Molecular Biology, BMC, Uppsala University, Uppsala, Sweden
- * E-mail:
|
25
|
Rigault C, Le Borgne F, Tazir B, Benani A, Demarquoy J. A high-fat diet increases L-carnitine synthesis through a differential maturation of the Bbox1 mRNAs. Biochim Biophys Acta 2013; 1831:370-7. [PMID: 23127966 DOI: 10.1016/j.bbalip.2012.10.007] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 07/26/2012] [Revised: 10/15/2012] [Accepted: 10/26/2012] [Indexed: 12/30/2022]
Abstract
L-carnitine is a key molecule in both mitochondrial and peroxisomal lipid metabolism. L-carnitine is biosynthesized from gamma-butyrobetaine by a reaction catalyzed by gamma-butyrobetaine hydroxylase (Bbox1). The aim of this work was to identify molecular mechanisms involved in the regulation of L-carnitine biosynthesis and availability. Using 3' RACE, we identified four alternatively polyadenylated Bbox1 mRNAs in rat liver. We utilized a combination of in vitro experiments using hybrid constructs containing the Bbox1 3' UTR and in vivo experiments on rat liver mRNAs to reveal specificities in the different Bbox1 mRNA isoforms, especially in terms of polyadenylation efficiency, mRNA stability and translation efficiency. This complex maturation process of the Bbox1 mRNAs in the liver was studied in rats fed a high-fat diet. The high-fat diet selectively increased the level of three Bbox1 mRNA isoforms in rat liver, and the alternative use of polyadenylation sites contributed to the global increase in Bbox1 enzymatic activity and L-carnitine levels. Our results show that the maturation of Bbox1 mRNAs is nutritionally regulated in the liver through a selective polyadenylation process to adjust L-carnitine biosynthesis to the energy supply.
Affiliation(s)
- Caroline Rigault
- Université de Bourgogne, BioperoxIL, EA 7270, Faculté Gabriel, 6 blvd Gabriel, 21000 Dijon, France
|
26
|
Rehfeld A, Plass M, Krogh A, Friis-Hansen L. Alterations in polyadenylation and its implications for endocrine disease. Front Endocrinol (Lausanne) 2013; 4:53. [PMID: 23658553 PMCID: PMC3647115 DOI: 10.3389/fendo.2013.00053] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.5] [Received: 01/10/2013] [Accepted: 04/22/2013] [Indexed: 12/17/2022]
Abstract
INTRODUCTION: Polyadenylation is the process in which the pre-mRNA is cleaved at the poly(A) site and a poly(A) tail is added, a process necessary for normal mRNA formation. Genes with multiple poly(A) sites can undergo alternative polyadenylation (APA), producing distinct mRNA isoforms with different 3' untranslated regions (3' UTRs) and in some cases different coding regions. Two thirds of all human genes undergo APA. The efficiency of the polyadenylation process regulates gene expression and APA plays an important part in post-transcriptional regulation, as the 3' UTR contains various cis-elements associated with post-transcriptional regulation, such as target sites for micro-RNAs and RNA-binding proteins. IMPLICATIONS OF ALTERATIONS IN POLYADENYLATION FOR ENDOCRINE DISEASE: Alterations in polyadenylation have been found to be causative of neonatal diabetes and IPEX (immune dysfunction, polyendocrinopathy, enteropathy, X-linked) and to be associated with type I and II diabetes, pre-eclampsia, fragile X-associated premature ovarian insufficiency, ectopic Cushing syndrome, and many cancer diseases, including several types of endocrine tumor diseases. PERSPECTIVES: Recent developments in high-throughput sequencing have made it possible to characterize polyadenylation genome-wide. Antisense elements inhibiting or enhancing specific poly(A) site usage can induce desired alterations in polyadenylation, and thus hold the promise of new therapeutic approaches. SUMMARY: This review gives a detailed description of alterations in polyadenylation in endocrine disease and an overview of the current literature on polyadenylation, and summarizes the clinical implications of the current state of research in this field.
Affiliation(s)
- Anders Rehfeld
- Genomic Medicine, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
- Mireya Plass
- Department of Biology, The Bioinformatics Centre, University of Copenhagen, Copenhagen, Denmark
- Anders Krogh
- Department of Biology, The Bioinformatics Centre, University of Copenhagen, Copenhagen, Denmark
- Lennart Friis-Hansen
- Genomic Medicine, Rigshospitalet, Copenhagen University Hospital, Copenhagen, Denmark
- *Correspondence: Lennart Friis-Hansen, Genomic Medicine, Rigshospitalet, Copenhagen University Hospital, 4113, Blegdamsvej 9, DK2100 Copenhagen, Denmark. e-mail:
|
27
|
Hon CC, Weber C, Sismeiro O, Proux C, Koutero M, Deloger M, Das S, Agrahari M, Dillies MA, Jagla B, Coppee JY, Bhattacharya A, Guillen N. Quantification of stochastic noise of splicing and polyadenylation in Entamoeba histolytica. Nucleic Acids Res 2012; 41:1936-52. [PMID: 23258700 PMCID: PMC3561952 DOI: 10.1093/nar/gks1271] [Citation(s) in RCA: 60] [Impact Index Per Article: 5.0] [Indexed: 11/20/2022]
Abstract
Alternative splicing and polyadenylation were observed pervasively in eukaryotic messenger RNAs. These alternative isoforms could either be consequences of physiological regulation or stochastic noise of RNA processing. To quantify the extent of stochastic noise in splicing and polyadenylation, we analyzed the alternative usage of splicing and polyadenylation sites in Entamoeba histolytica using RNA-Seq. First, we identified a large number of rarely spliced alternative junctions and then showed that the occurrence of these alternative splicing events is correlated with splicing site sequence, occurrence of constitutive splicing events and messenger RNA abundance. Our results implied the majority of these alternative splicing events are likely to be stochastic error of splicing machineries, and we estimated the corresponding error rates. Second, we observed extensive microheterogeneity of polyadenylation cleavage sites, and the extent of such microheterogeneity is correlated with the occurrence of constitutive cleavage events, suggesting most of such microheterogeneity is likely to be stochastic. Overall, we only observed a small fraction of alternative splicing and polyadenylation isoforms that are unlikely to be solely stochastic, implying the functional relevance of alternative splicing and polyadenylation in E. histolytica is limited. Lastly, we revised the gene models and annotated their 3′UTR in AmoebaDB, providing valuable resources to the community.
Affiliation(s)
- Chung-Chau Hon
- Institut Pasteur, Unité Biologie Cellulaire du Parasitisme, Département Biologie cellulaire et infection, F-75015 Paris, France, INSERM U786, F-75015 Paris, France.
|
28
|
Abstract
Recent studies have revealed widespread mRNA alternative polyadenylation (APA) in eukaryotes and its dynamic spatial and temporal regulation. APA not only generates proteomic and functional diversity, but also plays important roles in regulating gene expression. Global deregulation of APA has been demonstrated in a variety of human diseases. Recent exciting advances in the field have been made possible in a large part by high throughput analyses using newly developed experimental tools. Here I review the recent progress in global studies of APA and the insights that have emerged from these and other studies that use more conventional methods.
Affiliation(s)
- Yongsheng Shi
- Department of Microbiology and Molecular Genetics, School of Medicine, University of California, Irvine, Irvine, California 92697, USA.
|
29
|
Lin Y, Li Z, Ozsolak F, Kim SW, Arango-Argoty G, Liu TT, Tenenbaum SA, Bailey T, Monaghan AP, Milos PM, John B. An in-depth map of polyadenylation sites in cancer. Nucleic Acids Res 2012; 40:8460-71. [PMID: 22753024 PMCID: PMC3458571 DOI: 10.1093/nar/gks637] [Citation(s) in RCA: 115] [Impact Index Per Article: 9.6] [Received: 11/08/2011] [Revised: 05/16/2012] [Accepted: 06/06/2012] [Indexed: 12/22/2022]
Abstract
We present a comprehensive map of over 1 million polyadenylation sites and quantify their usage in major cancers and tumor cell lines using direct RNA sequencing. We built the Expression and Polyadenylation Database to enable the visualization of the polyadenylation maps in various cancers and to facilitate the discovery of novel genes and gene isoforms that are potentially important to tumorigenesis. Analyses of polyadenylation sites indicate that a large fraction (∼30%) of mRNAs contain alternative polyadenylation sites in their 3' untranslated regions, independent of the cell type. The shortest 3' untranslated region isoforms are preferentially upregulated in cancer tissues, genome-wide. Candidate targets of alternative polyadenylation-mediated upregulation of short isoforms include POLR2K, and signaling cascades of cell-cell and cell-extracellular matrix contact, particularly involving regulators of Rho GTPases. Polyadenylation maps also helped to improve 3' untranslated region annotations and identify candidate regulatory marks such as sequence motifs, H3K36Me3 and Pabpc1 that are isoform dependent and occur in a position-specific manner. In summary, these results highlight the need to go beyond monitoring only the cumulative transcript levels for a gene, to separately analysing the expression of its RNA isoforms.
Affiliation(s)
- Yuefeng Lin
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-Suny, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Zhihua Li
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-Suny, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Fatih Ozsolak
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-Suny, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Sang Woo Kim
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-Suny, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Gustavo Arango-Argoty
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-Suny, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Teresa T. Liu
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-Suny, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Scott A. Tenenbaum
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-Suny, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Timothy Bailey
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-Suny, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- A. Paula Monaghan
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-Suny, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Patrice M. Milos
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-Suny, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
- Bino John
- Department of Computational and Systems Biology, University of Pittsburgh School of Medicine, Pittsburgh, PA 15260, Helicos BioSciences Corporation, One Kendall Square, Cambridge, MA 02139, College of Nanoscale Science and Engineering, University at Albany-Suny, Albany, NY, USA, Institute for Molecular Bioscience, the University of Queensland, Queensland, Australia and Department of Neurobiology, University of Pittsburgh, 3501 Fifth Avenue, Pittsburgh, PA 15260, USA
|
30
|
de Klerk E, Venema A, Anvar SY, Goeman JJ, Hu O, Trollet C, Dickson G, den Dunnen JT, van der Maarel SM, Raz V, 't Hoen PAC. Poly(A) binding protein nuclear 1 levels affect alternative polyadenylation. Nucleic Acids Res 2012; 40:9089-101. [PMID: 22772983 PMCID: PMC3467053 DOI: 10.1093/nar/gks655] [Citation(s) in RCA: 126] [Impact Index Per Article: 10.5] [Indexed: 11/18/2022]
Abstract
The choice of a polyadenylation site determines the length of the 3′-untranslated region (3′-UTR) of an mRNA. Inclusion or exclusion of regulatory sequences in the 3′-UTR may ultimately affect gene expression levels. Poly(A) binding protein nuclear 1 (PABPN1) is involved in polyadenylation of pre-mRNAs. An alanine repeat expansion in PABPN1 (exp-PABPN1) causes oculopharyngeal muscular dystrophy (OPMD). We hypothesized that previously observed disturbed gene expression patterns in OPMD muscles may have been the result of an effect of PABPN1 on alternative polyadenylation, influencing mRNA stability, localization and translation. A single molecule polyadenylation site sequencing method was developed to explore polyadenylation site usage on a genome-wide level in mice overexpressing exp-PABPN1. We identified 2012 transcripts with altered polyadenylation site usage. In the vast majority, more proximal alternative polyadenylation sites were used, resulting in shorter 3′-UTRs. 3′-UTR shortening was generally associated with increased expression. Similar changes in polyadenylation site usage were observed after knockdown or overexpression of expanded but not wild-type PABPN1 in cultured myogenic cells. Our data indicate that PABPN1 is important for polyadenylation site selection and that reduced availability of functional PABPN1 in OPMD muscles results in use of alternative polyadenylation sites, leading to large-scale deregulation of gene expression.
Affiliation(s)
- Eleonora de Klerk
- Center for Human and Clinical Genetics, Leiden University Medical Center, 2300 RC Leiden, The Netherlands
|
31
|
Lopes-Marques M, Pereira-Castro I, Amorim A, Azevedo L. Characterization of the human ornithine transcarbamylase 3' untranslated regulatory region. DNA Cell Biol 2011; 31:427-33. [PMID: 22054066 DOI: 10.1089/dna.2011.1391] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022]
Abstract
Mutations in the untranslated regulatory regions of genes may result in abnormal gene expression or transcriptional regulation. In this study, we characterize the ornithine transcarbamylase (OTC) mRNA isoforms of the X-linked OTC gene involved in the urea formation in the liver. Our data revealed that two major transcripts (OTC-t1 and OTC-t2) are more highly expressed than any of the other isoforms in all the tissues analyzed, though a longer transcript (OTC-t3) was also isolated and characterized from the brain sample. The OTC-t2 sequence fully matches the OTC mRNA reference sequence (NM_000531.5). All three isoforms use a canonical AAUAAA hexamer that is predicted to fold into a hairpin secondary structure which might be exposed to the cleavage and polyadenylation specificity factor. In addition, we observed that the OTC-t1 and OTC-t2 transcripts display heterogeneity at the cleavage sites in a tissue-dependent manner. Taken together, our data demonstrate that several mRNA isoforms are transcribed from the OTC gene, thereby indicating a wide degree of variability in post-transcriptional regulation.
Affiliation(s)
- Monica Lopes-Marques
- Population Genetics Group, IPATIMUP-Institute of Molecular Pathology and Immunology of the University of Porto, Porto, Portugal
|
32
|
Shepard PJ, Choi EA, Lu J, Flanagan LA, Hertel KJ, Shi Y. Complex and dynamic landscape of RNA polyadenylation revealed by PAS-Seq. RNA 2011; 17:761-72. [PMID: 21343387 PMCID: PMC3062186 DOI: 10.1261/rna.2581711] [Citation(s) in RCA: 327] [Impact Index Per Article: 25.2] [Received: 12/03/2010] [Accepted: 01/11/2011] [Indexed: 05/20/2023]
Abstract
Alternative polyadenylation (APA) of mRNAs has emerged as an important mechanism for post-transcriptional gene regulation in higher eukaryotes. Although microarrays have recently been used to characterize APA globally, they have a number of serious limitations that prevent comprehensive and highly quantitative analysis. To better characterize APA and its regulation, we have developed a deep sequencing-based method called Poly(A) Site Sequencing (PAS-Seq) for quantitatively profiling RNA polyadenylation at the transcriptome level. PAS-Seq not only accurately and comprehensively identifies poly(A) junctions in mRNAs and noncoding RNAs, but also provides quantitative information on the relative abundance of polyadenylated RNAs. PAS-Seq analyses of human and mouse transcriptomes showed that 40%-50% of all expressed genes produce alternatively polyadenylated mRNAs. Furthermore, our study detected evolutionarily conserved polyadenylation of histone mRNAs and revealed novel features of mitochondrial RNA polyadenylation. Finally, PAS-Seq analyses of mouse embryonic stem (ES) cells, neural stem/progenitor (NSP) cells, and neurons not only identified more poly(A) sites than were found in the entire mouse EST database, but also detected significant changes in the global APA profile that lead to lengthening of 3' untranslated regions (UTRs) in many mRNAs during stem cell differentiation. Together, our PAS-Seq analyses revealed a complex landscape of RNA polyadenylation in mammalian cells and the dynamic regulation of APA during stem cell differentiation.
Affiliation(s)
- Peter J Shepard
- Department of Microbiology and Molecular Genetics, University of California at Irvine, Irvine, California 92697, USA
|
33
|
Liu X, Jiang Y, Russell JE. A potential regulatory role for mRNA secondary structures within the prothrombin 3'UTR. Thromb Res 2010; 126:130-6. [PMID: 20553951 DOI: 10.1016/j.thromres.2010.04.010] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 03/10/2010] [Revised: 03/10/2010] [Accepted: 04/20/2010] [Indexed: 11/20/2022]
Abstract
The distal 3'UTR of prothrombin mRNA exhibits significant sequence heterogeneity reflecting an inexact 3'-cleavage/polyadenylation reaction. This same region encompasses a single-nucleotide polymorphism that enhances the normal post-transcriptional processing of nascent prothrombin transcripts. Both observations indicate the importance of 3'UTR structures to physiologically relevant properties of prothrombin mRNA. Using a HepG2-based model system, we mapped both the primary structures of reporter mRNAs containing the prothrombin 3'UTR, as well as the secondary structures of common, informative 3'UTR processing variants. A chromatographic method was subsequently employed to assess the effects of structural heterogeneities on the binding of candidate trans-acting regulatory factors. We observed that prothrombin 3'UTRs are constitutively polyadenylated at seven or more positions, and can fold into at least two distinct stem-loop conformations. These alternate structures expose/sequester a consensus binding site for hnRNP-I/PTB-1, a trans-acting factor with post-transcriptional regulatory properties. hnRNP-I/PTB-1 exhibits different affinities for the alternate 3'UTR secondary structures in vitro, predicting a corresponding regulatory role in vivo. These analyses demonstrate a critical link between the structure of the prothrombin 3'UTR and its normal function, providing a basis for further investigations into the molecular pathophysiology of naturally occurring polymorphisms within this region.
Affiliation(s)
- Xingge Liu
- Department of Medicine (Hematology-Oncology), University of Pennsylvania School of Medicine and The Children's Hospital of Philadelphia, Philadelphia, PA 19104, USA
|
34
|
Chan S, Choi EA, Shi Y. Pre-mRNA 3'-end processing complex assembly and function. Wiley Interdiscip Rev RNA 2010; 2:321-35. [PMID: 21957020 DOI: 10.1002/wrna.54] [Citation(s) in RCA: 114] [Impact Index Per Article: 8.1] [Indexed: 11/05/2022]
Abstract
The 3'-ends of almost all eukaryotic mRNAs are formed in a two-step process, an endonucleolytic cleavage followed by polyadenylation (the addition of a poly-adenosine or poly(A) tail). These reactions take place in the pre-mRNA 3' processing complex, a macromolecular machinery that consists of more than 20 proteins. A general framework for how the pre-mRNA 3' processing complex assembles and functions has emerged from extensive studies over the past several decades using biochemical, genetic, computational, and structural approaches. In this article, we review what we have learned about this important cellular machine and discuss the remaining questions and future challenges.
Affiliation(s)
- Serena Chan
- Department of Microbiology and Molecular Genetics, University of California, Irvine, CA, USA
|
35
|
The transcriptome of the human pathogen Trypanosoma brucei at single-nucleotide resolution. PLoS Pathog 2010; 6:e1001090. [PMID: 20838601 PMCID: PMC2936537 DOI: 10.1371/journal.ppat.1001090] [Citation(s) in RCA: 225] [Impact Index Per Article: 16.1] [Received: 04/27/2010] [Accepted: 08/06/2010] [Indexed: 12/30/2022] Open
Abstract
The genome of Trypanosoma brucei, the causative agent of African trypanosomiasis, was published five years ago, yet identification of all genes and their transcripts remains to be accomplished. Annotation is challenged by the organization of genes transcribed by RNA polymerase II (Pol II) into long unidirectional gene clusters with no knowledge of how transcription is initiated. Here we report a single-nucleotide resolution genomic map of the T. brucei transcriptome, adding 1,114 new transcripts, including 103 non-coding RNAs, confirming and correcting many of the annotated features and revealing an extensive heterogeneity of 5′ and 3′ ends. Some of the new transcripts encode polypeptides that are either conserved in T. cruzi and Leishmania major or were previously detected in mass spectrometry analyses. High-throughput RNA sequencing (RNA-Seq) was sensitive enough to detect transcripts at putative Pol II transcription initiation sites. Our results, as well as recent data from the literature, indicate that transcription initiation is not solely restricted to regions at the beginning of gene clusters, but may occur at internal sites. We also provide evidence that transcription at all putative initiation sites in T. brucei is bidirectional, a recently recognized fundamental property of eukaryotic promoters. Our results have implications for gene expression patterns in other important human pathogens with similar genome organization (Trypanosoma cruzi, Leishmania sp.) and revealed heterogeneity in pre-mRNA processing that could potentially contribute to the survival and success of the parasite population in the insect vector and the mammalian host. Identifying genes essential for survival in the host is fundamental to unraveling the biology of human pathogens and understanding mechanisms of pathogenesis. 
The protozoan parasite Trypanosoma brucei causes devastating diseases in humans and animals in sub-Saharan Africa, and the publication in 2005 of the genome sequence provided the first glance at the coding potential of this organism. Although at present there is a catalogue of predicted protein coding genes, the challenge remains to identify all authentic genes, including their boundaries. We used next generation RNA sequencing (RNA-Seq) to map transcribed regions and RNA polymerase II transcription initiation sites on a genome-wide scale. This approach allowed us to improve and correct the current annotation, to reveal a widespread heterogeneity of RNA processing sites (trans-splicing and polyadenylation) and to estimate that most genes are expressed at levels corresponding to 1 to 10 mRNAs per cell. Our data indicate that different transcript forms representing the same gene are present stochastically within the mRNA population. This unanticipated scenario may contribute to determining gene expression landscapes to adapt to different environments in the parasite life cycle.
|
36
|
Transcriptional and structural analyses of Amsacta moorei entomopoxvirus protein kinase gene (AMV197, pk). Ann Microbiol 2010. [DOI: 10.1007/s13213-010-0082-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 10/19/2022] Open
|
37
|
Zaretzki RL, Gilchrist MA, Briggs WM, Armagan A. Bias correction and Bayesian analysis of aggregate counts in SAGE libraries. BMC Bioinformatics 2010; 11:72. [PMID: 20128916 PMCID: PMC2829012 DOI: 10.1186/1471-2105-11-72] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 07/09/2009] [Accepted: 02/03/2010] [Indexed: 12/02/2022] Open
Abstract
Background Tag-based techniques, such as SAGE, are commonly used to sample the mRNA pool of an organism's transcriptome. Incomplete digestion during the tag formation process may allow for multiple tags to be generated from a given mRNA transcript. The probability of forming a tag varies with its relative location. As a result, the observed tag counts represent a biased sample of the actual transcript pool. In SAGE this bias can be avoided by ignoring all but the 3' most tag but will discard a large fraction of the observed data. Taking this bias into account should allow more of the available data to be used leading to increased statistical power. Results Three new hierarchical models, which directly embed a model for the variation in tag formation probability, are proposed and their associated Bayesian inference algorithms are developed. These models may be applied to libraries at both the tag and aggregate level. Simulation experiments and analysis of real data are used to contrast the accuracy of the various methods. The consequences of tag formation bias are discussed in the context of testing differential expression. A description is given as to how these algorithms can be applied in that context. Conclusions Several Bayesian inference algorithms that account for tag formation effects are compared with the DPB algorithm providing clear evidence of superior performance. The accuracy of inferences when using a particular non-informative prior is found to depend on the expression level of a given gene. The multivariate nature of the approach easily allows both univariate and joint tests of differential expression. Calculations demonstrate the potential for false positive and negative findings due to variation in tag formation probabilities across samples when testing for differential expression.
Affiliation(s)
- Russell L Zaretzki
- Department of Statistics, Operations, and Management Science, The University of Tennessee, 331 Stokely Management Center, Knoxville, TN 37996, USA.
|
38
|
Wang P, Yu P, Gao P, Shi T, Ma D. Discovery of novel human transcript variants by analysis of intronic single-block EST with polyadenylation site. BMC Genomics 2009; 10:518. [PMID: 19906316 PMCID: PMC2784480 DOI: 10.1186/1471-2164-10-518] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 05/18/2009] [Accepted: 11/12/2009] [Indexed: 01/24/2023] Open
Abstract
BACKGROUND Alternative polyadenylation sites within a gene can lead to alternative transcript variants. Although bioinformatic analysis has been conducted to detect polyadenylation sites using nucleic acid sequences (EST/mRNA) in the public databases, one special type, single-block EST is much less emphasized. This bias leaves a large space to discover novel transcript variants. RESULTS In the present study, we identified novel transcript variants in the human genome by detecting intronic polyadenylation sites. Poly(A/T)-tailed ESTs were obtained from single-block ESTs and clustered into 10,844 groups standing for 5,670 genes. Most sites were not found in other alternative splicing databases. To verify that these sites are from expressed transcripts, we analyzed the supporting EST number of each site, blasted representative ESTs against known mRNA sequences, traced terminal sequences from cDNA clones, and compared with the data of Affymetrix tiling array. These analyses confirmed about 84% (9,118/10,844) of the novel alternative transcripts, especially, 33% (3,575/10,844) of the transcripts from 2,704 genes were taken as high-reliability. Additionally, RT-PCR confirmed 38% (10/26) of predicted novel transcript variants. CONCLUSION Our results provide evidence for novel transcript variants with intronic poly(A) sites. The expression of these novel variants was confirmed with computational and experimental tools. Our data provide a genome-wide resource for identification of novel human transcript variants with intronic polyadenylation sites, and offer a new view into the mystery of the human transcriptome.
Affiliation(s)
- Pingzhang Wang
- Chinese National Human Genome Center, #3-707 North YongChang Road BDA, Beijing, PR China.
|
39
|
't Hoen PAC, Ariyurek Y, Thygesen HH, Vreugdenhil E, Vossen RHAM, de Menezes RX, Boer JM, van Ommen GJB, den Dunnen JT. Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms. Nucleic Acids Res 2008; 36:e141. [PMID: 18927111 PMCID: PMC2588528 DOI: 10.1093/nar/gkn705] [Citation(s) in RCA: 560] [Impact Index Per Article: 35.0] [Indexed: 12/14/2022] Open
Abstract
The hippocampal expression profiles of wild-type mice and mice transgenic for δC-doublecortin-like kinase were compared with Solexa/Illumina deep sequencing technology and five different microarray platforms. With Illumina's digital gene expression assay, we obtained ∼2.4 million sequence tags per sample, their abundance spanning four orders of magnitude. Results were highly reproducible, even across laboratories. With a dedicated Bayesian model, we found differential expression of 3179 transcripts with an estimated false-discovery rate of 8.5%. This is a much higher figure than found for microarrays. The overlap in differentially expressed transcripts found with deep sequencing and microarrays was most significant for Affymetrix. The changes in expression observed by deep sequencing were larger than observed by microarrays or quantitative PCR. Relevant processes such as calmodulin-dependent protein kinase activity and vesicle transport along microtubules were found affected by deep sequencing but not by microarrays. While undetectable by microarrays, antisense transcription was found for 51% of all genes and alternative polyadenylation for 47%. We conclude that deep sequencing provides a major advance in robustness, comparability and richness of expression profiling data and is expected to boost collaborative, comparative and integrative genomics studies.
Affiliation(s)
- Peter A C 't Hoen
- The Center for Human and Clinical Genetics, Leiden University Medical Center, Leiden, The Netherlands.
|
40
|
Characterization of a nonclassical class I MHC gene in a reptile, the Galápagos marine iguana (Amblyrhynchus cristatus). PLoS One 2008; 3:e2859. [PMID: 18682845 PMCID: PMC2483932 DOI: 10.1371/journal.pone.0002859] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.2] [Received: 01/18/2008] [Accepted: 06/24/2008] [Indexed: 11/19/2022] Open
Abstract
Squamates are a diverse order of vertebrates, representing more than 7,000 species. Yet, descriptions of full-length major histocompatibility complex (MHC) genes in this group are nearly absent from the literature, while the number of MHC studies continues to rise in other vertebrate taxa. The lack of basic information about MHC organization in squamates inhibits investigation into the relationship between MHC polymorphism and disease, and leaves a large taxonomic gap in our understanding of amniote MHC evolution. Here, we use both cDNA and genomic sequence data to characterize a class I MHC gene (Amcr-UA) from the Galápagos marine iguana, a member of the squamate subfamily Iguaninae. Amcr-UA appears to be functional since it is expressed in the blood and contains many of the conserved peptide-binding residues that are found in classical class I genes of other vertebrates. In addition, comparison of Amcr-UA to homologous sequences from other iguanine species shows that the antigen-binding portion of this gene is under purifying selection, rather than balancing selection, and therefore may have a conserved function. A striking feature of Amcr-UA is that both the cDNA and genomic sequences lack the transmembrane and cytoplasmic domains that are necessary to anchor the class I receptor molecule into the cell membrane, suggesting that the product of this gene is secreted and consequently not involved in classical class I antigen-presentation. The truncated and conserved character of Amcr-UA lead us to define it as a nonclassical gene that is related to the few available squamate class I sequences. However, phylogenetic analysis placed Amcr-UA in a basal position relative to other published classical MHC genes from squamates, suggesting that this gene diverged near the beginning of squamate diversification.
|
41
|
Carpentier SC, Coemans B, Podevin N, Laukens K, Witters E, Matsumura H, Terauchi R, Swennen R, Panis B. Functional genomics in a non-model crop: transcriptomics or proteomics? Physiol Plant 2008; 133:117-30. [PMID: 18312499 DOI: 10.1111/j.1399-3054.2008.01069.x] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.7] [Indexed: 05/25/2023]
Abstract
There is no question that protein- and RNA-based measurements are complementary, but which approach has the highest return in the case of a non-model crop and what is the correlation between mRNA and proteins? We describe and evaluate in detail the advantages and pitfalls of both a proteomics and a transcriptomics approach. The information on the abundance of transcripts was obtained by serial analysis of gene expression (SAGE), while information on the abundance of proteins was obtained via two-dimensional gel electrophoresis.
|
42
|
Zhu J, He F, Wang J, Yu J. Modeling transcriptome based on transcript-sampling data. PLoS One 2008; 3:e1659. [PMID: 18286206 PMCID: PMC2243018 DOI: 10.1371/journal.pone.0001659] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 12/10/2007] [Accepted: 01/21/2008] [Indexed: 01/10/2023] Open
Abstract
Background Newly-evolved multiplex sequencing technology has been bringing transcriptome sequencing into an unprecedented depth. Millions of transcript tags now can be acquired in a single experiment through parallelization. The significant increase in throughput and reduction in cost required us to address some fundamental questions, such as how many transcript tags do we have to sequence for a given transcriptome? How could we estimate the total number of unique transcripts for different cell types (transcriptome diversity) and the distribution of their copy numbers (transcriptome dynamics)? What is the probability that a transcript with a given expression level to be detected at a certain sampling depth? Methodology/Principal Findings We developed a statistical model to evaluate these parameters based on transcriptome-sampling data. Three mixture models were exploited for their potentials to model the sampling frequencies. We demonstrated that relative abundances of all transcripts in a transcriptome follow the generalized inverse Gaussian distribution. The widely known beta and gamma distributions failed to fulfill the singular characteristics of relative abundance distribution, i.e., highly skewed toward zero and with a long tail. An estimator of transcriptome diversity and an analytical form of sampling growth curve were proposed in a coherent framework. Experimental data fitted this model very well and Monte Carlo simulations based on this model replicated sampling experiments in a remarkable precision. Conclusions Taking human embryonic stem cell as a prototype, we demonstrated that sequencing tens of thousands of transcript tags in an ordinary EST/SAGE experiment was far from sufficient. In order to fully characterize a human transcriptome, millions of transcript tags had to be sequenced. This model lays a statistical basis for transcriptome-sampling experiments and in essence can be used in all sampling-based data.
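The last Background question (how likely a transcript of a given expression level is to be detected at a given sampling depth) can be illustrated with a much simpler calculation than the mixture model the authors develop. The sketch below is not their generalized inverse Gaussian model. It assumes only that each sequenced tag independently originates from the transcript with probability equal to its relative abundance p, so the chance of seeing it at least once in N tags is 1 - (1 - p)^N. The function names are hypothetical and chosen for illustration.

```python
import math


def detection_probability(relative_abundance: float, n_tags: int) -> float:
    """P(transcript seen at least once) = 1 - (1 - p)**N.

    Assumption (not from the cited paper): tags are sampled independently,
    each hitting the transcript with probability `relative_abundance`.
    """
    if not 0.0 <= relative_abundance <= 1.0:
        raise ValueError("relative_abundance must lie in [0, 1]")
    if n_tags < 0:
        raise ValueError("n_tags must be non-negative")
    if relative_abundance == 1.0:
        return 1.0 if n_tags > 0 else 0.0
    # log1p/expm1 keep precision when p is tiny (rare transcripts).
    return -math.expm1(n_tags * math.log1p(-relative_abundance))


def tags_needed(relative_abundance: float, target: float = 0.95) -> int:
    """Smallest N such that the detection probability reaches `target`."""
    if not 0.0 < relative_abundance < 1.0:
        raise ValueError("relative_abundance must lie strictly in (0, 1)")
    if not 0.0 < target < 1.0:
        raise ValueError("target must lie strictly in (0, 1)")
    return math.ceil(math.log1p(-target) / math.log1p(-relative_abundance))


if __name__ == "__main__":
    p = 1e-6  # hypothetical transcript making up one millionth of the mRNA pool
    for depth in (10_000, 100_000, 1_000_000, 10_000_000):
        print(depth, round(detection_probability(p, depth), 4))
    print("tags for 95% detection:", tags_needed(p, 0.95))
```

Even under this simplified assumption, a transcript at one part per million needs on the order of millions of tags to be detected reliably, which is in line with the abstract's conclusion that tens of thousands of tags are far from sufficient.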
Affiliation(s)
- Jiang Zhu
- Chinese Academy of Sciences (CAS) Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Graduate University of Chinese Academy of Sciences, Beijing, China
- Fuhong He
- Chinese Academy of Sciences (CAS) Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- Graduate University of Chinese Academy of Sciences, Beijing, China
- Jing Wang
- Chinese Academy of Sciences (CAS) Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- * To whom correspondence should be addressed (JW, JY).
- Jun Yu
- Chinese Academy of Sciences (CAS) Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China
- * To whom correspondence should be addressed (JW, JY).
|
43
|
Lee JY, Park JY, Tian B. Identification of mRNA polyadenylation sites in genomes using cDNA sequences, expressed sequence tags, and Trace. Methods Mol Biol 2008; 419:23-37. [PMID: 18369973 DOI: 10.1007/978-1-59745-033-1_2] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 01/14/2023]
Abstract
Polyadenylation of nascent transcripts is an essential step for most mRNAs in eukaryotic cells. It is directly involved in the termination of transcription and is coupled with other steps of pre-mRNA processing. Recent studies have shown that transcript variants resulting from alternative polyadenylation are widespread for human and mouse genes, contributing to the complexity of mRNA pool in the cell. In addition to 3'-most exons, alternative polyadenylation sites (or poly(A) sites) can be located in internal exons and introns. Identification of poly(A) sites in genomes is critical for understanding the occurrence and significance of alternative polyadenylation events. Bioinformatic methods using cDNA sequences, Expressed Sequence Tags (ESTs), and Trace offer a sensitive and systematic approach to detect poly(A) sites in genomes. Various criteria can be employed to enhance the specificity of the detection, including identifying sequences derived from internal priming of mRNA and polyadenylated RNAs during degradation.
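The detection criteria mentioned in this abstract can be sketched roughly as follows. This is a minimal illustration, not the chapter's actual pipeline: it assumes a cDNA/EST read oriented 5' to 3', treats a terminal run of A's as the candidate poly(A) tail, and flags possible internal priming when the genomic sequence immediately downstream of the aligned end is itself A-rich. All function names and thresholds (`min_tail`, `window`, `min_a`) are hypothetical.

```python
def trailing_a_run(seq: str) -> int:
    """Length of the uninterrupted run of A's at the 3' end of a read."""
    seq = seq.upper()
    return len(seq) - len(seq.rstrip("A"))


def looks_internally_primed(genomic_downstream: str,
                            window: int = 10,
                            min_a: int = 6) -> bool:
    """True if the genome just downstream of the candidate site is A-rich.

    An A-rich genomic stretch could have annealed to oligo(dT) during
    cDNA synthesis, so the apparent tail may not be a real poly(A) tail.
    Thresholds are illustrative assumptions only.
    """
    region = genomic_downstream.upper()[:window]
    return region.count("A") >= min_a


def call_poly_a_site(read: str, genomic_downstream: str,
                     min_tail: int = 8) -> str:
    """Classify a read end as a poly(A) site, internal priming, or neither."""
    if trailing_a_run(read) < min_tail:
        return "no_tail"
    if looks_internally_primed(genomic_downstream):
        return "internal_priming_candidate"
    return "poly_a_site"


if __name__ == "__main__":
    read = "GCTTAATAAAGTCTGCATTTCA" + "A" * 12
    print(call_poly_a_site(read, "GTGTCTGTGTTTCC"))  # genomic context not A-rich
    print(call_poly_a_site(read, "AAAAAAAGAAAACC"))  # genomic context A-rich
```

A production analysis would work from genome alignments rather than raw strings and would also handle reverse-complemented reads with leading poly(T); those steps are omitted here for brevity.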
Affiliation(s)
- Ju Youn Lee
- Department of Biochemistry and Molecular Biology, New Jersey Medical School, University of Medicine and Dentistry of New Jersey, Newark, NJ, USA
|
44
|
Abstract
In recent years, genome-wide detection of alternative splicing based on Expressed Sequence Tag (EST) sequence alignments with mRNA and genomic sequences has dramatically expanded our understanding of the role of alternative splicing in functional regulation. This chapter reviews the data, methodology, and technical challenges of these genome-wide analyses of alternative splicing, and briefly surveys some of the uses to which such alternative splicing databases have been put. For example, with proper alternative splicing database schema design, it is possible to query genome-wide for alternative splicing patterns that are specific to particular tissues, disease states (e.g., cancer), gender, or developmental stages. EST alignments can be used to estimate exon inclusion or exclusion level of alternatively spliced exons and evolutionary changes for various species can be inferred from exon inclusion level. Such databases can also help automate design of probes for RT-PCR and microarrays, enabling high throughput experimental measurement of alternative splicing.
|
45
|
Liu F, Jenssen TK, Trimarchi J, Punzo C, Cepko CL, Ohno-Machado L, Hovig E, Patrick Kuo W. Comparison of hybridization-based and sequencing-based gene expression technologies on biological replicates. BMC Genomics 2007; 8:153. [PMID: 17555589 PMCID: PMC1899500 DOI: 10.1186/1471-2164-8-153] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Received: 11/23/2006] [Accepted: 06/07/2007] [Indexed: 02/06/2023] Open
Abstract
Background High-throughput systems for gene expression profiling have been developed and have matured rapidly through the past decade. Broadly, these can be divided into two categories: hybridization-based and sequencing-based approaches. With data from different technologies being accumulated, concerns and challenges are raised about the level of agreement across technologies. As part of an ongoing large-scale cross-platform data comparison framework, we report here a comparison based on identical samples between one-dye DNA microarray platforms and MPSS (Massively Parallel Signature Sequencing). Results The DNA microarray platforms generally provided highly correlated data, while moderate correlations between microarrays and MPSS were obtained. Disagreements between the two types of technologies can be attributed to limitations inherent to both technologies. The variation found between pooled biological replicates underlines the importance of exercising caution in identification of differential expression, especially for the purposes of biomarker discovery. Conclusion Based on different principles, hybridization-based and sequencing-based technologies should be considered complementary to each other, rather than competitive alternatives for measuring gene expression, and currently, both are important tools for transcriptome profiling.
Affiliation(s)
- Fang Liu
- Department of Tumor Biology, Rikshospitalet-Radiumhospitalet Medical Center, Montebello, NO-0310 Oslo, Norway
- PubGene AS, Vinderen, NO-0319 Oslo, Norway
- Jeff Trimarchi
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Claudio Punzo
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Connie L Cepko
- Department of Genetics, Harvard Medical School, Boston, MA, USA
- Howard Hughes Medical Institute, Harvard Medical School, Boston, MA, USA
- Eivind Hovig
- Department of Tumor Biology, Rikshospitalet-Radiumhospitalet Medical Center, Montebello, NO-0310 Oslo, Norway
- Department of Medical Informatics, Rikshospitalet-Radiumhospitalet Medical Center, Montebello, NO-0310 Oslo, Norway
- Winston Patrick Kuo
- Decision Systems Group, Brigham and Women's Hospital, Boston, MA, USA
- Department of Developmental Biology, Harvard School of Dental Medicine, Boston, MA, USA
- Department of Organismic and Evolutionary Biology/Faculty of Arts and Sciences, Harvard University, Cambridge, MA, USA
|
46
|
Gain and loss of polyadenylation signals during evolution of green algae. BMC Evol Biol 2007; 7:65. [PMID: 17442103 PMCID: PMC1868727 DOI: 10.1186/1471-2148-7-65] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Received: 11/17/2006] [Accepted: 04/18/2007] [Indexed: 11/24/2022] Open
Abstract
Background The Viridiplantae (green algae and land plants) consist of two monophyletic lineages: the Chlorophyta and the Streptophyta. Most green algae belong to the Chlorophyta, while the Streptophyta include all land plants and a small group of freshwater algae known as Charophyceae. Eukaryotes attach a poly-A tail to the 3' ends of most nuclear-encoded mRNAs. In embryophytes, animals and fungi, the signal for polyadenylation contains an A-rich sequence (often AAUAAA or related sequence) 13 to 30 nucleotides upstream from the cleavage site, which is commonly referred to as the near upstream element (NUE). However, it has been reported that the pentanucleotide UGUAA is used as polyadenylation signal for some genes in volvocalean algae. Results We set out to investigate polyadenylation signal differences between streptophytes and chlorophytes that may have emerged shortly after the evolutionary split between Streptophyta and Chlorophyta. We therefore analyzed expressed genes (ESTs) from three streptophyte algae, Mesostigma viride, Klebsormidium subtile and Coleochaete scutata, and from two early-branching chlorophytes, Pyramimonas parkeae and Scherffelia dubia. In addition, to extend the database, our analyses included ESTs from six other chlorophytes (Acetabularia acetabulum, Chlamydomonas reinhardtii, Helicosporidium sp. ex Simulium jonesii, Prototheca wickerhamii, Scenedesmus obliquus and Ulva linza) and one streptophyte (Closterium peracerosum). Our results indicate that polyadenylation signals in green algae vary widely. The UGUAA motif is confined to late-branching Chlorophyta. Most streptophyte algae do not have an A-rich sequence motif like that in embryophytes, animals and fungi. We observed polyadenylation signals similar to those of Arabidopsis and other land plants only in Mesostigma. Conclusion Polyadenylation signals in green algae show considerable variation. 
A new NUE (UGUAA) was invented in derived chlorophytes and replaced not only the A-rich NUE but the complete poly(A) signal in all chlorophytes investigated except Scherffelia (only NUE replaced) and Pyramimonas (UGUAA completely missing). The UGUAA element is completely absent from streptophytes. However, the structure of the poly(A) signal was often modified in streptophyte algae. In most species investigated, an A-rich NUE is missing; instead, these species seem to rely mainly on U-rich elements.
|
47
|
Moucadel V, Lopez F, Ara T, Benech P, Gautheret D. Beyond the 3' end: experimental validation of extended transcript isoforms. Nucleic Acids Res 2007; 35:1947-57. [PMID: 17339231 PMCID: PMC1874610 DOI: 10.1093/nar/gkm062] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.1] [Indexed: 11/12/2022] Open
Abstract
High throughput EST and full-length cDNA sequencing have revealed extensive variations at the 3' ends of mammalian transcripts. Whether all of these changes are biologically meaningful has been the subject of controversy, as such results may reflect in part transcription or polyadenylation leakage. We selected here a set of tandem poly(A) sites predicted from EST/cDNA sequence analysis that (i) are conserved between human and mouse, (ii) produce alternative 3' isoforms with unusual size features and (iii) are not documented in current genome databases, and we submitted these sites to experimental validation in mouse tissues. Out of 86 tested poly(A) sites from 44 genes, 84 were individually confirmed using a specially devised RT-PCR strategy. We then focused on validating the exon structure between distant tandem poly(A) sites separated by over 3 kb, and between stop codons and alternative poly(A) sites located at 4.5 kb or more, using a long-distance RT-PCR strategy. In most cases, long transcripts spanning the whole poly(A)-poly(A) or stop-poly(A) distance were detected, confirming that tandem sites were part of the same transcription unit. Given the apparent conservation of these long alternative 3' ends, different regulatory functions can be foreseen, depending on the location where transcription starts.
Affiliation(s)
- Daniel Gautheret
- *To whom correspondence should be addressed. 33 (0)1 69 15 46 32; 33 (0)1 69 15 46 29
|
48
|
Gilat R, Shweiki D. A novel function for alternative polyadenylation as a rescue pathway from NMD surveillance. Biochem Biophys Res Commun 2007; 353:487-92. [PMID: 17188645 DOI: 10.1016/j.bbrc.2006.12.052] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 12/05/2006] [Accepted: 12/07/2006] [Indexed: 10/23/2022]
Abstract
Transcripts containing a premature termination codon (PTC) are subjected to rapid degradation via the nonsense-mediated decay (NMD) surveillance mechanism. By and large, degradation is desired in order to prevent the translation of truncated, most likely deleterious, proteins. Nevertheless, several dissimilar NMD-rescue events, capable of turning NMD candidates into NMD-immune transcripts, have been described. Yet the extent and nature of this phenomenon are unknown. We screened the human genome for NMD-candidate transcripts. Among these, we sub-grouped "pseudo-NMD" genes, all of whose annotated transcripts contain PTCs and which are therefore allegedly transcribed but never translated. Here we show that alternative polyadenylation can rescue prematurely terminated transcripts by truncating the pre-mRNA so that the PTC is now "legally" positioned. EST-based analysis shows that NMD-rescued genes are indeed expressed in human tissues. Furthermore, the existence of the predicted NMD-rescue variants is computationally verified. Hence, we suggest a novel role for the exon-truncated class of alternative polyadenylation as an NMD-rescue regulatory mechanism.
Affiliation(s)
- Roi Gilat
- Bioinformatics Program, School of Computer Science, The Academic College of Tel Aviv-Yaffo, 4 Antokolsky St., Tel-Aviv 64044, Israel
|
49
|
Gilat R, Goncharov S, Esterman N, Shweiki D. Under-representation of PolyA/PolyT tailed ESTs in human ESTdb: an obstacle to alternative polyadenylation inference. Bioinformation 2006; 1:220-4. [PMID: 17597892 PMCID: PMC1891686] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/05/2006] [Revised: 10/02/2006] [Accepted: 10/02/2006] [Indexed: 11/21/2022] Open
Abstract
Alternative polyadenylation is a key regulatory process which affects the 3' end formation of variants of the same transcription unit, thus altering the gene expression pattern and the transcripts' cellular behaviour and characteristics. The common methodology for computational analysis of alternative polyadenylation signal utilization is based on EST data, specifically on PolyA/PolyT tailed ESTs. Studying the human EST dataset, we detected a significant underrepresentation of PolyA/PolyT tailed ESTs, which constitute only 10% of most libraries. Consequently, more than 50% of false-negative events are revealed in the analysis of alternatively polyadenylated variants' expression. We therefore argue that the ratios of PolyA/PolyT tailed ESTs, as represented in the human EST database, do not reflect the true picture of 3' end variant formation in a given physiological situation. Thus, the EST database should not be considered a reliable source for inferring alternative polyadenylation signal usage.
Affiliation(s)
- Dorit Shweiki
- Corresponding author; Phone: +972 3 5211853; Fax: +972 3 5211871
|
50
|
Majerciak V, Yamanegi K, Zheng ZM. Gene structure and expression of Kaposi's sarcoma-associated herpesvirus ORF56, ORF57, ORF58, and ORF59. J Virol 2006; 80:11968-81. [PMID: 17020939 PMCID: PMC1676266 DOI: 10.1128/jvi.01394-06] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.9] [Indexed: 11/20/2022] Open
Abstract
Though similar to those of herpesvirus saimiri and Epstein-Barr virus (EBV), the Kaposi's sarcoma-associated herpesvirus (KSHV) genome features more splice genes and encodes many genes with bicistronic or polycistronic transcripts. In the present study, the gene structure and expression of KSHV ORF56 (primase), ORF57 (MTA), ORF58 (EBV BMRF2 homologue), and ORF59 (DNA polymerase processivity factor) were analyzed in butyrate-activated KSHV(+) JSC-1 cells. ORF56 was expressed at low abundance as a bicistronic ORF56/57 transcript that utilized the same intron, with two alternative branch points, as ORF57 for its RNA splicing. ORF56 was transcribed from two transcription start sites, nucleotides (nt) 78994 (minor) and 79075 (major), but selected the same poly(A) signal as ORF57 for RNA polyadenylation. The majority of ORF56 and ORF57 transcripts were cleaved at nt 83628, although other nearby cleavage sites were selectable. On the opposite strand of the viral genome, colinear ORF58 and ORF59 were transcribed from different transcription start sites, nt 95821 (major) or 95824 (minor) for ORF58 and nt 96790 (minor) or 96794 (major) for ORF59, but shared overlapping poly(A) signals at nt 94492 and 94488. Two cleavage sites, at nt 94477 and nt 94469, could be equally selected for ORF59 polyadenylation, but only the cleavage site at nt 94469 could be selected for ORF58 polyadenylation without disrupting the ORF58 stop codon immediately upstream. ORF58 was expressed in low abundance as a monocistronic transcript, with a long 5' untranslated region (UTR) but a short 3' UTR, whereas ORF59 was expressed in high abundance as a bicistronic transcript, with a short 5' UTR and a long 3' UTR similar to those of polycistronic ORF60 and ORF62. Both ORF56 and ORF59 are targets of ORF57 and were up-regulated significantly in the presence of ORF57, a posttranscriptional regulator.
Affiliation(s)
- Vladimir Majerciak
- HIV and AIDS Malignancy Branch, Center for Cancer Research, NCI/NIH, 10 Center Dr., Rm. 10 S255, MSC-1868, Bethesda, MD 20892-1868, USA
|