1
|
Stroup EK, Ji Z. Delineating yeast cleavage and polyadenylation signals using deep learning. Genome Res 2024; 34:1066-1080. [PMID: 38914436 PMCID: PMC11368178 DOI: 10.1101/gr.278606.123] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/10/2023] [Accepted: 06/17/2024] [Indexed: 06/26/2024]
Abstract
3'-end cleavage and polyadenylation is an essential process for eukaryotic mRNA maturation. In yeast species, the polyadenylation signals that recruit the processing machinery are degenerate and remain poorly characterized compared with the well-defined regulatory elements in mammals. Here we address this issue by developing deep learning models to deconvolute degenerate cis-regulatory elements and quantify their positional importance in mediating yeast poly(A) site formation, cleavage heterogeneity, and strength. In S. cerevisiae, cleavage heterogeneity is promoted by the depletion of U-rich elements around poly(A) sites as well as multiple occurrences of upstream UA-rich elements. Sites with high cleavage heterogeneity show overall lower strength. The site strength and tandem site distances modulate alternative polyadenylation (APA) under the diauxic stress. Finally, we develop a deep learning model to reveal the distinct motif configuration of S. pombe poly(A) sites, which show more precise cleavage than S. cerevisiae. Altogether, our deep learning models provide unprecedented insights into poly(A) site formation of yeast species, and our results highlight divergent poly(A) signals across distantly related species.
Affiliation(s)
- Emily Kunce Stroup
- Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois 60611, USA
| | - Zhe Ji
- Department of Pharmacology, Feinberg School of Medicine, Northwestern University, Chicago, Illinois 60611, USA;
- Department of Biomedical Engineering, McCormick School of Engineering, Northwestern University, Evanston, Illinois 60628, USA
| |
|
2
|
Shao Z, Hu J, Jandura A, Wilk R, Jachimowicz M, Ma L, Hu C, Sundquist A, Das I, Samuel-Larbi P, Brill JA, Krause HM. Spatially revealed roles for lncRNAs in Drosophila spermatogenesis, Y chromosome function and evolution. Nat Commun 2024; 15:3806. [PMID: 38714658 PMCID: PMC11076287 DOI: 10.1038/s41467-024-47346-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/26/2023] [Accepted: 03/25/2024] [Indexed: 05/10/2024] Open
Abstract
Unlike coding genes, the number of lncRNA genes in organism genomes is relatively proportional to organism complexity. From plants to humans, the tissues with highest numbers and levels of lncRNA gene expression are the male reproductive organs. To learn why, we initiated a genome-wide analysis of Drosophila lncRNA spatial expression patterns in these tissues. The numbers of genes and levels of expression observed greatly exceed those previously reported, due largely to a preponderance of non-polyadenylated transcripts. In stark contrast to coding genes, the highest numbers of lncRNAs expressed are in post-meiotic spermatids. Correlations between expression levels, localization and previously performed genetic analyses indicate high levels of function and requirement. More focused analyses indicate that lncRNAs play major roles in evolution by controlling transposable element activities, Y chromosome gene expression and sperm construction. A new type of lncRNA-based particle found in seminal fluid may also contribute to reproductive outcomes.
Affiliation(s)
- Zhantao Shao
- Donnelly Ctr., 160 College St., University of Toronto, Toronto, ON, Canada
| | - Jack Hu
- Donnelly Ctr., 160 College St., University of Toronto, Toronto, ON, Canada
| | - Allison Jandura
- Donnelly Ctr., 160 College St., University of Toronto, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Cell Biology Program, The Hospital for Sick Children, Toronto, ON, Canada
| | - Ronit Wilk
- Donnelly Ctr., 160 College St., University of Toronto, Toronto, ON, Canada
| | - Matthew Jachimowicz
- Donnelly Ctr., 160 College St., University of Toronto, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Cell Biology Program, The Hospital for Sick Children, Toronto, ON, Canada
| | - Lingfeng Ma
- Donnelly Ctr., 160 College St., University of Toronto, Toronto, ON, Canada
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Cell Biology Program, The Hospital for Sick Children, Toronto, ON, Canada
| | - Chun Hu
- Donnelly Ctr., 160 College St., University of Toronto, Toronto, ON, Canada
| | - Abby Sundquist
- Donnelly Ctr., 160 College St., University of Toronto, Toronto, ON, Canada
| | - Indrani Das
- Donnelly Ctr., 160 College St., University of Toronto, Toronto, ON, Canada
| | - Julie A Brill
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada.
- Cell Biology Program, The Hospital for Sick Children, Toronto, ON, Canada.
| | - Henry M Krause
- Donnelly Ctr., 160 College St., University of Toronto, Toronto, ON, Canada.
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada.
| |
|
3
|
Stroup EK, Ji Z. Delineating yeast cleavage and polyadenylation signals using deep learning. bioRxiv 2023:2023.10.10.561764. [PMID: 37873420 PMCID: PMC10592759 DOI: 10.1101/2023.10.10.561764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
3'-end cleavage and polyadenylation is an essential process for eukaryotic mRNA maturation. In yeast species, the polyadenylation signals that recruit the processing machinery are degenerate and remain poorly characterized compared to well-defined regulatory elements in mammals. In particular, recent deep sequencing experiments showed extensive cleavage heterogeneity for some mRNAs in Saccharomyces cerevisiae and uncovered polyA motif differences between S. cerevisiae and Schizosaccharomyces pombe. The findings raised the fundamental question of how polyadenylation signals are formed in yeast. Here we addressed this question by developing deep learning models to deconvolute degenerate cis-regulatory elements and quantify their positional importance in mediating yeast polyA site formation, cleavage heterogeneity, and strength. In S. cerevisiae, cleavage heterogeneity is promoted by the depletion of U-rich elements around polyA sites as well as multiple occurrences of upstream UA-rich elements. Sites with high cleavage heterogeneity show overall lower strength. The site strength and tandem site distances modulate alternative polyadenylation (APA) under the diauxic stress. Finally, we developed a deep learning model to reveal the distinct motif configuration of S. pombe polyA sites, which show more precise cleavage than S. cerevisiae. Altogether, our deep learning models provide unprecedented insights into polyA site formation across yeast species.
|
4
|
Manley BF, Lotharukpong JS, Barrera-Redondo J, Llewellyn T, Yildirir G, Sperschneider J, Corradi N, Paszkowski U, Miska EA, Dallaire A. A highly contiguous genome assembly reveals sources of genomic novelty in the symbiotic fungus Rhizophagus irregularis. G3 (Bethesda) 2023; 13:jkad077. [PMID: 36999556 PMCID: PMC10234402 DOI: 10.1093/g3journal/jkad077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/15/2022] [Accepted: 03/17/2023] [Indexed: 06/02/2023]
Abstract
The root systems of most plant species are aided by the soil-foraging capacities of symbiotic arbuscular mycorrhizal (AM) fungi of the Glomeromycotina subphylum. Despite recent advances in our knowledge of the ecology and molecular biology of this mutualistic symbiosis, our understanding of AM fungal genome biology is just emerging. Presented here is a near telomere-to-telomere (T2T) genome assembly of the model AM fungus Rhizophagus irregularis DAOM197198, achieved through Nanopore long-read DNA sequencing and Hi-C data. This haploid genome assembly of R. irregularis, alongside short- and long-read RNA-Sequencing data, was used to produce a comprehensive annotation catalog of gene models, repetitive elements, small RNA loci, and DNA cytosine methylome. A phylostratigraphic gene age inference framework revealed that the birth of genes associated with nutrient transporter activity and transmembrane ion transport systems predates the emergence of Glomeromycotina. While nutrient cycling in AM fungi relies on genes that existed in ancestor lineages, a burst of Glomeromycotina-restricted genetic innovation is also detected. Analysis of the chromosomal distribution of genetic and epigenetic features highlights evolutionarily young genomic regions that produce abundant small RNAs, suggesting active RNA-based monitoring of genetic sequences surrounding recently evolved genes. This chromosome-scale view of the genome of an AM fungus reveals previously unexplored sources of genomic novelty in an organism evolving under an obligate symbiotic life cycle.
Affiliation(s)
- Bethan F Manley
- SPUN|Society for the Protection of Underground Networks, 3500 South DuPont Highway, Suite EI-101, Dover, DE 19901, USA
- Gurdon Institute, University of Cambridge, Cambridge CB2 1QN, UK
| | - Jaruwatana S Lotharukpong
- Department of Algal Development and Evolution, Max Planck Institute for Biology, Max-Planck-Ring 5, Tübingen 72076, Germany
| | - Josué Barrera-Redondo
- Department of Algal Development and Evolution, Max Planck Institute for Biology, Max-Planck-Ring 5, Tübingen 72076, Germany
| | - Theo Llewellyn
- Comparative Fungal Biology, Royal Botanic Gardens Kew, Jodrell Laboratory, Richmond TW9 3DS, UK
- Department of Life Sciences, Imperial College London, London SW7 2AZ, UK
| | - Gokalp Yildirir
- Department of Biology, University of Ottawa, Ottawa, ON, Canada K1N 6N5
| | - Jana Sperschneider
- Agriculture and Food, Commonwealth Scientific and Industrial Research Organisation, Canberra, ACT 2601, Australia
| | - Nicolas Corradi
- Department of Biology, University of Ottawa, Ottawa, ON, Canada K1N 6N5
| | - Uta Paszkowski
- Crop Science Centre, Department of Plant Sciences, University of Cambridge, Cambridge CB3 0LE, UK
| | - Eric A Miska
- Gurdon Institute, University of Cambridge, Cambridge CB2 1QN, UK
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QW, UK
| | - Alexandra Dallaire
- Gurdon Institute, University of Cambridge, Cambridge CB2 1QN, UK
- Comparative Fungal Biology, Royal Botanic Gardens Kew, Jodrell Laboratory, Richmond TW9 3DS, UK
- Department of Biochemistry, University of Cambridge, Tennis Court Road, Cambridge CB2 1QW, UK
| |
|
5
|
de Felippes FF, Waterhouse PM. Plant terminators: the unsung heroes of gene expression. J Exp Bot 2023; 74:2239-2250. [PMID: 36477559 PMCID: PMC10082929 DOI: 10.1093/jxb/erac467] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/21/2022] [Accepted: 11/25/2022] [Indexed: 06/06/2023]
Abstract
To be properly expressed, genes need to be accompanied by a terminator, a region downstream of the coding sequence that contains the information necessary for the maturation of the mRNA 3' end. The main event in this process is the addition of a poly(A) tail at the 3' end of the new transcript, a critical step in mRNA biology that has important consequences for the expression of genes. Here, we review the mechanism leading to cleavage and polyadenylation of newly transcribed mRNAs and how this process can affect the final levels of gene expression. We give special attention to an aspect often overlooked, the effect that different terminators can have on the expression of genes. We also discuss some exciting findings connecting the choice of terminator to the biogenesis of small RNAs, which are a central part of one of the most important mechanisms of regulation of gene expression in plants.
Affiliation(s)
| | - Peter M Waterhouse
- Centre for Agriculture and the Bioeconomy, Institute for Future Environments, Queensland University of Technology (QUT), Brisbane, QLD, Australia
- ARC Centre of Excellence for Plant Success in Nature & Agriculture, QUT, Brisbane, QLD, Australia
| |
|
6
|
Hadar S, Meller A, Saida N, Shalgi R. Stress-induced transcriptional readthrough into neighboring genes is linked to intron retention. iScience 2022; 25:105543. [PMID: 36505935 PMCID: PMC9732411 DOI: 10.1016/j.isci.2022.105543] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/16/2022] [Revised: 07/10/2022] [Accepted: 11/07/2022] [Indexed: 11/11/2022] Open
Abstract
Exposure to certain stresses leads to readthrough transcription. Using polyA-selected RNA-seq in mouse fibroblasts subjected to heat shock, oxidative, or osmotic stress, we found that readthrough transcription can proceed into proximal downstream genes, in a phenomenon previously termed "read-in." We found that read-in genes share distinctive genomic characteristics; they are GC-rich and extremely short, with genomic features conserved in human. Using ribosome profiling, we found that read-in genes show significantly reduced translation. Strikingly, read-in genes demonstrate marked intron retention, mostly in their first introns, which could not be explained solely by their short introns and GC-richness, features often associated with intron retention. Finally, we revealed H3K36me3 enrichment upstream of read-in genes. Moreover, demarcation of exon-intron junctions by H3K36me3 was absent in read-in first introns. Our data portray a relationship between read-in and intron retention, suggesting they may have co-evolved to facilitate reduced translation of read-in genes during stress.
Affiliation(s)
- Shani Hadar
- Department of Biochemistry, Rappaport Faculty of Medicine, Technion–Israel Institute of Technology, Haifa 31096, Israel
| | - Anatoly Meller
- Department of Biochemistry, Rappaport Faculty of Medicine, Technion–Israel Institute of Technology, Haifa 31096, Israel
| | - Naseeb Saida
- Department of Biochemistry, Rappaport Faculty of Medicine, Technion–Israel Institute of Technology, Haifa 31096, Israel
| | - Reut Shalgi
- Department of Biochemistry, Rappaport Faculty of Medicine, Technion–Israel Institute of Technology, Haifa 31096, Israel
| |
|
7
|
Schärfen L, Zigackova D, Reimer KA, Stark MR, Slat VA, Francoeur NJ, Wells ML, Zhou L, Blackshear PJ, Neugebauer KM, Rader SD. Identification of Alternative Polyadenylation in Cyanidioschyzon merolae Through Long-Read Sequencing of mRNA. Front Genet 2022; 12:818697. [PMID: 35154260 PMCID: PMC8831791 DOI: 10.3389/fgene.2021.818697] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/19/2021] [Accepted: 12/22/2021] [Indexed: 12/04/2022] Open
Abstract
Alternative polyadenylation (APA) is widespread among metazoans and has been shown to have important impacts on mRNA stability and protein expression. Beyond a handful of well-studied organisms, however, its existence and consequences have not been well investigated. We therefore turned to the deep-branching red alga, Cyanidioschyzon merolae, to study the biology of polyadenylation in an organism highly diverged from humans and yeast. C. merolae is an acidothermophilic alga that lives in volcanic hot springs. It has a highly reduced genome (16.5 Mbp) and has lost all but 27 of its introns and much of its splicing machinery, suggesting that it has been under substantial pressure to simplify its RNA processing pathways. We used long-read sequencing to assess the key features of C. merolae mRNAs, including splicing status and polyadenylation cleavage site (PAS) usage. Splicing appears to be less efficient in C. merolae compared with yeast, flies, and mammalian cells. A high proportion of transcripts (63%) have at least two distinct PASs, and 34% appear to utilize three or more sites. The apparent polyadenylation signal UAAA is used in more than 90% of cases, in cells grown in both rich media and limiting nitrogen. Our documentation of APA for the first time in this non-model organism highlights the conservation and likely biological importance of this regulatory step in gene expression.
Affiliation(s)
- Leonard Schärfen
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States
| | - Dagmar Zigackova
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States
| | - Kirsten A. Reimer
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States
| | - Martha R. Stark
- Department of Chemistry, University of Northern British Columbia, Prince George, BC, Canada
| | - Viktor A. Slat
- Department of Chemistry, University of Northern British Columbia, Prince George, BC, Canada
| | - Nancy J. Francoeur
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, United States
| | - Melissa L. Wells
- The Signal Transduction Laboratory, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, Durham, NC, United States
| | - Lecong Zhou
- Integrative Bioinformatics Support Group, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, Durham, NC, United States
| | - Perry J. Blackshear
- The Signal Transduction Laboratory, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, Durham, NC, United States
| | - Karla M. Neugebauer
- Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States
| | - Stephen D. Rader
- Department of Chemistry, University of Northern British Columbia, Prince George, BC, Canada
| |
|
8
|
Turner RE, Harrison PF, Swaminathan A, Kraupner-Taylor CA, Goldie BJ, See M, Peterson AL, Schittenhelm RB, Powell DR, Creek DJ, Dichtl B, Beilharz TH. Genetic and pharmacological evidence for kinetic competition between alternative poly(A) sites in yeast. eLife 2021; 10:65331. [PMID: 34232857 PMCID: PMC8263057 DOI: 10.7554/elife.65331] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 12/01/2020] [Accepted: 06/22/2021] [Indexed: 01/23/2023] Open
Abstract
Most eukaryotic mRNAs accommodate alternative sites of poly(A) addition in the 3’ untranslated region in order to regulate mRNA function. Here, we present a systematic analysis of 3’ end formation factors, which revealed 3’UTR lengthening in response to a loss of the core machinery, whereas a loss of the Sen1 helicase resulted in shorter 3’UTRs. We show that the anti-cancer drug cordycepin, 3’ deoxyadenosine, caused nucleotide accumulation and the usage of distal poly(A) sites. Mycophenolic acid, a drug which reduces GTP levels and impairs RNA polymerase II (RNAP II) transcription elongation, promoted the usage of proximal sites and reversed the effects of cordycepin on alternative polyadenylation. Moreover, cordycepin-mediated usage of distal sites was associated with a permissive chromatin template and was suppressed in the presence of an rpb1 mutation, which slows RNAP II elongation rate. We propose that alternative polyadenylation is governed by temporal coordination of RNAP II transcription and 3’ end processing and controlled by the availability of 3’ end factors, nucleotide levels and chromatin landscape.
Affiliation(s)
- Rachael Emily Turner
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Australia
| | - Paul F Harrison
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Australia
- Monash Bioinformatics Platform, Monash University, Melbourne, Australia
| | - Angavai Swaminathan
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Australia
| | - Calvin A Kraupner-Taylor
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Australia
| | - Belinda J Goldie
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Australia
| | - Michael See
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Australia
- Monash Bioinformatics Platform, Monash University, Melbourne, Australia
| | - Amanda L Peterson
- Drug Delivery, Disposition and Dynamics, Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, Australia
| | - Ralf B Schittenhelm
- Monash Proteomics & Metabolomics Facility, Department of Biochemistry and Molecular Biology, Monash Biomedicine Discovery Institute, Monash University, Melbourne, Australia
| | - David R Powell
- Monash Bioinformatics Platform, Monash University, Melbourne, Australia
| | - Darren J Creek
- Drug Delivery, Disposition and Dynamics, Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, Australia
| | - Bernhard Dichtl
- School of Life and Environmental Sciences, Deakin University, Geelong, Australia
| | - Traude H Beilharz
- Development and Stem Cells Program, Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Australia
| |
|
9
|
Davis ZH, Mediani L, Antoniani F, Vinet J, Li S, Alberti S, Lu B, Holehouse AS, Carra S, Brandman O. Protein products of nonstop mRNA disrupt nucleolar homeostasis. Cell Stress Chaperones 2021; 26:549-561. [PMID: 33619693 PMCID: PMC8065075 DOI: 10.1007/s12192-021-01200-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 12/04/2020] [Revised: 02/10/2021] [Accepted: 02/15/2021] [Indexed: 12/28/2022] Open
Abstract
Stalled mRNA translation results in the production of incompletely synthesized proteins that are targeted for degradation by ribosome-associated quality control (RQC). Here we investigated the fate of defective proteins translated from stall-inducing, nonstop mRNA that escape ubiquitylation by the RQC protein LTN1. We found that nonstop protein products accumulated in nucleoli and this localization was driven by polylysine tracts produced by translation of the poly(A) tails of nonstop mRNA. Nucleolar sequestration increased the solubility of invading proteins but disrupted nucleoli, altering their dynamics, morphology, and resistance to stress in cell culture and intact flies. Our work elucidates how stalled translation may affect distal cellular processes and may inform studies on the pathology of diseases caused by failures in RQC and characterized by nucleolar stress.
Affiliation(s)
- Zoe H Davis
- Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA
| | - Laura Mediani
- Centre for Neuroscience and Nanotechnology, Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy
| | - Francesco Antoniani
- Centre for Neuroscience and Nanotechnology, Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy
| | - Jonathan Vinet
- Centre for Neuroscience and Nanotechnology, Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy
| | - Shuangxi Li
- Department of Pathology, Stanford University, Stanford, CA, 94305, USA
| | - Simon Alberti
- Biotechnology Center (BIOTEC), Center for Molecular and Cellular Bioengineering (CMCB), Technische Universität Dresden, Tatzberg 47/49, 01307, Dresden, Germany
| | - Bingwei Lu
- Department of Pathology, Stanford University, Stanford, CA, 94305, USA
| | - Alex S Holehouse
- Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, St. Louis, MO, 63110, USA
- Center for Science and Engineering of Living Systems (CSELS), Washington University in St. Louis, St. Louis, MO, 63130, USA
| | - Serena Carra
- Centre for Neuroscience and Nanotechnology, Department of Biomedical, Metabolic and Neural Sciences, University of Modena and Reggio Emilia, Modena, Italy.
| | - Onn Brandman
- Department of Biochemistry, Stanford University, Stanford, CA, 94305, USA.
| |
|
10
|
Multi-strategic RNA-seq analysis reveals a high-resolution transcriptional landscape in cotton. Nat Commun 2019; 10:4714. [PMID: 31624240 PMCID: PMC6797763 DOI: 10.1038/s41467-019-12575-x] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 02/05/2019] [Accepted: 09/18/2019] [Indexed: 11/09/2022] Open
Abstract
Cotton is an important natural fiber crop; however, a comprehensive, high-resolution gene map for it is lacking. Here we integrate four complementary high-throughput techniques, including Pacbio long read Iso-seq, strand-specific RNA-seq, CAGE-seq, and PolyA-seq, to systematically explore the transcription landscape across 16 tissues or different organ types in Gossypium arboreum. We devise a computational pipeline, named IGIA, to reconstruct accurate gene structures from the integrated data. Our results reveal a dynamic and diverse transcriptional map in cotton: tissue-specific gene expression, alternative usage of TSSs and polyadenylation sites, hotspots of alternative splicing, and transcriptional read-through. These regulated events affect many genes in various aspects such as gain or loss of functional RNA motifs and protein domains, fine-tuning of DNA binding activity, and co-regulation of genes in the same complex or pathway. The methods and findings provide valuable resources for further functional genomic studies, such as understanding natural SNP variations, for the plant community.
|
11
|
Wang R, Nambiar R, Zheng D, Tian B. PolyA_DB 3 catalogs cleavage and polyadenylation sites identified by deep sequencing in multiple genomes. Nucleic Acids Res 2019; 46:D315-D319. [PMID: 29069441 PMCID: PMC5753232 DOI: 10.1093/nar/gkx1000] [Citation(s) in RCA: 143] [Impact Index Per Article: 28.6] [Received: 09/15/2017] [Accepted: 10/12/2017] [Indexed: 12/11/2022] Open
Abstract
PolyA_DB is a database cataloging cleavage and polyadenylation sites (PASs) in several genomes. Previous versions were based mainly on expressed sequence tags (ESTs), which were limited in number and could lead to inaccurate PAS identification due to the presence of internal A-rich sequences in transcripts. Here, we present an updated version of the database based solely on deep sequencing data. First, PASs are mapped by the 3′ region extraction and deep sequencing (3′READS) method, ensuring unequivocal PAS identification. Second, a large volume of data based on diverse biological samples increases PAS coverage by 3.5-fold over the EST-based version and provides PAS usage information. Third, strand-specific RNA-seq data are used to extend annotated 3′ ends of genes to obtain more thorough annotations of alternative polyadenylation (APA) sites. Fourth, conservation information of PASs across mammals sheds light on the significance of APA sites. The database (URL: http://www.polya-db.org/v3) currently holds PASs in human, mouse, rat and chicken, and has links to the UCSC genome browser for further visualization and for integration with other genomic data.
Affiliation(s)
- Ruijia Wang
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School and Rutgers Cancer Institute of New Jersey, Newark, NJ 07103, USA
| | - Ram Nambiar
- Department of Computer Science, New Jersey Institute of Technology, Newark, NJ 07102, USA
| | - Dinghai Zheng
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School and Rutgers Cancer Institute of New Jersey, Newark, NJ 07103, USA
| | - Bin Tian
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School and Rutgers Cancer Institute of New Jersey, Newark, NJ 07103, USA
| |
|
12
|
Salomé PA, Merchant SS. A Series of Fortunate Events: Introducing Chlamydomonas as a Reference Organism. Plant Cell 2019; 31:1682-1707. [PMID: 31189738 PMCID: PMC6713297 DOI: 10.1105/tpc.18.00952] [Citation(s) in RCA: 128] [Impact Index Per Article: 25.6] [Received: 12/14/2018] [Revised: 05/20/2019] [Accepted: 06/08/2019] [Indexed: 05/13/2023]
Abstract
The unicellular alga Chlamydomonas reinhardtii is a classical reference organism for studying photosynthesis, chloroplast biology, cell cycle control, and cilia structure and function. It is also an emerging model for studying sensory cilia, the production of high-value bioproducts, and in situ structural determination. Much of the early appeal of Chlamydomonas was rooted in its promise as a genetic system, but like other classic model organisms, this rise to prominence predated the discovery of the structure of DNA, whole-genome sequences, and molecular techniques for gene manipulation. The haploid genome of C. reinhardtii facilitates genetic analyses and offers many of the advantages of microbial systems applied to a photosynthetic organism. C. reinhardtii has contributed to our understanding of chloroplast-based photosynthesis and cilia biology. Despite pervasive transgene silencing, technological advances have allowed researchers to address outstanding lines of inquiry in algal research. The most thoroughly studied unicellular alga, C. reinhardtii, is the current standard for algal research, and although genome editing is still far from efficient and routine, it nevertheless serves as a template for other algae. We present a historical retrospective of the rise of C. reinhardtii to illuminate its past and present. We also present resources for current and future scientists who may wish to expand their studies to the realm of microalgae.
Affiliation(s)
- Patrice A Salomé
- University of California, Los Angeles, Department of Chemistry and Biochemistry, Los Angeles, CA 90095
| | - Sabeeha S Merchant
- University of California, Los Angeles, Department of Chemistry and Biochemistry, Los Angeles, CA 90095
- University of California, Berkeley, Departments of Plant and Microbial Biology and Molecular and Cell Biology, Berkeley, CA 94720
|
13
|
Cell Cycle Kinase Polo Is Controlled by a Widespread 3' Untranslated Region Regulatory Sequence in Drosophila melanogaster. Mol Cell Biol 2019; 39:MCB.00581-18. [PMID: 31085682 DOI: 10.1128/mcb.00581-18] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 01/03/2019] [Accepted: 05/04/2019] [Indexed: 01/06/2023] Open
Abstract
Alternative polyadenylation generates transcriptomic diversity, although the physiological impact and regulatory mechanisms involved are still poorly understood. The cell cycle kinase Polo is controlled by alternative polyadenylation in the 3' untranslated region (3'UTR), with critical physiological consequences. Here, we characterized the molecular mechanisms required for polo alternative polyadenylation. We identified a conserved upstream sequence element (USE) close to the polo proximal poly(A) signal. Transgenic flies without this sequence show incorrect selection of polo poly(A) signals with consequent downregulation of Polo expression levels and insufficient/defective activation of Polo kinetochore targets Mps1 and Aurora B. Deletion of the USE results in abnormal mitoses in neuroblasts, revealing a role for this sequence in vivo. We found that Hephaestus binds to the USE RNA and that hephaestus mutants display defects in polo alternative polyadenylation concomitant with a striking reduction in Polo protein levels, leading to mitotic errors and aneuploidy. Bioinformatic analyses show that the USE is preferentially localized upstream of noncanonical polyadenylation signals in Drosophila melanogaster genes. Taken together, our results revealed the molecular mechanisms involved in polo alternative polyadenylation, with remarkable physiological functions in Polo expression and activity at the kinetochores, and disclosed a new in vivo function for USEs in Drosophila melanogaster.
|
14
|
Zhu S, Wu X, Fu H, Ye C, Chen M, Jiang Z, Ji G. Modeling of Genome-Wide Polyadenylation Signals in Xenopus tropicalis. Front Genet 2019; 10:647. [PMID: 31333724 PMCID: PMC6616101 DOI: 10.3389/fgene.2019.00647] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 04/04/2019] [Accepted: 06/18/2019] [Indexed: 12/22/2022] Open
Abstract
Alternative polyadenylation (APA) is an important post-transcriptional modification event to process messenger RNA (mRNA) for transcriptional termination, transport, and translation. In the present study, we characterized poly(A) signals in Xenopus tropicalis using 70,918 highly confident poly(A) sites derived from 16,511 protein-coding genes to understand their roles in the regulation of embryo development and gender difference. We examined potential factors, including the gene length, the number of introns in a gene, and the intron length, that may affect the prevalence of APA. We observed 12 prominent poly(A) signal patterns, which accounted for approximately 92% of total APA sites in Xenopus tropicalis. Among them, three patterns are specific to X. tropicalis, so they are absent in other animals such as humans or mice. We catalogued APA sites based on their genomic regions and developed a bioinformatics pipeline to identify over-represented signal patterns for each class. Then the schema of cis elements for APA sites in each genomic region was proposed. More importantly, APA usage is dramatically dynamic in embryos along five developmental stages and well-coordinated with the maternal-to-zygotic transition event. We used an entropy-based method to identify developmental stage-specific APA sites and identified significant signal patterns around specific sites and constitutive sites. We found that the APA frequency in different genomic regions varies with developmental stages and that those sites located in intron or coding sequence regions contribute most to the dynamics of gene expression during developmental stages. This study deciphers the characteristics and poly(A) signal patterns for both canonical APA sites and non-canonical APA sites across different developmental stages and gender dimorphisms in X. tropicalis, providing new insights into the dynamic regulation of distal and proximal APA.
Affiliation(s)
- Sheng Zhu
- Department of Automation, Xiamen University, Xiamen, China.,National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China
| | - Xiaohui Wu
- Department of Automation, Xiamen University, Xiamen, China.,National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China.,Innovation Center for Cell Signaling Network, Xiamen University, Xiamen, China
| | - Hongjuan Fu
- Department of Automation, Xiamen University, Xiamen, China
| | - Congting Ye
- National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China.,Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, College of the Environment and Ecology, Xiamen University, Xiamen, China
| | - Moliang Chen
- Department of Automation, Xiamen University, Xiamen, China
| | - Zhihua Jiang
- Department of Animal Sciences and Center for Reproductive Biology, Washington State University, Pullman, WA, United States
| | - Guoli Ji
- Department of Automation, Xiamen University, Xiamen, China.,National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China.,Innovation Center for Cell Signaling Network, Xiamen University, Xiamen, China
|
15
|
Zhao Z, Wu X, Ji G, Liang C, Li QQ. Genome-Wide Comparative Analyses of Polyadenylation Signals in Eukaryotes Suggest a Possible Origin of the AAUAAA Signal. Int J Mol Sci 2019; 20:ijms20040958. [PMID: 30813258 PMCID: PMC6413133 DOI: 10.3390/ijms20040958] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 01/26/2019] [Revised: 02/18/2019] [Accepted: 02/19/2019] [Indexed: 01/09/2023] Open
Abstract
Pre-mRNA cleavage and polyadenylation is an essential step for almost all mRNA in eukaryotes. The cis-elements around the poly(A) sites, however, are very diverse among different organisms. We characterized the poly(A) signals of seven different species, and compared them with those of four well-studied organisms. We found that ciliates do not show any dominant poly(A) signal; a triplet (UAA) and tetramers (UAAA and GUAA) are dominant in diatoms and red alga, respectively; and green alga Ostreococcus uses UGUAA as its poly(A) signal. Spikemoss and moss use conserved AAUAAA signals that are similar to other land plants. Our analysis suggests that the first two bases (NN in NNUAAA) are likely degenerate whereas UAAA appears to be the core motif. Combined with other published results, it is suggested that the highly conserved poly(A) signal AAUAAA may be derived from UAA with an intermediate, putative UAAA, following a pathway of UAA→UAAA→AAUAAA.
Affiliation(s)
- Zhixin Zhao
- College of Biopharmaceutical and Food Engineering, Shangluo University, Shangluo 726000, China.
- Department of Biology, Miami University, Oxford, OH 45056, USA.
| | - Xiaohui Wu
- Department of Automation, Xiamen University, Xiamen 361005, China.
| | - Guoli Ji
- Department of Automation, Xiamen University, Xiamen 361005, China.
| | - Chun Liang
- Department of Biology, Miami University, Oxford, OH 45056, USA.
| | - Qingshun Quinn Li
- Department of Biology, Miami University, Oxford, OH 45056, USA.
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystems, and College of the Environment and Ecology, Xiamen University, Xiamen 361102, China.
- Graduate College of Biomedical Sciences, Western University of Health Sciences, Pomona, CA 91766, USA.
|
16
|
Wang R, Zheng D, Yehia G, Tian B. A compendium of conserved cleavage and polyadenylation events in mammalian genes. Genome Res 2018; 28:1427-1441. [PMID: 30143597 PMCID: PMC6169888 DOI: 10.1101/gr.237826.118] [Citation(s) in RCA: 62] [Impact Index Per Article: 10.3] [Received: 04/22/2018] [Accepted: 08/08/2018] [Indexed: 12/22/2022]
Abstract
Cleavage and polyadenylation is essential for 3' end processing of almost all eukaryotic mRNAs. Recent studies have shown widespread alternative cleavage and polyadenylation (APA) events leading to mRNA isoforms with different 3' UTRs and/or coding sequences. Here, we present a compendium of conserved cleavage and polyadenylation sites (PASs) in mammalian genes, based on approximately 1.2 billion 3' end sequencing reads from more than 360 human, mouse, and rat samples. We show that ∼80% of mammalian mRNA genes contain at least one conserved PAS, and ∼50% have conserved APA events. PAS conservation generally reduces promiscuous 3' end processing, stabilizing gene expression levels across species. Conservation of APA correlates with gene age, gene expression features, and gene functions. Genes with certain functions, such as cell morphology, cell proliferation, and mRNA metabolism, are particularly enriched with conserved APA events. Whereas tissue-specific genes typically have a low APA rate, brain-specific genes tend to evolve APA. In addition, we show enrichment of mRNA destabilizing motifs in alternative 3' UTR sequences, leading to substantial differences in mRNA stability between 3' UTR isoforms. Using conserved PASs, we reveal sequence motifs surrounding APA sites and a preference of adenosine at the cleavage site. Furthermore, we show that mutations of U-rich motifs around the PAS often accompany APA profile differences between species. Analysis of lncRNA PASs indicates a mechanism of PAS fixation through evolution of A-rich motifs. Taken together, our results present a comprehensive view of PAS evolution in mammals, and a phylogenic perspective on APA functions.
Affiliation(s)
- Ruijia Wang
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, New Jersey 07103, USA
- Rutgers Cancer Institute of New Jersey, Newark, New Jersey 07103, USA
| | - Dinghai Zheng
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, New Jersey 07103, USA
- Rutgers Cancer Institute of New Jersey, Newark, New Jersey 07103, USA
| | - Ghassan Yehia
- Genome Editing Core Facility, Rutgers University, New Brunswick, New Jersey 08901, USA
| | - Bin Tian
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, New Jersey 07103, USA
- Rutgers Cancer Institute of New Jersey, Newark, New Jersey 07103, USA
|
17
|
Wang R, Zheng D, Yehia G, Tian B. A compendium of conserved cleavage and polyadenylation events in mammalian genes. Genome Res 2018. [PMID: 30143597 DOI: 10.1101/gr.237826.118.28] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2023]
|
18
|
Zhu Y, Vaughn JC. Experimental Verification and Evolutionary Origin of 5'-UTR Polyadenylation Sites in Arabidopsis thaliana. Front Plant Sci 2018; 9:969. [PMID: 30026753 PMCID: PMC6041940 DOI: 10.3389/fpls.2018.00969] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2017] [Accepted: 06/15/2018] [Indexed: 06/08/2023]
Abstract
Messenger RNA (mRNA) polyadenylation is an indispensable step during post-transcriptional pre-mRNA processing for most genes in eukaryotes. The usage of one poly(A) site over another is known as alternative polyadenylation (APA). APA has been implicated in gene expression regulation through its role of selecting the ends of a transcript. Recent studies of polyadenylation profiles in the Arabidopsis database unexpectedly predicted that a portion of the poly(A) sites are located in the 5'-UTR, which remains to be experimentally verified. We selected 16 genes from a dataset of 744, based on criteria designed to minimize problems in interpretation. Here, we experimentally verify 5'-UTR-APA in Arabidopsis for 10 of the 16 selected genes, and show for the first time the existence of independent polyadenylated 5'-UTR transcripts, arising due to alternative polyadenylation. We used 3'-RACE and sequencing to validate poly(A) sites and northern blot to show that the observed short upstream transcripts do not arise from the 3'-end of a previously unrecognized convergent gene. Evidence is reported showing that two of the independent upstream open reading frame (uORF) transcripts studied, one containing a complex dual uORF, very likely arose by exon shuffling following duplication of the 5'-end from the downstream major open reading frame (mORF). Finally, results are presented to show that the uORF in this gene may encode two short functional proteins, based on observation of amino acid sequence conservation encoded by the dual uORFs.
|
19
|
Wei LH, Song P, Wang Y, Lu Z, Tang Q, Yu Q, Xiao Y, Zhang X, Duan HC, Jia G. The m6A Reader ECT2 Controls Trichome Morphology by Affecting mRNA Stability in Arabidopsis. Plant Cell 2018; 30:968-985. [PMID: 29716990 PMCID: PMC6002187 DOI: 10.1105/tpc.17.00934] [Citation(s) in RCA: 212] [Impact Index Per Article: 35.3] [Received: 12/04/2017] [Revised: 04/17/2018] [Accepted: 04/30/2018] [Indexed: 05/19/2023]
Abstract
The epitranscriptomic mark N6-methyladenosine (m6A) can be written, read, and erased via the action of a complex network of proteins. m6A binding proteins read m6A marks and transduce their downstream regulatory effects by altering RNA metabolic processes. The characterization of m6A readers is an essential prerequisite for understanding the roles of m6A in plants, but the identities of m6A readers have been unclear. Here, we characterized the YTH-domain family protein ECT2 as an Arabidopsis thaliana m6A reader whose m6A binding function is required for normal trichome morphology. We developed the formaldehyde cross-linking and immunoprecipitation method to identify ECT2-RNA interaction sites at the transcriptome-wide level. This analysis demonstrated that ECT2 binding sites are strongly enriched in the 3' untranslated regions (3' UTRs) of target genes and led to the identification of a plant-specific m6A motif. Sequencing analysis suggested that ECT2 plays dual roles in regulating 3' UTR processing in the nucleus and facilitating mRNA stability in the cytoplasm. Disruption of ECT2 accelerated the degradation of three ECT2 binding transcripts related to trichome morphogenesis, thereby affecting trichome branching. The results shed light on the underlying mechanisms of the roles of m6A in RNA metabolism, as well as plant development and physiology.
Affiliation(s)
- Lian-Huan Wei
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
| | - Peizhe Song
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
| | - Ye Wang
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
| | - Zhike Lu
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
| | - Qian Tang
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
| | - Qiong Yu
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
| | - Yu Xiao
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
| | - Xiao Zhang
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
| | - Hong-Chao Duan
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
| | - Guifang Jia
- Synthetic and Functional Biomolecules Center, Beijing National Laboratory for Molecular Sciences, Key Laboratory of Bioorganic Chemistry and Molecular Engineering of Ministry of Education, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
- National Engineering Research Center of Pesticide (Tianjin), Nankai University, Tianjin 300071, China
|
20
|
Sanfilippo P, Wen J, Lai EC. Landscape and evolution of tissue-specific alternative polyadenylation across Drosophila species. Genome Biol 2017; 18:229. [PMID: 29191225 PMCID: PMC5707805 DOI: 10.1186/s13059-017-1358-0] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 06/02/2017] [Accepted: 11/08/2017] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND: Drosophila melanogaster has one of the best-described transcriptomes of any multicellular organism. Nevertheless, the paucity of 3'-sequencing data in this species precludes comprehensive assessment of alternative polyadenylation (APA), which is subject to broad tissue-specific control. RESULTS: Here, we generate deep 3'-sequencing data from 23 developmental stages, tissues, and cell lines of D. melanogaster, yielding a comprehensive atlas of ~62,000 polyadenylated ends. These data broadly extend the annotated transcriptome, identify ~40,000 novel 3' termini, and reveal that two-thirds of Drosophila genes are subject to APA. Furthermore, we dramatically expand the numbers of genes known to be subject to tissue-specific APA, such as 3' untranslated region (UTR) lengthening in head and 3' UTR shortening in testis, and characterize new tissue and developmental 3' UTR patterns. Our thorough 3' UTR annotations permit reassessment of post-transcriptional regulatory networks, via conserved miRNA and RNA binding protein sites. To evaluate the evolutionary conservation and divergence of APA patterns, we generate developmental and tissue-specific 3'-seq libraries from Drosophila yakuba and Drosophila virilis. We document broadly analogous tissue-specific APA trends in these species, but also observe significant alterations in 3' end usage across orthologs. We exploit the population of functionally evolving poly(A) sites to gain clear evidence that evolutionary divergence in core polyadenylation signal (PAS) and downstream sequence element (DSE) motifs drives broad alterations in 3' UTR isoform expression across the Drosophila phylogeny. CONCLUSIONS: These data provide a critical resource for the Drosophila community and offer many insights into the complex control of alternative tissue-specific 3' UTR formation and its consequences for post-transcriptional regulatory networks.
Affiliation(s)
- Piero Sanfilippo
- Department of Developmental Biology, Sloan-Kettering Institute, New York, New York, 10065, USA
- Louis V. Gerstner, Jr. Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, New York, New York, 10065, USA
| | - Jiayu Wen
- Department of Developmental Biology, Sloan-Kettering Institute, New York, New York, 10065, USA
- Present address: Biochemistry and Biomedical Sciences, Research School of Biology, ANU College of Science, The Australian National University, Canberra, ACT 2601, Australia
| | - Eric C Lai
- Department of Developmental Biology, Sloan-Kettering Institute, New York, New York, 10065, USA.
- Louis V. Gerstner, Jr. Graduate School of Biomedical Sciences, Memorial Sloan Kettering Cancer Center, New York, New York, 10065, USA.
|
21
|
Uwimana N, Collin P, Jeronimo C, Haibe-Kains B, Robert F. Bidirectional terminators in Saccharomyces cerevisiae prevent cryptic transcription from invading neighboring genes. Nucleic Acids Res 2017; 45:6417-6426. [PMID: 28383698 PMCID: PMC5499651 DOI: 10.1093/nar/gkx242] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 11/22/2016] [Accepted: 03/30/2017] [Indexed: 12/12/2022] Open
Abstract
Transcription can be quite disruptive for chromatin, so cells have evolved mechanisms to preserve chromatin integrity during transcription, thereby preventing the emergence of cryptic transcripts from spurious promoter sequences. How these transcripts are regulated and processed remains poorly characterized. Notably, very little is known about the termination of cryptic transcripts. Here, we used RNA-Seq to identify and characterize cryptic transcripts in Spt6 mutant cells (spt6-1004) in Saccharomyces cerevisiae. We found polyadenylated cryptic transcripts running both sense and antisense relative to genes in this mutant. Cryptic promoters were enriched for TATA boxes, suggesting that the underlying DNA sequence defines the location of cryptic promoters. While intragenic sense cryptic transcripts terminate at the terminator of the genes that host them, we found that antisense cryptic transcripts preferentially terminate near the 3'-end of the upstream gene. This finding led us to demonstrate that most terminators in yeast are bidirectional, leading to termination and polyadenylation of transcripts coming from both directions. We propose that S. cerevisiae has evolved this mechanism in order to prevent/attenuate spurious transcription from invading neighbouring genes, a feature that is particularly critical for organisms with small compact genomes.
Affiliation(s)
- Nicole Uwimana
- Institut de recherches cliniques de Montréal, Montréal, Québec H2W 1R7, Canada
| | - Pierre Collin
- Institut de recherches cliniques de Montréal, Montréal, Québec H2W 1R7, Canada
| | - Célia Jeronimo
- Institut de recherches cliniques de Montréal, Montréal, Québec H2W 1R7, Canada
| | - Benjamin Haibe-Kains
- Princess Margaret Cancer Centre, University Health Network, Toronto, Ontario M5G 2M9, Canada.,Department of Medical Biophysics, University of Toronto, Toronto, Ontario M5G 1L7, Canada.,Department of Computer Science, University of Toronto, Toronto, Ontario M5T 3A1, Canada.,Ontario Institute of Cancer Research, Toronto, Ontario M5G 1L7, Canada
| | - François Robert
- Institut de recherches cliniques de Montréal, Montréal, Québec H2W 1R7, Canada.,Département de médecine, Faculté de médecine, Université de Montréal, Québec H3T 1J4, Canada
|
22
|
Robert F. Bidirectional terminators: an underestimated aspect of gene regulation. Curr Genet 2017; 64:389-391. [PMID: 29018946 DOI: 10.1007/s00294-017-0763-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/03/2017] [Revised: 10/05/2017] [Accepted: 10/06/2017] [Indexed: 10/18/2022]
Abstract
Recent experimental and computational work revealed that transcriptional terminators in Saccharomyces cerevisiae can terminate transcription coming from both directions. This mechanism helps budding yeast cope with the pervasive nature of transcription by limiting aberrant transcription from invading neighboring genes.
Affiliation(s)
- François Robert
- Institut de recherches cliniques de Montréal (IRCM), 110 Avenue des Pins Ouest, Montréal, QC, H2W 1R7, Canada.
- Département de Médecine, Faculté de Médecine, Université de Montréal, 2900 Boulevard Edouard-Montpetit, Montréal, QC, H3T 1J4, Canada.
|
23
|
Liu X, Hoque M, Larochelle M, Lemay JF, Yurko N, Manley JL, Bachand F, Tian B. Comparative analysis of alternative polyadenylation in S. cerevisiae and S. pombe. Genome Res 2017; 27:1685-1695. [PMID: 28916539 PMCID: PMC5630032 DOI: 10.1101/gr.222331.117] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 03/04/2017] [Accepted: 08/23/2017] [Indexed: 11/25/2022]
Abstract
Alternative polyadenylation (APA) is a widespread mechanism that generates mRNA isoforms with distinct properties. Here we have systematically mapped and compared cleavage and polyadenylation sites (PASs) in two yeast species, S. cerevisiae and S. pombe. Although >80% of the mRNA genes in each species were found to display APA, S. pombe showed greater 3′ UTR size differences among APA isoforms than did S. cerevisiae. PASs in different locations of genes are surrounded by distinct sequences in both species and are often associated with motifs involved in the Nrd1-Nab3-Sen1 termination pathway. In S. pombe, strong motifs surrounding distal PASs lead to higher abundances of long 3′ UTR isoforms than short ones, a feature that is opposite in S. cerevisiae. Differences in PAS placement between convergent genes lead to starkly different antisense transcript landscapes between budding and fission yeasts. In both species, short 3′ UTR isoforms are more likely to be expressed when cells are growing in nutrient-rich media, although different gene groups are affected in each species. Significantly, 3′ UTR shortening in S. pombe coordinates with up-regulation of expression for genes involved in translation during cell proliferation. Using S. pombe strains deficient for Pcf11 or Pab2, we show that reduced expression of 3′-end processing factors lengthens 3′ UTRs, with Pcf11 having a more potent effect than Pab2. Taken together, our data indicate that APA mechanisms in S. pombe and S. cerevisiae are largely different: S. pombe has many of the APA features of higher species, and Pab2 in S. pombe has a different role in APA regulation than its mammalian homolog, PABPN1.
Affiliation(s)
- Xiaochuan Liu
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, New Jersey 07103, USA
| | - Mainul Hoque
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, New Jersey 07103, USA
| | - Marc Larochelle
- RNA Group, Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Quebec J1E 4K8, Canada
| | - Jean-François Lemay
- RNA Group, Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Quebec J1E 4K8, Canada
| | - Nathan Yurko
- Department of Biological Sciences, Columbia University, New York, New York 10027, USA
| | - James L Manley
- Department of Biological Sciences, Columbia University, New York, New York 10027, USA
| | - François Bachand
- RNA Group, Department of Biochemistry, Université de Sherbrooke, Sherbrooke, Quebec J1E 4K8, Canada
| | - Bin Tian
- Department of Microbiology, Biochemistry and Molecular Genetics, Rutgers New Jersey Medical School, Newark, New Jersey 07103, USA
|
24
|
Turner RE, Pattison AD, Beilharz TH. Alternative polyadenylation in the regulation and dysregulation of gene expression. Semin Cell Dev Biol 2017; 75:61-69. [PMID: 28867199 DOI: 10.1016/j.semcdb.2017.08.056] [Citation(s) in RCA: 44] [Impact Index Per Article: 6.3] [Received: 06/14/2017] [Revised: 08/30/2017] [Accepted: 08/30/2017] [Indexed: 01/08/2023]
Abstract
Transcriptional control shapes a cell's transcriptome composition, but it is RNA processing that refines its expression. The untranslated regions (UTRs) of mRNA are hotspots for regulatory control. Features in these can impact mRNA stability, localisation and translation. Here we describe how alternative cleavage and polyadenylation can change mRNA fate by changing the length of its 3'UTR.
Affiliation(s)
- Rachael Emily Turner
- Development and stem cells Program, Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria, 3800, Australia
| | - Andrew David Pattison
- Development and stem cells Program, Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria, 3800, Australia
| | - Traude Helene Beilharz
- Development and stem cells Program, Monash Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, Victoria, 3800, Australia.
|
25
|
Prediction of Poly(A) Sites by Poly(A) Read Mapping. PLoS One 2017; 12:e0170914. [PMID: 28135292 PMCID: PMC5279776 DOI: 10.1371/journal.pone.0170914] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/24/2016] [Accepted: 01/12/2017] [Indexed: 11/19/2022] Open
Abstract
RNA-seq reads containing part of the poly(A) tail of transcripts (denoted as poly(A) reads) provide the most direct evidence for the position of poly(A) sites in the genome. However, due to reduced coverage of poly(A) tails by reads, poly(A) reads are not routinely identified during RNA-seq mapping. Nevertheless, recent studies for several herpesviruses successfully employed mapping of poly(A) reads to identify herpesvirus poly(A) sites using different strategies and customized programs. To more easily allow such analyses without requiring additional programs, we integrated poly(A) read mapping and prediction of poly(A) sites into our RNA-seq mapping program ContextMap 2. The implemented approach essentially generalizes previously used poly(A) read mapping approaches and combines them with the context-based approach of ContextMap 2 to take into account information provided by other reads aligned to the same location. Poly(A) read mapping using ContextMap 2 was evaluated on real-life data from the ENCODE project and compared against a competing approach based on transcriptome assembly (KLEAT). This showed high positive predictive value for our approach, evidenced also by the presence of poly(A) signals, and considerably lower runtime than KLEAT. Although sensitivity is low for both methods, we show that this is in part due to a high extent of spurious results in the gold standard set derived from RNA-PET data. Sensitivity improves for poly(A) sites of known transcripts or determined with a more specific poly(A) sequencing protocol and increases with read coverage on transcript ends. Finally, we illustrate the usefulness of the approach in a high read coverage scenario by a re-analysis of published data for herpes simplex virus 1. Thus, with current trends towards increasing sequencing depth and read length, poly(A) read mapping will prove to be increasingly useful and can now be performed automatically during RNA-seq mapping with ContextMap 2.
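To make the notion of a poly(A) read concrete, the following is a minimal sketch that flags reads ending in a run of A's (or, for reads from the opposite strand, starting with a run of T's) and returns the remaining portion that would be aligned to the genome. It is not the ContextMap 2 algorithm; the function name `split_poly_a_read` and the `min_tail` threshold are assumptions chosen for illustration.

```python
def split_poly_a_read(read, min_tail=8):
    """Return (body, kind) for a candidate poly(A) read, else None.

    Hypothetical helper: a read qualifies if it ends with at least
    `min_tail` A's or begins with at least `min_tail` T's, and some
    non-tail sequence remains to be mapped.
    """
    read = read.upper()

    # Length of the trailing run of A's.
    a_run = len(read) - len(read.rstrip("A"))
    if min_tail <= a_run < len(read):
        return read[: len(read) - a_run], "A-tail"

    # Length of the leading run of T's (reverse-complemented tail).
    t_run = len(read) - len(read.lstrip("T"))
    if min_tail <= t_run < len(read):
        return read[t_run:], "T-head"

    return None


if __name__ == "__main__":
    print(split_poly_a_read("ACGTACGT" + "A" * 10))
    print(split_poly_a_read("T" * 9 + "GCGC"))
    print(split_poly_a_read("ACGTACGTAAA"))
```

The position where the body ends (or begins) is the candidate poly(A) site once the body is aligned; a real pipeline would also tolerate sequencing errors in the tail and check that the genome itself is not A-rich at that location.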
|
26
|
Gruber AJ, Schmidt R, Gruber AR, Martin G, Ghosh S, Belmadani M, Keller W, Zavolan M. A comprehensive analysis of 3' end sequencing data sets reveals novel polyadenylation signals and the repressive role of heterogeneous ribonucleoprotein C on cleavage and polyadenylation. Genome Res 2016; 26:1145-59. [PMID: 27382025 PMCID: PMC4971764 DOI: 10.1101/gr.202432.115] [Citation(s) in RCA: 146] [Impact Index Per Article: 18.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/25/2015] [Accepted: 05/31/2016] [Indexed: 12/22/2022]
Abstract
Alternative polyadenylation (APA) is a general mechanism of transcript diversification in mammals, which has been recently linked to proliferative states and cancer. Different 3′ untranslated region (3′ UTR) isoforms interact with different RNA-binding proteins (RBPs), which modify the stability, translation, and subcellular localization of the corresponding transcripts. Although the heterogeneity of pre-mRNA 3′ end processing has been established with high-throughput approaches, the mechanisms that underlie systematic changes in 3′ UTR lengths remain to be characterized. Through a uniform analysis of a large number of 3′ end sequencing data sets, we have uncovered 18 signals, six of which are novel, whose positioning with respect to pre-mRNA cleavage sites indicates a role in pre-mRNA 3′ end processing in both mouse and human. With 3′ end sequencing we have demonstrated that the heterogeneous ribonucleoprotein C (HNRNPC), which binds the poly(U) motif whose frequency also peaks in the vicinity of polyadenylation (poly(A)) sites, has a genome-wide effect on poly(A) site usage. HNRNPC-regulated 3′ UTRs are enriched in ELAV-like RBP 1 (ELAVL1) binding sites and include those of the CD47 gene, which participate in the recently discovered mechanism of 3′ UTR–dependent protein localization (UDPL). Our study thus establishes an up-to-date, high-confidence catalog of 3′ end processing sites and poly(A) signals, and it uncovers an important role of HNRNPC in regulating 3′ end processing. It further suggests that U-rich elements mediate interactions with multiple RBPs that regulate different stages in a transcript's life cycle.
Affiliation(s)
- Andreas J Gruber
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Ralf Schmidt
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Andreas R Gruber
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Georges Martin
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Souvik Ghosh
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Manuel Belmadani
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Walter Keller
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
| | - Mihaela Zavolan
- Computational and Systems Biology, Biozentrum, University of Basel, 4056 Basel, Switzerland
|
27
|
Simms CL, Thomas EN, Zaher HS. Ribosome-based quality control of mRNA and nascent peptides. WILEY INTERDISCIPLINARY REVIEWS-RNA 2016; 8. [PMID: 27193249 DOI: 10.1002/wrna.1366] [Citation(s) in RCA: 75] [Impact Index Per Article: 9.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/31/2016] [Revised: 04/25/2016] [Accepted: 04/26/2016] [Indexed: 11/06/2022]
Abstract
Quality control processes are widespread and play essential roles in detecting defective molecules and removing them in order to maintain organismal fitness. Aberrant messenger RNA (mRNA) molecules, unless properly managed, pose a significant hurdle to cellular proteostasis. Often mRNAs harbor premature stop codons, possess structures that present a block to the translational machinery, or lack stop codons entirely. In eukaryotes, the three cytoplasmic mRNA-surveillance processes, nonsense-mediated decay (NMD), no-go decay (NGD), and nonstop decay (NSD), evolved to cope with these aberrant mRNAs, respectively. Nonstop mRNAs and mRNAs that inhibit translation elongation are especially problematic as they sequester valuable ribosomes from the translating ribosome pool. As a result, in addition to RNA degradation, NSD and NGD are intimately coupled to ribosome rescue in all domains of life. Furthermore, protein products produced from all three classes of defective mRNAs are more likely to malfunction. It is not surprising then that these truncated nascent protein products are subject to degradation. Over the past few years, many studies have begun to document a central role for the ribosome in initiating the RNA and protein quality control processes. The ribosome appears to be responsible for recognizing the target mRNAs as well as for recruiting the factors required to carry out the processes of ribosome rescue and nascent protein decay. WIREs RNA 2017, 8:e1366. doi: 10.1002/wrna.1366
Affiliation(s)
- Carrie L Simms
- Department of Biology, Washington University in St. Louis, St. Louis, MO, USA
| | - Erica N Thomas
- Department of Biology, Washington University in St. Louis, St. Louis, MO, USA
| | - Hani S Zaher
- Department of Biology, Washington University in St. Louis, St. Louis, MO, USA
|
28
|
Abstract
Messenger RNA polyadenylation is one of the essential processing steps during eukaryotic gene expression. The site of polyadenylation [poly(A) site] marks the end of a transcript, which is also the end of a gene in most cases. A computational program that is able to recognize poly(A) sites would be useful not only for finding gene ends during genome annotation, but also for predicting alternative poly(A) sites. PASS [Poly(A) Site Sleuth] and PAC [Poly(A) site Classifier] were developed to predict poly(A) sites in plants. PASS was built based on the Generalized Hidden Markov Model (GHMM) and consists of four functional modules: input module, poly(A) site recognition module, graphic process module, and output module. PAC is a classification model, integrating several features that define the poly(A) sites including K-gram pattern, Z-curve, position-specific scoring matrix, and first-order inhomogeneous Markov sub-model. PAC can be used to predict poly(A) sites from species whose polyadenylation profile is unknown. PASS and PAC each output a few files, one of which contains the score or probability of being a poly(A) site for each position of a given sequence. While the models were built mostly based on poly(A) profile data from Arabidopsis, they are also functional in other higher plants since their profiles are quite similar.
Affiliation(s)
- Xiaohui Wu
- Department of Automation, Xiamen University, 422 Siming South Road, Xiamen, Fujian, 361005, China,
|
29
|
Kast A, Voges R, Schroth M, Schaffrath R, Klassen R, Meinhardt F. Autoselection of cytoplasmic yeast virus like elements encoding toxin/antitoxin systems involves a nuclear barrier for immunity gene expression. PLoS Genet 2015; 11:e1005005. [PMID: 25973601 PMCID: PMC4431711 DOI: 10.1371/journal.pgen.1005005] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2014] [Accepted: 01/14/2015] [Indexed: 12/13/2022] Open
Abstract
Cytoplasmic virus like elements (VLEs) from Kluyveromyces lactis (Kl), Pichia acaciae (Pa) and Debaryomyces robertsiae (Dr) are extremely A/T-rich (>75%) and encode toxic anticodon nucleases (ACNases) along with specific immunity proteins. Here we show that nuclear, not cytoplasmic expression of either immunity gene (PaORF4, KlORF3 or DrORF5) results in transcript fragmentation and is insufficient to establish immunity to the cognate ACNase. Since rapid amplification of 3' ends (RACE) as well as linker ligation of immunity transcripts expressed in the nucleus revealed polyadenylation to occur along with fragmentation, ORF-internal poly(A) site cleavage due to the high A/T content is likely to prevent functional expression of the immunity genes. Consistently, lowering the A/T content of PaORF4 to 55% and KlORF3 to 46% by gene synthesis entirely prevented transcript cleavage and permitted functional nuclear expression leading to full immunity against the respective ACNase toxin. Consistent with a specific adaptation of the immunity proteins to the cognate ACNases, cross-immunity to non-cognate ACNases is neither conferred by PaOrf4 nor KlOrf3. Thus, the high A/T content of cytoplasmic VLEs minimizes the potential of functional nuclear recruitment of VLE encoded genes, in particular those involved in autoselection of the VLEs via a toxin/antitoxin principle. The rather wide-spread and extremely A/T rich yeast virus like elements (VLEs, also termed linear plasmids) which encode toxic anticodon nucleases (ACNases) ensure autoselection in the cytoplasm by preventing functional nuclear capture of the cognate immunity genes, but how? When expressed in the nucleus, the mRNA of the VLE immunity genes is split into fragments to which poly(A) tails are added. Consistently, lowering the A/T content by gene synthesis prevented transcript cleavage and permitted functional nuclear expression providing full immunity against the respective ACNase toxin. 
Thus, internal poly(A) cleavage is likely to prevent functional nuclear immunity gene expression.
Affiliation(s)
- Alene Kast
- Institut für Molekulare Mikrobiologie und Biotechnologie, Westfälische Wilhelms-Universität Münster, Münster, Germany
| | - Raphael Voges
- Institut für Molekulare Mikrobiologie und Biotechnologie, Westfälische Wilhelms-Universität Münster, Münster, Germany
| | - Michael Schroth
- Fachgebiet Mikrobiologie, Universität Kassel, Kassel, Germany
| | | | - Roland Klassen
- Fachgebiet Mikrobiologie, Universität Kassel, Kassel, Germany
| | - Friedhelm Meinhardt
- Institut für Molekulare Mikrobiologie und Biotechnologie, Westfälische Wilhelms-Universität Münster, Münster, Germany
|
30
|
Shalem O, Sharon E, Lubliner S, Regev I, Lotan-Pompan M, Yakhini Z, Segal E. Systematic dissection of the sequence determinants of gene 3' end mediated expression control. PLoS Genet 2015; 11:e1005147. [PMID: 25875337 PMCID: PMC4398552 DOI: 10.1371/journal.pgen.1005147] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/04/2014] [Accepted: 03/17/2015] [Indexed: 01/09/2023] Open
Abstract
The 3' end genomic region encodes a wide range of regulatory processes including mRNA stability, 3' end processing and translation. Here, we systematically investigate the sequence determinants of 3' end mediated expression control by measuring the effect of 13,000 designed 3' end sequence variants on constitutive expression levels in yeast. By including a high resolution scanning mutagenesis of more than 200 native 3' end sequences in this designed set, we found that most mutations had only a mild effect on expression, and that the vast majority (~90%) of strongly affecting mutations localized to a single positive TA-rich element, similar to a previously described 3' end processing efficiency element, and resulted in up to ten-fold decrease in expression. Measurements of 3' UTR lengths revealed that these mutations result in mRNAs with aberrantly long 3' UTRs, confirming the role for this element in 3' end processing. Interestingly, we found that other sequence elements that were previously described in the literature to be part of the polyadenylation signal had a minor effect on expression. We further characterize the sequence specificities of the TA-rich element using additional synthetic 3' end sequences and show that its activity is sensitive to single base pair mutations and strongly depends on the A/T content of the surrounding sequences. Finally, using a computational model, we show that the strength of this element in native 3' end sequences can explain some of their measured expression variability (R = 0.41). Together, our results emphasize the importance of efficient 3' end processing for endogenous protein levels and contribute to an improved understanding of the sequence elements involved in this process.
Affiliation(s)
- Ophir Shalem
- Department of Computer Science and Applied Mathematics, The Weizmann Institute of Science, Rehovot, Israel
- Department of Molecular Cell Biology, The Weizmann Institute of Science, Rehovot, Israel
| | - Eilon Sharon
- Department of Computer Science and Applied Mathematics, The Weizmann Institute of Science, Rehovot, Israel
- Department of Molecular Cell Biology, The Weizmann Institute of Science, Rehovot, Israel
| | - Shai Lubliner
- Department of Computer Science and Applied Mathematics, The Weizmann Institute of Science, Rehovot, Israel
- Department of Molecular Cell Biology, The Weizmann Institute of Science, Rehovot, Israel
| | - Ifat Regev
- Department of Computer Science and Applied Mathematics, The Weizmann Institute of Science, Rehovot, Israel
- Department of Molecular Cell Biology, The Weizmann Institute of Science, Rehovot, Israel
| | - Maya Lotan-Pompan
- Department of Computer Science and Applied Mathematics, The Weizmann Institute of Science, Rehovot, Israel
- Department of Molecular Cell Biology, The Weizmann Institute of Science, Rehovot, Israel
| | - Zohar Yakhini
- Department of Computer Science, Technion, Haifa, Israel
- Agilent Laboratories, Tel Aviv, Israel
| | - Eran Segal
- Department of Computer Science and Applied Mathematics, The Weizmann Institute of Science, Rehovot, Israel
- Department of Molecular Cell Biology, The Weizmann Institute of Science, Rehovot, Israel
|
31
|
Mahajan NS, Dewangan V, Lomate PR, Joshi RS, Mishra M, Gupta VS, Giri AP. Structural features of diverse Pin-II proteinase inhibitor genes from Capsicum annuum. PLANTA 2015; 241:319-331. [PMID: 25269396 DOI: 10.1007/s00425-014-2177-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/13/2014] [Accepted: 09/15/2014] [Indexed: 06/03/2023]
Abstract
The proteinase inhibitor (PI) genes from Capsicum annuum were characterized with respect to their UTR, introns and promoter elements. The occurrence of PIs with circularly permuted domain organization was evident. Several potato inhibitor II (Pin-II) type proteinase inhibitor (PI) genes have been analyzed from Capsicum annuum (L.) with respect to their differential expression during plant defense response. However, complete gene characterization of any of these C. annuum PIs (CanPIs) has not been carried out so far. Complete gene architectures of a previously identified CanPI-7 (Beads-on-string, Type A) and a member of newly isolated Bracelet type B, CanPI-69, are reported in this study. The 5' UTR (untranslated region), 3' UTR, and intronic sequences of both the CanPI genes were obtained. The genomic sequence of CanPI-7 exhibited exon 1 (49 base pairs, bp) and exon 2 (740 bp) interrupted by a 294-bp long type I intron. We noted the occurrence of three multi-domain PIs (CanPI-69, 70, 71) with circularly permuted domain organization. CanPI-69 was found to possess exon 1 (49 bp), exon 2 (551 bp) and a 584-bp long type I intron. The upstream sequence analysis of CanPI-7 and CanPI-69 predicted various transcription factor-binding sites including TATA and CAAT boxes, hormone-responsive elements (ABRELATERD1, DOFCOREZM, ERELEE4), and a defense-responsive element (WRKY71OS). Binding of transcription factors such as zinc finger motif MADS-box and MYB to the promoter regions was confirmed using electrophoretic mobility shift assay followed by mass spectrometric identification. The 3' UTR analysis for 25 CanPI genes revealed unique/distinct 3' UTR sequence for each gene. Structures of three domain CanPIs of type A and B were predicted and further analyzed for their attributes. This investigation of CanPI gene architecture will enable the better understanding of the genetic elements present in CanPIs.
Affiliation(s)
- Neha S Mahajan
- Division of Biochemical Sciences, Plant Molecular Biology Unit, CSIR-National Chemical Laboratory, Dr. Homi Bhabha Road, Pune, 411 008, Maharashtra, India
|
32
|
Li XQ, Du D. Motif types, motif locations and base composition patterns around the RNA polyadenylation site in microorganisms, plants and animals. BMC Evol Biol 2014; 14:162. [PMID: 25052519 PMCID: PMC4360255 DOI: 10.1186/s12862-014-0162-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/09/2014] [Accepted: 07/14/2014] [Indexed: 12/22/2022] Open
Abstract
Background The polyadenylation of RNA is critical for gene functioning, but the conserved sequence motifs (often called signal or signature motifs), motif locations and abundances, and base composition patterns around mRNA polyadenylation [poly(A)] sites are still uncharacterized in most species. The evolutionary tendency for poly(A) site selection is still largely unknown. Results We analyzed the poly(A) site regions of 31 species or phyla. Different groups of species showed different poly(A) signal motifs: UUACUU at the poly(A) site in the parasite Trypanosoma cruzi; UGUAAC (approximately 13 bases upstream of the site) in the alga Chlamydomonas reinhardtii; UGUUUG (or UGUUUGUU) at mainly the fourth base downstream of the poly(A) site in the parasite Blastocystis hominis; and AAUAAA at approximately 16 bases and approximately 19 bases upstream of the poly(A) site in animals and plants, respectively. Polyadenylation signal motifs are usually several hundred times more abundant around poly(A) sites than in whole genomes. These predominant motifs usually had very specific locations, whether upstream of, at, or downstream of poly(A) sites, depending on the species or phylum. The poly(A) site was usually an adenosine (A) in all analyzed species except for B. hominis, and there was weak A predominance in C. reinhardtii. Fungi, animals, plants, and the protist Phytophthora infestans shared a general base abundance pattern (or base composition pattern) of “U-rich—A-rich—U-rich—Poly(A) site—U-rich regions”, or U-A-U-A-U for short, with some variation for each kingdom or subkingdom. Conclusion This study identified the poly(A) signal motifs, motif locations, and base composition patterns around mRNA poly(A) sites in protists, fungi, plants, and animals and provided insight into poly(A) site evolution.
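The positional analysis described in this abstract (for example, AAUAAA lying roughly 16 bases upstream of the poly(A) site in animals) can be illustrated with a small sketch. It assumes each input sequence is a window around one poly(A) site with the site at a known index; the function name `motif_offsets` and the toy sequence are hypothetical and are not taken from the study.

```python
from collections import Counter


def motif_offsets(sequences, motif, site_index):
    """Count motif start positions relative to the poly(A) site.

    Assumes every sequence is a window with the poly(A) site at
    `site_index`. Negative offsets are upstream of the site.
    DNA input is converted to RNA letters (T -> U) before matching.
    """
    counts = Counter()
    for seq in sequences:
        seq = seq.upper().replace("T", "U")
        start = seq.find(motif)
        while start != -1:
            counts[start - site_index] += 1
            # Step by one so overlapping occurrences are also counted.
            start = seq.find(motif, start + 1)
    return counts


if __name__ == "__main__":
    # Toy window: AATAAA starts 16 bases upstream of the site at index 18.
    window = "CC" + "AATAAA" + "C" * 10 + "A" + "CC"
    print(dict(motif_offsets([window], "AAUAAA", site_index=18)))
```

Applied to many aligned windows, a sharp peak in the resulting histogram indicates a position-specific signal, whereas a flat profile suggests the motif is not tied to the cleavage site.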
Affiliation(s)
- Xiu-Qing Li
- Molecular Genetics Laboratory, Potato Research Centre, Agriculture and Agri-Food Canada, 850 Lincoln Road, Fredericton, New Brunswick, E3B 4Z7, Canada.
| | - Donglei Du
- Quantitative Methods Research Group, Faculty of Business Administration, University of New Brunswick, 7 Macaulay Lane, Fredericton, NB, E3B 5A3, Canada.
|
33
|
Wu X, Gaffney B, Hunt AG, Li QQ. Genome-wide determination of poly(A) sites in Medicago truncatula: evolutionary conservation of alternative poly(A) site choice. BMC Genomics 2014; 15:615. [PMID: 25048171 PMCID: PMC4117952 DOI: 10.1186/1471-2164-15-615] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/16/2014] [Accepted: 07/15/2014] [Indexed: 11/16/2022] Open
Abstract
Background Alternative polyadenylation (APA) plays an important role in the post-transcriptional regulation of gene expression. Little is known about how APA sites may evolve in homologous genes in different plant species. To this end, comparative studies of APA sites in different organisms are needed. In this study, a collection of poly(A) sites in Medicago truncatula, a model system for legume plants, has been generated and compared with APA sites in Arabidopsis thaliana. Results The poly(A) tags from a deep-sequencing protocol were mapped to the annotated M. truncatula genome, and the identified poly(A) sites used to update the annotations of 14,203 genes. The results show that 64% of M. truncatula genes possess more than one poly(A) site, comparable to the percentages reported for Arabidopsis and rice. In addition, the poly(A) signals associated with M. truncatula genes were similar to those seen in Arabidopsis and other plants. The 3′-UTR lengths are correlated in pairs of orthologous genes between M. truncatula and Arabidopsis. Very little conservation of intronic poly(A) sites was found between Arabidopsis and M. truncatula, which suggests that such sites are likely to be species-specific in plants. In contrast, there is a greater conservation of CDS-localized poly(A) sites in these two species. A sizeable number of M. truncatula antisense poly(A) sites were found. A high percentage of the associated target genes possess Arabidopsis orthologs that are also associated with antisense sites. This is suggestive of important roles for antisense regulation of these target genes. Conclusions Our results reveal some distinct patterns of sense and antisense poly(A) sites in Arabidopsis and M. truncatula. In so doing, this study lends insight into general evolutionary trends of alternative polyadenylation in plants. 
Affiliation(s)
- Arthur G Hunt
- Department of Plant and Soil Sciences, University of Kentucky, Lexington, KY, USA.
|
34
|
Doroshenk KA, Tian L, Crofts AJ, Kumamaru T, Okita TW. Characterization of RNA binding protein RBP-P reveals a possible role in rice glutelin gene expression and RNA localization. PLANT MOLECULAR BIOLOGY 2014; 85:381-394. [PMID: 24682961 DOI: 10.1007/s11103-014-0191-z] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 10/11/2013] [Accepted: 03/22/2014] [Indexed: 06/03/2023]
Abstract
RNA binding proteins (RBPs) play an important role in mRNA metabolism including synthesis, maturation, transport, localization, and stability. In developing rice seeds, RNAs that code for the major storage proteins are transported to specific domains of the cortical endoplasmic reticulum (ER) by a regulated mechanism requiring RNA cis-localization elements, or zipcodes. Putative trans-acting RBPs that recognize prolamine RNA zipcodes required for restricted localization to protein body-ER have previously been identified. Here, we describe the identification of RBP-P using a Northwestern blot approach as an RBP that recognizes and binds to glutelin zipcode RNA, which is required for proper RNA localization to cisternal-ER. RBP-P protein expression coincides with that of glutelin during seed maturation and is localized to both the nucleus and cytosol. RNA-immunoprecipitation and subsequent RT-PCR analysis further demonstrated that RBP-P interacts with glutelin RNAs. In vitro RNA-protein UV-crosslinking assays showed that recombinant RBP-P binds strongly to glutelin mRNA, and in particular, 3' UTR and zipcode RNA. RBP-P also exhibited strong binding activity to a glutelin intron sequence, suggesting that RBP-P might participate in mRNA splicing. Overall, these results support a multifunctional role for RBP-P in glutelin mRNA metabolism, perhaps in nuclear pre-mRNA splicing and cytosolic localization to the cisternal-ER.
Affiliation(s)
- Kelly A Doroshenk
- Institute of Biological Chemistry, Washington State University, Pullman, WA, 99164-6340, USA
|
35
|
Laishram RS. Poly(A) polymerase (PAP) diversity in gene expression--star-PAP vs canonical PAP. FEBS Lett 2014; 588:2185-97. [PMID: 24873880 PMCID: PMC6309179 DOI: 10.1016/j.febslet.2014.05.029] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/07/2014] [Revised: 05/02/2014] [Accepted: 05/15/2014] [Indexed: 01/09/2023]
Abstract
Almost all eukaryotic mRNAs acquire a poly(A) tail at the 3'-end by a concerted RNA processing event: cleavage and polyadenylation. The canonical PAP, PAPα, was considered the only nuclear PAP involved in general polyadenylation of mRNAs. A phosphoinositide-modulated nuclear PAP, Star-PAP, was then reported to regulate a select set of mRNAs in the cell. In addition, several non-canonical PAPs have been identified with diverse cellular functions. Further, canonical PAP itself exists in multiple isoforms thus illustrating the diversity of PAPs. In this review, we compare two nuclear PAPs, Star-PAP and PAPα with a general overview of PAP diversity in the cell. Emerging evidence suggests distinct niches of target pre-mRNAs for the two PAPs and that modulation of these PAPs regulates distinct cellular functions.
Affiliation(s)
- Rakesh S Laishram
- Cancer Research Program, Rajiv Gandhi Centre for Biotechnology, Thiruvananthapuram 695014, India.
|
36
|
Li XQ. Comparative analysis of the base compositions of the pre-mRNA 3' cleaved-off region and the mRNA 3' untranslated region relative to the genomic base composition in animals and plants. PLoS One 2014; 9:e99928. [PMID: 24941005 PMCID: PMC4062462 DOI: 10.1371/journal.pone.0099928] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2013] [Accepted: 05/20/2014] [Indexed: 12/26/2022] Open
Abstract
The precursor messenger RNA (pre-mRNA) three-prime cleaved-off region (3′COR) and the mRNA three-prime untranslated region (3′UTR) play critical roles in regulating gene expression. The differences in base composition between these regions and the corresponding genomes are still largely uncharacterized in animals and plants. In this study, the base compositions of non-redundant 3′CORs and 3′UTRs were compared with the corresponding whole genomes of eleven animals, four dicotyledonous plants, and three monocotyledonous (cereal) plants. Among the four bases (A, C, G, and U for adenine, cytosine, guanine, and uracil, respectively), U (which corresponds to T, for thymine, in DNA) was the most frequent, A the second most frequent, G the third most frequent, and C the least frequent in most of the species in both the 3′COR and 3′UTR regions. In comparison with the whole genomes, in both regions the U content was usually the most overrepresented (particularly in the monocotyledonous plants), and the C content was the most underrepresented. The order obtained for the species groups, when ranked from high to low according to the U contents in the 3′COR and 3′UTR was as follows: dicotyledonous plants, monocotyledonous plants, non-mammal animals, and mammals. In contrast, the genomic T content was highest in dicotyledonous plants, lowest in monocotyledonous plants, and intermediate in animals. These results suggest the following: 1) there is a mechanism operating in both animals and plants which is biased toward U and against C in the 3′COR and 3′UTR; 2) the 3′UTR and 3′COR, as functional units, minimized the difference between dicotyledonous and monocotyledonous plants, while the dicotyledonous and monocotyledonous genomes evolved into two extreme groups in terms of base composition.
Affiliation(s)
- Xiu-Qing Li
- Potato Research Centre, Agriculture and Agri-Food Canada, Fredericton, New Brunswick, Canada
|
37
|
Bioinformatics analysis of alternative polyadenylation in green alga Chlamydomonas reinhardtii using transcriptome sequences from three different sequencing platforms. G3-GENES GENOMES GENETICS 2014; 4:871-83. [PMID: 24626288 PMCID: PMC4025486 DOI: 10.1534/g3.114.010249] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/05/2022]
Abstract
Messenger RNA 3′-end formation is an essential posttranscriptional processing step for most eukaryotic genes. Unlike plants and animals, where AAUAAA and its variants are routinely found as the main poly(A) signal, Chlamydomonas reinhardtii uses UGUAA as the major poly(A) signal. The advance of sequencing technology provides an enormous amount of sequencing data for us to explore the variations of poly(A) signals, alternative polyadenylation (APA), and its relationship with splicing in this algal species. Through genome-wide analysis of poly(A) sites in C. reinhardtii, we identified a large number of poly(A) sites: 21,041 from Sanger expressed sequence tags, 88,184 from 454, and 195,266 from Illumina sequence reads. In comparison with previous collections, more new poly(A) sites are found in coding sequences and intron and intergenic regions by deep-sequencing. Interestingly, G-rich signals are particularly abundant in intron and intergenic regions. The prevalence of different poly(A) signals between coding sequences and a 3′-untranslated region implies potentially different polyadenylation mechanisms. Our data suggest that APA occurs in about 68% of C. reinhardtii genes. Using Gene Ontology analysis, we found most of the APA genes are involved in RNA regulation and metabolic process, protein synthesis, hydrolase, and ligase activities. Moreover, intronic poly(A) sites are more abundant in constitutively spliced introns than retained introns, suggesting an interplay between polyadenylation and splicing. Our results support that APA, as in higher eukaryotes, may play significant roles in increasing transcriptome diversity and gene expression regulation in this algal species. Our datasets also provide useful information for accurate annotation of transcript ends in C. reinhardtii.
|
38
|
Delineating the structural blueprint of the pre-mRNA 3'-end processing machinery. Mol Cell Biol 2014; 34:1894-910. [PMID: 24591651 DOI: 10.1128/mcb.00084-14] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Indexed: 11/20/2022]
Abstract
Processing of mRNA precursors (pre-mRNAs) by polyadenylation is an essential step in gene expression. Polyadenylation consists of two steps, cleavage and poly(A) synthesis, and requires multiple cis elements in the pre-mRNA and a megadalton protein complex bearing the two essential enzymatic activities. While genetic and biochemical studies remain the major approaches in characterizing these factors, structural biology has emerged during the past decade to help understand the molecular assembly and mechanistic details of the process. With structural information about more proteins and higher-order complexes becoming available, we are coming closer to obtaining a structural blueprint of the polyadenylation machinery that explains both how this complex functions and how it is regulated and connected to other cellular processes.
|
39
|
Tanaka M, Tokuoka M, Gomi K. Effects of codon optimization on the mRNA levels of heterologous genes in filamentous fungi. Appl Microbiol Biotechnol 2014; 98:3859-67. [PMID: 24682479 DOI: 10.1007/s00253-014-5609-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 01/09/2014] [Revised: 02/07/2014] [Accepted: 02/10/2014] [Indexed: 10/25/2022]
Abstract
Filamentous fungi, particularly Aspergillus species, have recently attracted attention as host organisms for recombinant protein production. Because the secretory yields of heterologous proteins are generally low compared with those of homologous proteins or proteins from closely related fungal species, several strategies to produce substantial amounts of recombinant proteins have been conducted. Codon optimization is a powerful tool for improving the production levels of heterologous proteins. Although codon optimization is generally believed to improve the translation efficiency of heterologous genes without affecting their mRNA levels, several studies have indicated that codon optimization causes an increase in the steady-state mRNA levels of heterologous genes in filamentous fungi. However, the mechanism that determines the low mRNA levels when native heterologous genes are expressed was poorly understood. We recently showed that the transcripts of heterologous genes are polyadenylated prematurely within the coding region and that the heterologous gene transcripts can be stabilized significantly by codon optimization, which is probably attributable to the prevention of premature polyadenylation in Aspergillus oryzae. In this review, we describe the detailed mechanism of premature polyadenylation and the rapid degradation of mRNA transcripts derived from heterologous genes in filamentous fungi.
Affiliation(s)
- Mizuki Tanaka
- Department of Bioindustrial Informatics and Genomics, Laboratory of Bioindustrial Genomics, Graduate School of Agricultural Science, Tohoku University, 1-1 Tsutsumidori-Amamiyamachi, Aoba-ku, Sendai, 981-8555, Japan
|
40
|
Xie B, Jankovic BR, Bajic VB, Song L, Gao X. Poly(A) motif prediction using spectral latent features from human DNA sequences. Bioinformatics 2013; 29:i316-25. [PMID: 23813000 PMCID: PMC3694652 DOI: 10.1093/bioinformatics/btt218] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.8] [Indexed: 12/25/2022]
Abstract
MOTIVATION Polyadenylation is the addition of a poly(A) tail to an RNA molecule. Identifying DNA sequence motifs that signal the addition of poly(A) tails is essential to improved genome annotation and better understanding of the regulatory mechanisms and stability of mRNA. Existing poly(A) motif predictors demonstrate that information extracted from the surrounding nucleotide sequences of candidate poly(A) motifs can differentiate true motifs from the false ones to a great extent. A variety of sophisticated features has been explored, including sequential, structural, statistical, thermodynamic and evolutionary properties. However, most of these methods involve extensive manual feature engineering, which can be time-consuming and can require in-depth domain knowledge. RESULTS We propose a novel machine-learning method for poly(A) motif prediction by marrying generative learning (hidden Markov models) and discriminative learning (support vector machines). Generative learning provides a rich palette on which the uncertainty and diversity of sequence information can be handled, while discriminative learning allows the performance of the classification task to be directly optimized. Here, we used hidden Markov models for fitting the DNA sequence dynamics, and developed an efficient spectral algorithm for extracting latent variable information from these models. These spectral latent features were then fed into support vector machines to fine-tune the classification performance. We evaluated our proposed method on a comprehensive human poly(A) dataset that consists of 14 740 samples from 12 of the most abundant variants of human poly(A) motifs. Compared with one of the previous state-of-the-art methods in the literature (the random forest model with expert-crafted features), our method reduces the average error rate, false-negative rate and false-positive rate by 26, 15 and 35%, respectively. 
Meanwhile, our method makes ~30% fewer error predictions relative to the other string kernels. Furthermore, our method can be used to visualize the importance of oligomers and positions in predicting poly(A) motifs, from which we can observe a number of characteristics in the surrounding regions of true and false motifs that have not been reported before. AVAILABILITY http://sfb.kaust.edu.sa/Pages/Software.aspx. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Bo Xie
- College of Computing, Georgia Institute of Technology, Atlanta, GA 30332, USA
|
41
|
Tseng SH, Cheng CY, Huang MZ, Chung MY, Su TS. Modulation of formation of the 3'-end of the human argininosuccinate synthetase mRNA by GT-repeat polymorphism. Int J Biochem Mol Biol 2013; 4:179-190. [PMID: 24380022 PMCID: PMC3867704] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2013] [Accepted: 11/26/2013] [Indexed: 06/03/2023]
Abstract
Microsatellites are abundant in the human genome and may acquire context-dependent functions. A highly polymorphic GT microsatellite is located downstream of the poly(A) signal of the human argininosuccinate synthetase (ASS1) gene. The ASS1 participates in urea and nitric oxide production and is a rate-limiting enzyme in arginine biosynthesis. To examine possible involvement of the GT microsatellite in ASS1 mRNA 3'-end formation, ASS1 minigene constructs were used in transient transfection for assessment of poly(A) site usage by S1 nuclease mapping. Synthesis of the major human ASS1 mRNA is found to be controlled by two consecutive non-canonical poly(A) signals, UAUAAA and AUUAAA, located 7 nucleotides apart where a U-rich sequence and the GU microsatellite serve as their respective downstream GU/U-rich elements. Moreover, AUUAAA utilization is affected by the GU-repeat number possibly leading to differential regulation of ASS1 polyadenylation in individuals with different repeat numbers. Interestingly, the less efficient UAUAAA motif is noted to be the major ASS1 poly(A) signal possibly as a result of an indispensable downstream U-rich element and restricted utilization of the AUUAAA motif by the presence of extended GU-repeats. The UAUAAA motif and the GT microsatellite are conserved only in primates whereas AUUAAA motif is present in all mammals analyzed. The suboptimal UAUAAA motif and the utilization of the polymorphic GT microsatellite as polyadenylation signal of the ASS1 gene may be used as a strategy in primates to modulate ASS1 level in response to interactions of genetic and environmental factors.
Affiliation(s)
- Shih-Heng Tseng
- Department of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei, Taiwan
- Cheng-Yi Cheng
- Department of Medical Research, Taipei Veterans General Hospital, Taipei, Taiwan
- Miao-Zeng Huang
- Department of Medical Research, Taipei Veterans General Hospital, Taipei, Taiwan
- Ming-Yi Chung
- Department of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei, Taiwan
- Department of Medical Research, Taipei Veterans General Hospital, Taipei, Taiwan
- Tsung-Sheng Su
- Department of Life Sciences and Institute of Genome Sciences, National Yang-Ming University, Taipei, Taiwan
- Department of Medical Research, Taipei Veterans General Hospital, Taipei, Taiwan
|
42
|
Li XQ, Du D. RNA polyadenylation sites on the genomes of microorganisms, animals, and plants. PLoS One 2013; 8:e79511. [PMID: 24260238 PMCID: PMC3832601 DOI: 10.1371/journal.pone.0079511] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 06/05/2013] [Accepted: 09/29/2013] [Indexed: 01/15/2023]
Abstract
Pre–messenger RNA (mRNA) 3′-end cleavage and subsequent polyadenylation strongly regulate gene expression. In comparison with the upstream or downstream motifs, relatively little is known about the feature differences of polyadenylation [poly(A)] sites among major kingdoms. We suspect that the precise poly(A) sites are very selective, and we therefore mapped mRNA poly(A) sites on complete and nearly complete genomes using mRNA sequences available in the National Center for Biotechnology Information (NCBI) Nucleotide database. In this paper, we describe the mRNA nucleotide [i.e., the poly(A) tail attachment position] that is directly in attachment with the poly(A) tail and the pre-mRNA nucleotide [i.e., the poly(A) tail starting position] that corresponds to the first adenosine of the poly(A) tail in the 29 most-mapped species (2 fungi, 2 protists, 18 animals, and 7 plants). The most representative pre-mRNA dinucleotides covering these two positions were UA, CA, and GA in 17, 10, and 2 of the species, respectively. The pre-mRNA nucleotide at the poly(A) tail starting position was typically an adenosine [i.e., A-type poly(A) sites], sometimes a uridine, and occasionally a cytidine or guanosine. The order was U>C>G at the attachment position but A>>U>C≥G at the starting position. However, in comparison with the mRNA nucleotide composition (base composition), the poly(A) tail attachment position selected C over U in plants and both C and G over U in animals, in both A-type and non-A-type poly(A) sites. Animals, dicot plants, and monocot plants had clear differences in C/G ratios at the poly(A) tail attachment position of the non-A-type poly(A) sites. This study of poly(A) site evolution indicated that the two positions within poly(A) sites had distinct nucleotide compositions and were different among kingdoms.
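The two positions defined in this abstract (the attachment position and the following poly(A) tail starting position) can be made concrete with a small sketch. The helper names `site_dinucleotide` and `tally` and the toy sequences are hypothetical; the sketch assumes each site is given as a pre-mRNA (or genomic) sequence plus the 0-based index of the attachment position.

```python
from collections import Counter

def site_dinucleotide(pre_mrna, attach_idx):
    """Return the pre-mRNA dinucleotide covering the poly(A) tail attachment
    position (attach_idx, 0-based) and the tail starting position right after it."""
    if not 0 <= attach_idx < len(pre_mrna) - 1:
        raise ValueError("attachment index must leave room for the starting position")
    # Report in RNA alphabet, so a genomic T is written as U
    return pre_mrna[attach_idx:attach_idx + 2].upper().replace("T", "U")

def tally(sites):
    """Count dinucleotides over an iterable of (sequence, attachment index) pairs."""
    return Counter(site_dinucleotide(seq, idx) for seq, idx in sites)

# Toy, made-up sites (illustration only)
toy_sites = [("GGCUAAAA", 3), ("CCCAUUU", 2), ("ggtaaa", 2)]
print(tally(toy_sites).most_common())
```

With counts like these per species, the most frequent dinucleotide could then be compared across species, which is the kind of summary the abstract reports (UA, CA, or GA).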
Affiliation(s)
- Xiu-Qing Li
- Molecular Genetics Laboratory, Potato Research Centre, Agriculture and Agri-Food Canada, Fredericton, New Brunswick, Canada
- Donglei Du
- Quantitative Methods Research Group, Faculty of Business Administration, University of New Brunswick, Fredericton, New Brunswick, Canada
|
43
|
Sheppard S, Lawson ND, Zhu LJ. Accurate identification of polyadenylation sites from 3' end deep sequencing using a naive Bayes classifier. Bioinformatics 2013; 29:2564-71. [PMID: 23962617 DOI: 10.1093/bioinformatics/btt446] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Indexed: 01/08/2023]
Abstract
MOTIVATION 3' end processing is important for transcription termination, mRNA stability and regulation of gene expression. To identify 3' ends, most techniques use an oligo-dT primer to construct deep sequencing libraries. However, this approach can lead to identification of artifactual polyadenylation sites due to internal priming in homopolymeric stretches of adenines. Although heuristic filters have been applied in these cases, they typically result in a high proportion of both false-positive and -negative classifications. Therefore, there is a need to develop improved algorithms to better identify mis-priming events in oligo-dT primed sequences. RESULTS By analyzing sequence features flanking 3' ends derived from oligo-dT-based sequencing, we developed a naïve Bayes classifier to classify them as true or false/internally primed. The resulting algorithm is highly accurate, outperforms previous heuristic filters and facilitates identification of novel polyadenylation sites.
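As a rough illustration of the idea in this abstract, the sketch below fits a naive Bayes classifier to the nucleotides flanking candidate 3' ends and labels each candidate as a true site or an internally primed one. It is not the authors' implementation: the per-position nucleotide features, the Laplace smoothing constant `alpha`, the function names `train` and `predict`, and the toy sequences are all assumptions made for the example.

```python
import math

BASES = "ACGT"

def train(seqs, labels, alpha=1.0):
    """Fit class priors and per-position nucleotide log-likelihoods.

    seqs   -- equal-length DNA windows around candidate 3' ends (A/C/G/T only)
    labels -- one class label per window
    alpha  -- Laplace smoothing pseudo-count (assumed value)
    """
    length = len(seqs[0])
    classes = sorted(set(labels))
    prior = {c: math.log(labels.count(c) / len(labels)) for c in classes}
    loglik = {}
    for c in classes:
        members = [s for s, y in zip(seqs, labels) if y == c]
        loglik[c] = []
        for i in range(length):
            column = [s[i] for s in members]
            loglik[c].append({
                b: math.log((column.count(b) + alpha) / (len(members) + alpha * len(BASES)))
                for b in BASES
            })
    return prior, loglik

def predict(model, seq):
    """Return the class with the highest posterior log-score for one window."""
    prior, loglik = model
    scores = {
        c: prior[c] + sum(loglik[c][i][b] for i, b in enumerate(seq))
        for c in prior
    }
    return max(scores, key=scores.get)

# Toy, made-up training windows: A-rich windows mimic internal priming
toy_seqs = ["AAAAAA", "AAAAGA", "AAGAAA", "TGTCTG", "CTGTCA", "GTCTGT"]
toy_labels = ["internal"] * 3 + ["true"] * 3
model = train(toy_seqs, toy_labels)
print(predict(model, "AAAAAA"), predict(model, "TGTCTG"))
```

The naive Bayes assumption treats positions as independent given the class, which keeps training to simple counting; a practical classifier would use longer windows, real labelled sites, and handling for ambiguous bases such as N.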
Affiliation(s)
- Sarah Sheppard
- Program in Gene Function and Expression and Program in Bioinformatics and Integrative Biology, University of Massachusetts Medical School, 364 Plantation St, Worcester, MA 01605, USA
|
44
|
Measurements of the impact of 3' end sequences on gene expression reveal wide range and sequence dependent effects. PLoS Comput Biol 2013; 9:e1002934. [PMID: 23505350 PMCID: PMC3591272 DOI: 10.1371/journal.pcbi.1002934] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 10/08/2012] [Accepted: 01/08/2013] [Indexed: 12/21/2022]
Abstract
A full understanding of gene regulation requires an understanding of the contributions that the various regulatory regions have on gene expression. Although it is well established that sequences downstream of the main promoter can affect expression, our understanding of the scale of this effect and how it is encoded in the DNA is limited. Here, to measure the effect of native S. cerevisiae 3′ end sequences on expression, we constructed a library of 85 fluorescent reporter strains that differ only in their 3′ end region. Notably, despite being driven by the same strong promoter, our library spans a continuous twelve-fold range of expression values. These measurements correlate with endogenous mRNA levels, suggesting that the 3′ end contributes to constitutive differences in mRNA levels. We used deep sequencing to map the 3′UTR ends of our strains and show that determination of polyadenylation sites is intrinsic to the local 3′ end sequence. Following polyadenylation mapping with sequence analysis, we found that increased A/T content upstream of the main polyadenylation site correlates with higher expression, both in the library and genome-wide, suggesting that native genes differ by the encoded efficiency of 3′ end processing. Finally, we use single-cell fluorescence measurements, at different promoter activation levels, to show that 3′ end sequences modulate protein expression dynamics differently than promoters, by predominantly affecting the size of protein production bursts as opposed to the frequency at which these bursts occur. Altogether, our results lead to a more complete understanding of gene regulation by demonstrating that 3′ end regions have a unique and sequence dependent effect on gene expression.
A basic question in gene expression is the relative contribution of different regulatory layers and genomic regions to the differences in protein levels. In this work we concentrated on the effect of 3′ end sequences. For this, we constructed a library of yeast strains that differ only by a native 3′ end region integrated downstream of a reporter gene driven by a constant inducible promoter. Thus we could attribute all differences in reporter expression between the strains to the different 3′ end sequences. Interestingly, we found that despite being driven by the same strong, inducible promoter, our library spanned a wide and continuous range of expression levels of more than twelve-fold. As these measurements represent the sole effect of the 3′ end region, we quantify the contribution of these sequences to the variance in mRNA levels by comparing our measurements to endogenous mRNA levels. We follow by sequence analysis to find a simple sequence signature that correlates with expression. In addition, single cell analysis reveals distinct noise dynamics of 3′ end mediated differences in expression compared to different levels of promoter activation, leading to a more complete understanding of gene expression which also incorporates the effect of these regions.
|
45
|
Benachenhou F, Sperber GO, Bongcam-Rudloff E, Andersson G, Boeke JD, Blomberg J. Conserved structure and inferred evolutionary history of long terminal repeats (LTRs). Mob DNA 2013; 4:5. [PMID: 23369192 PMCID: PMC3601003 DOI: 10.1186/1759-8753-4-5] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.9] [Received: 07/17/2012] [Accepted: 12/14/2012] [Indexed: 11/30/2022]
Abstract
Background Long terminal repeats (LTRs, consisting of U3-R-U5 portions) are important elements of retroviruses and related retrotransposons. They are difficult to analyse due to their variability. The aim was to obtain a more comprehensive view of structure, diversity and phylogeny of LTRs than hitherto possible. Results Hidden Markov models (HMM) were created for 11 clades of LTRs belonging to Retroviridae (class III retroviruses), animal Metaviridae (Gypsy/Ty3) elements and plant Pseudoviridae (Copia/Ty1) elements, complementing our work with Orthoretrovirus HMMs. The great variation in LTR length of plant Metaviridae and the few divergent animal Pseudoviridae prevented building HMMs from both of these groups. Animal Metaviridae LTRs had the same conserved motifs as retroviral LTRs, confirming that the two groups are closely related. The conserved motifs were the short inverted repeats (SIRs), integrase recognition signals (5´TGTTRNR…YNYAACA 3´); the polyadenylation signal or AATAAA motif; a GT-rich stretch downstream of the polyadenylation signal; and a less conserved AT-rich stretch corresponding to the core promoter element, the TATA box. Plant Pseudoviridae LTRs differed slightly in having a conserved TATA-box, TATATA, but no conserved polyadenylation signal, plus a much shorter R region. The sensitivity of the HMMs for detection in genomic sequences was around 50% for most models, at a relatively high specificity, suitable for genome screening. The HMMs yielded consensus sequences, which were aligned by creating an HMM model (a ‘Superviterbi’ alignment). This yielded a phylogenetic tree that was compared with a Pol-based tree. Both LTR and Pol trees supported monophyly of retroviruses. In both, Pseudoviridae was ancestral to all other LTR retrotransposons. However, the LTR trees showed the chromovirus portion of Metaviridae clustering together with Pseudoviridae, dividing Metaviridae into two portions with distinct phylogeny. 
Conclusion The HMMs clearly demonstrated a unitary conserved structure of LTRs, supporting that they arose once during evolution. We attempted to follow the evolution of LTRs by tracing their functional foundations, that is, acquisition of RNAse H, a combined promoter/ polyadenylation site, integrase, hairpin priming and the primer binding site (PBS). Available information did not support a simple evolutionary chain of events.
Affiliation(s)
- Farid Benachenhou
- Section of Virology, Department of Medical Sciences, Uppsala University, Uppsala, Sweden.
|
46
|
Molecular cloning, expression profiles and subcellular localization of cyclin B in ovary of the mud crab, Scylla paramamosain. Genes Genomics 2013. [DOI: 10.1007/s13258-013-0077-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 10/27/2022]
|
47
|
Abstract
BACKGROUND Polyadenylation is present in all three domains of life, making it the most conserved post-transcriptional process compared with splicing and 5'-capping. Even though most mammalian poly(A) sites contain a highly conserved hexanucleotide in the upstream region and a far less conserved U/GU-rich sequence in the downstream region, there are many exceptions. Furthermore, poly(A) sites in other species, such as plants and invertebrates, exhibit high deviation from this genomic structure, making the construction of a general poly(A) site recognition model challenging. We surveyed nine poly(A) site prediction methods published between 1999 and 2011. All methods exploit the skewed nucleotide profile across the poly(A) sites, and the highly conserved poly(A) signal as the primary features for recognition. These methods typically use a large number of features, which increases the dimensionality of the models to crippling degrees, and typically are not validated against many kinds of genomes. RESULTS We propose a poly(A) site model that employs minimal features to capture the essence of poly(A) sites, and yet, produces better prediction accuracy across diverse species. Our model consists of three di- or trinucleotide profiles identified through principal component analysis, and the predicted nucleosome occupancy flanking the poly(A) sites. We validated our model using two machine learning methods: logistic regression and linear discriminant analysis. Results show that models achieve 85-92% sensitivity and 85-96% specificity in seven animals and plants. When we applied one model from one species to predict poly(A) sites from other species, the sensitivity scores correlate with phylogenetic distances. CONCLUSIONS A four-feature model geared towards small motifs was sufficient to accurately learn and predict poly(A) sites across eukaryotes.
Affiliation(s)
- Eric S Ho
- Department of Molecular Genetics, Microbiology and Immunology, University of Medicine and Dentistry of New Jersey-Robert Wood Johnson Medical School, Piscataway, New Jersey, USA.
|
48
|
Ghosh S, Wang Y, Cook JA, Chhiba L, Vaughn JC. A molecular, phylogenetic and functional study of the dADAR mRNA truncated isoform during Drosophila embryonic development reveals an editing-independent function. Open J Anim Sci 2013; 3:20-30. [PMID: 25414802 PMCID: PMC4235677 DOI: 10.4236/ojas.2013.34a2003] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 02/05/2023]
Abstract
Adenosine Deaminases Acting on RNA (ADARs) have been studied in many animal phyla, where they have been shown to deaminate specific adenosines into inosines in duplex mRNA regions. In Drosophila, two isoform classes are encoded, designated full-length (contains the editase domain) and truncated (lacks this domain). Much is known about the full-length isoform, which plays a major role in regulating functions of voltage-gated ion channel proteins in the adult brain. In contrast, almost nothing is known about the functional significance of the truncated isoform. In situ hybridization shows that both isoform mRNA classes are maternally derived and transcripts for both localize primarily to the developing central nervous system. Quantitative RT-PCR shows that about 35% of all dADAR mRNA transcripts belong to the truncated class in embryos. 3′-RACE results show that abundance of the truncated isoform class is developmentally regulated, with a longer transcript appearing after the mid-blastula transition. 3′-UTR sequences for the truncated isoform have been determined from diverse Drosophila species and important regulatory regions including stop codons have been mapped. Western analysis shows that both mRNA isoform classes are translated into protein during embryonic development, as full-length variant levels gradually diminish. The truncated protein isoform is present in every Drosophila species studied, extending over a period spanning about 40 × 10⁶ years, implying a conserved function. Previous work has shown that a dADAR protein isoform binds to the evolutionarily conserved rnp-4f pre-mRNA stem-loop located in the 5′-UTR to regulate splicing, while no RNA editing was observed, suggesting the hypothesis that it is the non-catalytic truncated isoform which regulates splicing. To test this hypothesis, we have utilized RNAi technology, the results of which support the hypothesis. These results demonstrate a novel, non-catalytic function for the truncated dADAR protein isoform in Drosophila embryonic development, which is very likely evolutionarily conserved.
Affiliation(s)
- Sushmita Ghosh
- Department of Biology, Cell Molecular and Structural Biology Program, Miami University, Oxford, USA
- Yaqi Wang
- Department of Biology, Cell Molecular and Structural Biology Program, Miami University, Oxford, USA
- John A Cook
- Department of Biology, Cell Molecular and Structural Biology Program, Miami University, Oxford, USA
- Lea Chhiba
- Department of Biology, Cell Molecular and Structural Biology Program, Miami University, Oxford, USA
- Jack C Vaughn
- Department of Biology, Cell Molecular and Structural Biology Program, Miami University, Oxford, USA
|
49
|
Liu S, Zhang Y, Zhou Z, Waldbieser G, Sun F, Lu J, Zhang J, Jiang Y, Zhang H, Wang X, Rajendran KV, Khoo L, Kucuktas H, Peatman E, Liu Z. Efficient assembly and annotation of the transcriptome of catfish by RNA-Seq analysis of a doubled haploid homozygote. BMC Genomics 2012; 13:595. [PMID: 23127152 PMCID: PMC3582483 DOI: 10.1186/1471-2164-13-595] [Citation(s) in RCA: 101] [Impact Index Per Article: 8.4] [Received: 03/01/2012] [Accepted: 08/09/2012] [Indexed: 01/29/2023]
Abstract
Background Upon the completion of whole genome sequencing, thorough genome annotation that associates genome sequences with biological meanings is essential. Genome annotation depends on the availability of transcript information as well as orthology information. In teleost fish, genome annotation is seriously hindered by genome duplication. Because of gene duplications, one cannot establish orthologies simply by homology comparisons. Rather intense phylogenetic analysis or structural analysis of orthologies is required for the identification of genes. To conduct phylogenetic analysis and orthology analysis, full-length transcripts are essential. Generation of large numbers of full-length transcripts using traditional transcript sequencing is very difficult and extremely costly. Results In this work, we took advantage of a doubled haploid catfish, which has two sets of identical chromosomes and in theory there should be no allelic variations. As such, transcript sequences generated from next-generation sequencing can be favorably assembled into full-length transcripts. Deep sequencing of the doubled haploid channel catfish transcriptome was performed using Illumina HiSeq 2000 platform, yielding over 300 million high-quality trimmed reads totaling 27 Gbp. Assembly of these reads generated 370,798 non-redundant transcript-derived contigs. Functional annotation of the assembly allowed identification of 25,144 unique protein-encoding genes. A total of 2,659 unique genes were identified as putative duplicated genes in the catfish genome because the assembly of the corresponding transcripts harbored PSVs or MSVs (in the form of pseudo-SNPs in the assembly). Of the 25,144 contigs with unique protein hits, around 20,000 contigs matched 50% length of reference proteins, and over 14,000 transcripts were identified as full-length with complete open reading frames. 
The characterization of consensus sequences surrounding start codon and the stop codon confirmed the correct assembly of the full-length transcripts. Conclusions The large set of transcripts assembled in this study is the most comprehensive set of genome resources ever developed from catfish, which will provide the much needed resources for functional genome research in catfish, serving as a reference transcriptome for genome annotation, analysis of gene duplication, gene family structures, and digital gene expression analysis. The putative set of duplicated genes provide a starting point for genome scale analysis of gene duplication in the catfish genome, and should be a valuable resource for comparative genome analysis, genome evolution, and genome function studies.
Affiliation(s)
- Shikai Liu
- The Fish Molecular Genetics and Biotechnology Laboratory, Department of Fisheries and Allied Aquacultures and Program of Cell and Molecular Biosciences, Aquatic Genomics Unit, Auburn University, Auburn, AL 36849, USA
|
50
|
Thomas PE, Wu X, Liu M, Gaffney B, Ji G, Li QQ, Hunt AG. Genome-wide control of polyadenylation site choice by CPSF30 in Arabidopsis. Plant Cell 2012; 24:4376-88. [PMID: 23136375 PMCID: PMC3531840 DOI: 10.1105/tpc.112.096107] [Citation(s) in RCA: 83] [Impact Index Per Article: 6.9] [Received: 01/30/2012] [Revised: 10/08/2012] [Accepted: 10/18/2012] [Indexed: 05/22/2023]
Abstract
The Arabidopsis thaliana ortholog of the 30-kD subunit of the mammalian Cleavage and Polyadenylation Specificity Factor (CPSF30) has been implicated in the responses of plants to oxidative stress, suggesting a role for alternative polyadenylation. To better understand this, poly(A) site choice was studied in a mutant (oxt6) deficient in CPSF30 expression using a genome-scale approach. The results indicate that poly(A) site choice in a large majority of Arabidopsis genes is altered in the oxt6 mutant. A number of poly(A) sites were identified that are seen only in the wild type or oxt6 mutant. Interestingly, putative polyadenylation signals associated with sites that are seen only in the oxt6 mutant are decidedly different from the canonical plant polyadenylation signal, lacking the characteristic A-rich near-upstream element (where AAUAAA can be found); this suggests that CPSF30 functions in the handling of the near-upstream element. The sets of genes that possess sites seen only in the wild type or mutant were enriched for those involved in stress and defense responses, a result consistent with the properties of the oxt6 mutant. Taken together, these studies provide new insights into the mechanisms and consequences of CPSF30-mediated alternative polyadenylation.
Affiliation(s)
- Patrick E. Thomas
- Department of Plant and Soil Sciences, University of Kentucky, Lexington, Kentucky 40546-0312
| | - Xiaohui Wu
- Department of Botany, Miami University, Oxford, Ohio 45056
- Department of Automation, Xiamen University, Xiamen 361005, China
| | - Man Liu
- Department of Botany, Miami University, Oxford, Ohio 45056
| | - Bobby Gaffney
- Department of Plant and Soil Sciences, University of Kentucky, Lexington, Kentucky 40546-0312
| | - Guoli Ji
- Department of Automation, Xiamen University, Xiamen 361005, China
| | - Qingshun Q. Li
- Department of Botany, Miami University, Oxford, Ohio 45056
- Key Laboratory of the Ministry of Education for Coastal and Wetland Ecosystem, College of the Environment and Ecology, Xiamen University, Xiamen, Fujian 361102, China
- Rice Research Institute, Fujian Academy of Agricultural Sciences, Fuzhou, Fujian 350019, China
| | - Arthur G. Hunt
- Department of Plant and Soil Sciences, University of Kentucky, Lexington, Kentucky 40546-0312
- Address correspondence to
|